
Assume that I am running the following regression:

$y_t = \beta_0 + \beta_1 \cdot x_t + \varepsilon_t$

where $y_t$ is a continuous variable and $\varepsilon_t$ is a noise term.

Let's assume a Gaussian likelihood and non-conjugate priors, such as $\beta_0 \sim N(0, 1)$ and $\beta_1 \sim \mathrm{Beta}(1,1)$.

Let's assume I have a dataset spanning from timestep $0$ to $t$ and call this dataset $h_t$.

Let's assume I obtain the posterior by some numerical method such as MCMC; denote the model parameters collectively as $\theta$, so the MCMC output is a set of samples from the posterior $p(\theta \mid h_t)$.

Now, I am interested in obtaining the probability of observing a SPECIFIC outcome $z$ given some input $x_{t+1}$; thus, what I am interested in obtaining is the following:

$p(y_{t+1} = z \mid x_{t+1}, h_t) = \int p(\theta \mid h_t)\, p(y_{t+1} = z \mid \theta, x_{t+1})\, d\theta$

I believe that this expression is equal to zero, since for a continuous outcome $p(y_{t+1} = z \mid \theta, x_{t+1}) = 0$ at any single point $z$. Thus, to obtain a well-defined probability, we have to work with intervals instead. Let's assume three intervals: $z < 0$, $0 \le z < 10$, and $z \ge 10$. Then we can define, for example:

$p(y_{t+1} \in z_{interval_1} \mid x_{t+1}, h_t) = \int p(\theta \mid h_t)\, p(y_{t+1} \in z_{interval_1} \mid \theta, x_{t+1})\, d\theta$

where $z_{interval_1}$ denotes one of the intervals.
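Concretely, here is a minimal sketch of how I would estimate such an interval probability: it averages the Gaussian interval mass over posterior draws. The arrays `beta0`, `beta1`, `sigma` are hypothetical stand-ins for real MCMC output (including the noise scale in $\theta$), simulated here only so the snippet runs.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical stand-ins for MCMC posterior draws of (beta_0, beta_1, sigma);
# in practice these arrays would come from the sampler run on h_t.
S = 5_000
beta0 = rng.normal(0.0, 0.1, size=S)
beta1 = rng.beta(2.0, 2.0, size=S)
sigma = np.full(S, 1.0)

x_next = 3.0
mu = beta0 + beta1 * x_next  # predictive mean under each posterior draw

# Monte Carlo estimate of p(y_{t+1} in [0, 10) | x_{t+1}, h_t):
# average the Gaussian probability of the interval over posterior draws.
p_interval = np.mean(norm.cdf(10.0, mu, sigma) - norm.cdf(0.0, mu, sigma))
```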

Is this line of reasoning correct?

  • Maybe there is a typo in the very first equation? I assume $y_t$ is a continuous variable, and you are performing linear regression. If that's the case, then yes, the desired probability is no different from $P(X = 0)=0$ when $X\sim N(0,1)$. The intervals seem to be arbitrary. Anyway, when a Bayesian asks for a predictive probability, he probably refers to the entire posterior predictive distribution.
    – utobi
    Commented Jan 17 at 11:35
  • Note also that the posterior predictive distribution doesn't depend on the parameter.
    – utobi
    Commented Jan 17 at 11:37
  • Updated the question; yes, $y_t$ is assumed to be continuous. That's exactly what I wanted to know: if we are interested in the probability of a specific outcome, we have to define intervals in which the outcome can lie, since the outcome is continuous, precisely as in introductory statistics. Would have accepted this as an answer.
    Commented Jan 17 at 14:32

2 Answers


In your notation, given the posterior distribution $p(\theta|h_t)$ and the density of a future observation $p(y_{t+1}|\theta,x_{t+1})$, the posterior predictive distribution is $$ p(y_{t+1}|h_t,x_{t+1}) = \int_{\theta\in\Theta}p(\theta|h_t)p(y_{t+1}|\theta,x_{t+1})\,\text{d}\theta. $$ Therefore, the posterior predictive distribution doesn't depend on unknown quantities, such as the parameter $\theta$.
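With MCMC draws $\theta_s \sim p(\theta\mid h_t)$, the integral above is approximated by an average over the draws. A minimal sketch, using simulated arrays as hypothetical stand-ins for real sampler output (Gaussian likelihood, noise scale assumed known for brevity):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Simulated stand-ins for MCMC draws of (beta_0, beta_1); sigma assumed known.
S = 4_000
beta0 = rng.normal(0.0, 0.1, size=S)
beta1 = rng.beta(2.0, 2.0, size=S)
sigma = 1.0

x_next = 2.0
mu = beta0 + beta1 * x_next  # mean of y_{t+1} under each posterior draw

# Predictive density on a grid: average the conditional density over draws.
# This integrates theta out -- the result depends only on h_t and x_{t+1}.
y_grid = np.linspace(-6.0, 8.0, 141)
pred_density = np.array([norm.pdf(y, mu, sigma).mean() for y in y_grid])
```

Evaluated on a fine enough grid, `pred_density` integrates to one, as a density should.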

If $Y_t$ are continuous random variables, then, as for any continuous random variable, $P(Y_{t+1} = a|h_t,x_{t+1})=0$ for any scalar $a$. You could undoubtedly compute the posterior predictive probability that $Y_{t+1}$ falls within some interval, if that's what you are after.

The Bayesian approach equips you with the entire distribution of a future observation $Y_{t+1}$ conditional on the past data and the prior information. What you can achieve with that depends on the purpose of the analysis. For instance, you could compute the expected value of such a prediction, the variance, or you may compute the 95% prediction interval, etc.
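With draws from the predictive distribution in hand (below, simulated stand-ins for real MCMC output), those summaries are one-liners:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated stand-ins for posterior draws; sigma assumed known.
S = 10_000
beta0 = rng.normal(0.0, 0.1, size=S)
beta1 = rng.beta(2.0, 2.0, size=S)
sigma = 1.0
x_next = 2.0

# One predictive draw of y_{t+1} per posterior draw: this samples from the
# posterior predictive distribution by composition.
y_pred = rng.normal(beta0 + beta1 * x_next, sigma)

pred_mean = y_pred.mean()                     # expected value of the prediction
pred_var = y_pred.var()                       # predictive variance
lo, hi = np.quantile(y_pred, [0.025, 0.975])  # 95% prediction interval
```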


In practice, your MCMC parameter samples are fed into your model to generate many realisations of the outcome for your particular $x_{t+1}$; see, e.g., Stan's generated quantities block.

Then you are free to use any density estimator to create a smooth density.
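For example, with SciPy's `gaussian_kde` (the simulated draws below stand in for predictive samples generated as described above):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Stand-in for predictive draws of y_{t+1} produced by the sampler.
y_pred = rng.normal(1.5, 1.0, size=5_000)

kde = gaussian_kde(y_pred)          # smooth density estimate from the samples
density_at_mode = kde(np.array([1.5]))[0]  # evaluate the density at a point
```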

Most use cases, though, work directly with the samples of the outcome... so why do you need the probability density of the outcome?

