$\begingroup$

The book says: for the standard linear regression model, we assume:

\begin{equation} y_i= \beta_0 + \beta_1 x_i + \epsilon_i \end{equation}

where $E[\epsilon_i]=0$ and $Var[\epsilon_i]=\sigma^2$. Homoskedasticity means that the error variance $\sigma^2$ is constant, i.e., it does not depend on the covariates. For constructing confidence intervals and hypothesis tests, we assume $\epsilon_i \sim N(0,\sigma^2)$. In this case, the observations of the response variable follow a (conditional) normal distribution with $E[y_i \mid x_i] = \beta_0 + \beta_1 x_i$ and $Var[y_i \mid x_i] = \sigma^2$, and the $y_i$ are (conditionally) independent given the covariate values $x_i$.
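If I spell this out, conditional independence should mean that the joint density of the responses, given all covariates, factorizes into the individual normal densities (my own rewriting, not from the book):

\begin{equation} f(y_1, \ldots, y_n \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2} \right) \end{equation}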

Why are the $y_i$ conditionally independent of the covariates $x_i$? Linear regression is supposed to estimate the expected value of $y_i$ conditional on a sequence of covariates. How can $y_i$ be conditionally independent of $x_i$?

$\endgroup$
  • $\begingroup$ First of all, thank you very much for your comment. What do you mean by constant regressors? The book says the same thing when it discusses generalized linear models (GLMs): for given covariates $x_i = (1, x_{i1}, \ldots, x_{ik})'$, the response variables are (conditionally) independent and the (conditional) density of $y_i$ belongs to an exponential family with: \begin{equation} f(y_i,\theta_i) = \exp \left ( \frac{y_i \theta_i - b(\theta_i)}{\phi}w_i + c(y_i, \phi, w_i) \right) \end{equation} $\endgroup$
    – Maximilian
    Commented Dec 14, 2022 at 15:20
  • $\begingroup$ thank you very much. Does this fact give rise to any further implication? $\endgroup$
    – Maximilian
    Commented Dec 14, 2022 at 18:54
  • $\begingroup$ I deleted and gathered my comments in an expanded answer below. $\endgroup$ Commented Dec 15, 2022 at 8:19

1 Answer

$\begingroup$

While this is not contained in the part you quote, my guess would be that your book operates under the setup where the regressors are considered to be fixed.

In the setup of this question, that does not change too much. Essentially, you would replace all expectations in your question with expectations conditional on $x_i$.
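A small simulation may make the fixed-regressors setup concrete: the design $x$ is held constant across repeated samples, only the errors are redrawn, and the OLS estimates average out to the true coefficients (the parameter values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 0.5$ below are my own illustrative choices, not from your book):

```python
import numpy as np

rng = np.random.default_rng(0)

beta0, beta1, sigma = 1.0, 2.0, 0.5
x = np.linspace(0, 1, 50)  # fixed design: the same x values in every repeated sample

estimates = []
for _ in range(2000):
    eps = rng.normal(0.0, sigma, size=x.size)  # only the errors are redrawn
    y = beta0 + beta1 * x + eps
    # OLS slope and intercept computed by hand
    b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    b0 = y.mean() - b1 * x.mean()
    estimates.append((b0, b1))

estimates = np.array(estimates)
print(estimates.mean(axis=0))  # close to (1.0, 2.0): unbiased given the fixed x
```

All randomness here comes from the errors; conditioning on $x_i$ and treating $x_i$ as fixed lead to the same calculations.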

That said, the idea that the values of the regressors are fixed in repeated samples is generally considered to be somewhat restrictive, as, typically, the regressors are random to the investigator just as the dependent variable is. Exceptions may include experimental setups where the investigator can precisely decide upon, e.g., a dose.

More importantly, however, the fixed-regressors idea becomes more restrictive once you start asking causal questions (think omitted-variable bias, instrumental-variable approaches, etc.). These are motivated by correlation of the regressor with the error term, which is not a very natural consideration when the regressors are taken to be fixed constants.
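A toy illustration of that last point (all coefficients are arbitrary numbers of my own choosing): if an omitted variable $z$ drives both the regressor and the response, then $x$ is correlated with the error term of the short regression, and the slope of $y$ on $x$ alone is biased away from the true value of 2:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# z affects both x and y; omitting z leaves x correlated with the error term
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)

# OLS slope of y on x alone (z omitted)
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
print(b1)  # noticeably above the true slope of 2.0: omitted-variable bias
```

This kind of story only makes sense when $x$ is random; with $x$ fixed by design, it cannot be correlated with anything.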

$\endgroup$
