
In a multiple partial linear regression setting, the book I'm reading has this sentence:

«As a consequence of the fact that residuals are orthogonal to explanatory variables, the 'cleaned' variables $M_2Y$ and $M_2X_1$ (which are the residuals) are uncorrelated with $X_2$»,

where $X=(X_1\;X_2)$, with $X_2$ being the last $g$ columns of the matrix $X$, and $M_2$ is the projection matrix onto the orthogonal complement of the column space of $X_2$.

I don't understand this sentence, since $M_2Y$ and $M_2X_1$ are vectors of dimension $n\times 1$ while $X_2$ has dimension $n\times g$. I understand the first part of the sentence, which means that $X_2^TM_2X_1=0$ and $X_2^TM_2Y=0$. It is the second part of the sentence that I don't get.

  • This might be of some value. Orthogonal variables are perpendicular. 'Uncorrelated' means that the corresponding centered variables are perpendicular. Under certain conditions (such as if the two variables were already centered), the two are the same. – Glen_b (Aug 10, 2014 at 22:51)
  • @Glen_b Thanks for your interest. So, this 'uncorrelatedness' has no relationship to the correlation matrix? (Aug 10, 2014 at 23:23)
  • No, it's directly related to the correlation matrix -- "uncorrelated" means "has zero correlation". If you put the pairwise correlations of uncorrelated variables in a correlation matrix, it would have zero entries at the corresponding positions. – Glen_b (Aug 10, 2014 at 23:35)
  • @Glen_b What you're saying is that $X_2^T(M_2X_1-E(M_2X_1))=0$ and $X_2^T(M_2Y-E(M_2Y))=0$? (Aug 11, 2014 at 9:02)

1 Answer


By construction, the residuals are orthogonal to the regressors, not only in the statistical sense but also as numerical vectors; see this answer. Writing the matrices so that they conform, we have $X_2'M_2Y = 0$, since $M_2 = I-X_2(X_2'X_2)^{-1}X_2'$.
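
For instance, here is a minimal numerical sketch of this fact (the data are simulated; the sizes, seed, and variable names are my own, not the book's):

```python
import numpy as np

rng = np.random.default_rng(42)
n, g = 100, 3
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, g - 1))])  # includes a constant
y = rng.normal(size=n)

# The annihilator ("residual maker") matrix of X2
M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)

resid = M2 @ y                       # residuals of regressing y on X2
print(np.allclose(X2.T @ resid, 0))  # True: orthogonal as numerical vectors
```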

The reason one finds phrases in econometrics writing that appear to equate "orthogonality" with "uncorrelatedness" is that these concepts are usually discussed with respect to the residuals or the error terms. The former have zero mean by construction (as long as the regression includes a constant); the latter are assumed to have zero mean. The covariance of either with any variable $X$ is then

$$\operatorname{Cov}(X,u) = E(Xu) - E(X)E(u) = E(Xu) $$

since $E(u)$ is (or is assumed to be) zero. In such a case, orthogonality becomes equivalent to uncorrelatedness. Otherwise, with both variables having non-zero means, the two are not equivalent.

This also means that if we examine variables centered on their means (which therefore have zero mean by construction), orthogonality becomes equivalent to non-correlation. Since centering the variables in this way is widely used for various reasons (outside econometrics as well), the equivalence holds in that common situation too.
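
To see the centering point numerically (again with simulated data of my own choosing): for centered vectors, the inner product is $n$ times the sample covariance, so zero inner product and zero covariance are the same statement.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n) + 5.0     # arbitrary non-zero-mean variable
u = rng.normal(size=n)

xc = x - x.mean()                # center both variables
uc = u - u.mean()

# For centered vectors, (inner product) / n equals the sample covariance:
print(np.isclose(xc @ uc / n, np.cov(x, u, bias=True)[0, 1]))  # True
```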

With non-zero means, on the contrary, we have the opposite relation: orthogonality implies correlation.

Assume the variables are orthogonal, $E(XY) = 0$, and that both means are non-zero. Then

$$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = - E(X)E(Y) \neq 0 $$

So they are correlated.
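
A toy example (the two-point distribution is my own, chosen by hand so that $E(XY)=0$ holds exactly):

```python
import numpy as np

# With equal probability, (X, Y) is (1, 2) or (3, -2/3)
x = np.array([1.0, 3.0])
y = np.array([2.0, -2.0 / 3.0])

E_xy = np.mean(x * y)                 # 0: X and Y are orthogonal
cov = E_xy - x.mean() * y.mean()      # -4/3: yet they are correlated
print(E_xy, cov)
```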

The above also tells us that we can have $E(XY)\neq 0$, $E(X)\neq 0$, $E(Y)\neq 0$, but $\operatorname{Cov}(X,Y) = 0$, if $E(XY) = E(X)E(Y)$. In other words, independent variables with non-zero means are uncorrelated but not orthogonal.
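
And a sketch of this last case, using independent simulated draws with non-zero means:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(loc=2.0, size=100_000)  # independent draws, E(X) = 2
y = rng.normal(loc=3.0, size=100_000)  # independent draws, E(Y) = 3

print(np.mean(x * y))                        # ~ 6 = E(X)E(Y): not orthogonal
print(np.mean(x * y) - x.mean() * y.mean())  # ~ 0: uncorrelated
```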

All in all, one should carefully contemplate these concepts and understand under which conditions one implies the other, or the negation of the other.

  • Many thanks Alecos. One more doubt: when I try to apply your answer to the book citation in the original post, do I need to use the sample definitions of orthogonality and correlation, just like in the paper I posted in my other question? Otherwise, how do I apply what you wrote for real-valued r.v.'s to vectors? (Aug 15, 2014 at 22:02)
  • If you have a sample $\{x_i,y_i\}$, then if all the $x_i$'s come from the same distribution $X$, and all the $y_i$'s come from the same distribution $Y$, the sample measures (sample mean, sample covariance, etc.) are estimates of the theoretical statistical concepts. If you don't have this "identically distributed" feature, then the sample measures describe only relationships between the specific vectors of numbers. (Aug 15, 2014 at 22:34)
