2
$\begingroup$

I'm referencing https://arxiv.org/pdf/1509.09169.pdf on ridge regression. On page 34, question 1.5, we need to prove:
The ridge fit $\widehat{Y}(\lambda)=X(X^{\top}X+\lambda I_p)^{-1}X^{\top}Y$ is not orthogonal to the ridge residual $Y - \widehat{Y}(\lambda)$.

To show this, I think we can use the fact that the hat matrix for ridge regression is not a projection matrix, but that does not give me anything useful. In the OLS case we show that the residual is orthogonal to $X$, and since $\widehat{Y}$ is a linear combination of the columns of $X$, the residual is orthogonal to the fit as well. I do not think we can use this here, as the linear combination property might not hold due to the term $\lambda I_p$. Please tell me how to show this.

$\endgroup$
1
  • $\begingroup$ For some special values of $\lambda$ orthogonality may hold. A useful intuition is the characterization of Ridge Regression in terms of augmenting the data: see stats.stackexchange.com/a/164546/919. If the original model matrix had a column of constants (plus at least one other non-constant column), obviously the augmented matrix cannot have a column of constants. Most of the time, the column space will not include any nonzero constant vectors (but, for a finite set of $\lambda,$ it could). $\endgroup$
    – whuber
    Commented Dec 7, 2020 at 17:40

1 Answer

1
$\begingroup$

Set for clarity

$$B \equiv (X^{\top}X+\lambda I_p)^{-1}$$

and you are asked to examine

$$(Y - \widehat{Y})^{\top}XBX^{\top}Y = (Y - XBX^{\top}Y)^{\top}XBX^{\top}Y.$$

Doing the algebra,

$$...=Y^{\top}XBX^{\top}Y - Y^{\top}XBX^{\top}XBX^{\top}Y.$$

If $B$ were equal to $(X^{\top}X)^{-1}$ (the OLS case, $\lambda = 0$), the second term would simplify and become equal to the first, hence the zero result. In ridge regression with $\lambda > 0$ this is not the case, so the two terms do not cancel and the expression does not equal zero.
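As a quick numerical sanity check of this claim (a minimal sketch, not a proof), the snippet below computes $\widehat{Y}^{\top}(Y-\widehat{Y})$ on simulated data, once with $\lambda > 0$ and once with $\lambda = 0$. The sizes, the seed, the value of $\lambda$, and the helper name `fit_residual_inner` are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 3, 2.0  # hypothetical sample size, dimension, and penalty

X = rng.normal(size=(n, p))
Y = rng.normal(size=n)


def fit_residual_inner(X, Y, lam):
    """Return Y_hat' (Y - Y_hat), where Y_hat = X B X' Y and B = (X'X + lam I)^{-1}."""
    B = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1]))
    Y_hat = X @ B @ (X.T @ Y)
    return float(Y_hat @ (Y - Y_hat))


ridge_inner = fit_residual_inner(X, Y, lam)  # lambda > 0: ridge
ols_inner = fit_residual_inner(X, Y, 0.0)    # lambda = 0: OLS

print("ridge:", ridge_inner)
print("OLS:  ", ols_inner)
```

With $\lambda = 0$ the inner product vanishes up to floating-point error, while with $\lambda > 0$ it stays clearly away from zero.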

Continuing with the manipulations,

$$...=Y^{\top}XBX^{\top}\big[I-XBX^{\top}\big]Y.$$

If $B$ were equal to $(X^{\top}X)^{-1}$, then the term in brackets would become the ("complementary") projection matrix onto the orthogonal complement of the column space of $X$, and would make the expression zero. It is not, so no zero result.
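
One way to make this last step fully explicit, offered as a sketch under the assumption $\lambda > 0$: write a singular value decomposition $X = UDV^{\top}$ with nonzero singular values $d_j$ and corresponding left singular vectors $u_j$ (notation introduced here only for illustration). Then $XBX^{\top} = \sum_j \frac{d_j^{2}}{d_j^{2}+\lambda}\,u_j u_j^{\top}$, and

$$Y^{\top}XBX^{\top}\big[I-XBX^{\top}\big]Y = \lambda\sum_{j}\frac{d_j^{2}}{(d_j^{2}+\lambda)^{2}}\,(u_j^{\top}Y)^{2}.$$

Every term in the sum is non-negative, so the expression is strictly positive unless $u_j^{\top}Y = 0$ for all $j$, that is, unless $X^{\top}Y = 0$, in which case the ridge fit is itself the zero vector.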

$\endgroup$
2
  • $\begingroup$ From here I get $Y^{\top}(XBX^{\top} - (XBX^{\top})^{\top}(XBX^{\top}))Y$. I have shown that $XBX^{\top}$ is not a projection matrix. Thus, if I equate this to zero and then remove $Y$ and $Y^{\top}$ by pre- and post-multiplying by their transposes, I can say that since the bracketed term is not zero, these cannot be orthogonal. Is my reasoning correct? $\endgroup$
    – Vks
    Commented Dec 7, 2020 at 6:19
  • $\begingroup$ Thank you for your answer. It is clear to me. Though I wanted to ask whether my approach (in the comment) is also right? $\endgroup$
    – Vks
    Commented Dec 10, 2020 at 7:56
