
Matrix regression proof that $\hat \beta = (X' X)^{-1} X' Y = {\hat \beta_0 \choose \hat \beta_1} $

where $\hat\beta$ is the least squares estimator of $\beta$.

Attempt

So I know ${\hat \beta_0 \choose \hat \beta_1} = {\overline{Y} - \hat \beta_1 \overline{X} \choose \frac{\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{\sum_{i=1}^{n}(X_i - \overline{X})^2}}$

I'm not really sure how to start, as I don't know which formulas I can use to reduce any of this. If this has been answered elsewhere, please mark it as a duplicate; I tried searching but couldn't find it.
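As a concrete sanity check (a sketch on simulated data, not part of the original question), the scalar formulas for $\hat\beta_0$ and $\hat\beta_1$ above can be compared numerically with the matrix formula $(X'X)^{-1}X'Y$, assuming the design matrix $X$ has a column of ones followed by the $X_i$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# Scalar least-squares formulas for simple regression
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Matrix formula: beta_hat = (X'X)^{-1} X'y with design matrix X = [1, x]
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(beta_hat, [b0, b1]))  # True
```

Both routes give the same $(\hat\beta_0, \hat\beta_1)$, which is exactly what the proof is asked to establish in general.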

  • See e.g. here: stats.stackexchange.com/questions/46151/… or stats.stackexchange.com/questions/186196/…. Commented Jun 30, 2019 at 8:08
  • The steps there are basically 1) Recall that the least squares estimator is chosen to minimise (with respect to $\beta$) the function $$S(\beta):= (y-X\beta)^T (y - X\beta);$$ 2) expand this to show that $$S(\beta) = y^T y - 2y^T X \beta + \beta^T X^T X \beta;$$ 3) use matrix calculus to find the $\beta$ that minimises this (calculate $\frac{\partial S}{\partial \beta}$, set it to $\mathbf{0}$, and solve for $\beta$). Commented Jun 30, 2019 at 8:13

2 Answers


Our goal is to minimize $$ f(\beta) = \frac12 \| X \beta - Y \|^2. $$ Notice that $f = g \circ h$, where $h(\beta) = X \beta - Y$ and $g(u) = \frac12 \| u \|^2$. The derivatives of $g$ and $h$ are given by $$ g'(u) = u^T, \quad h'(\beta) = X. $$ By the chain rule, we have \begin{align} f'(\beta) &= g'(h(\beta)) h'(\beta) \\ &= (X \beta - Y)^T X. \end{align} The gradient of $f$ is $$ \nabla f(\beta) = f'(\beta)^T = X^T( X \beta - Y). $$ Setting the gradient of $f$ equal to $0$, we discover that $$ X^T X \beta = X^T Y. $$ Provided $X^T X$ is invertible, this yields $\hat\beta = (X^T X)^{-1} X^T Y$.
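A quick numerical illustration of this derivation (a hedged sketch on random data, with a two-column design matrix): the gradient $X^T(X\beta - Y)$ does vanish at the solution of the normal equations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=n)

# Solve the normal equations X'X beta = X'Y
beta = np.linalg.solve(X.T @ X, X.T @ Y)

# Gradient of f(beta) = 0.5 * ||X beta - Y||^2 is X'(X beta - Y);
# it should vanish (up to floating-point error) at the minimizer
grad = X.T @ (X @ beta - Y)
print(np.allclose(grad, 0))  # True
```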


In a slight variant on @MinusOne-Twelfth's comment,$$\frac{\partial}{\partial\beta_i}(y-X\beta)_j=-X_{ji}\implies\frac{\partial}{\partial\beta_i}\sum_j(y-X\beta)_j^2=2\sum_jX_{ij}^T(X\beta-y)_j=2(X^\prime X\beta-X^\prime y)_i.$$Setting this to $0$ for all $i$,$$X^\prime X\beta=X^\prime y\implies\beta=(X^\prime X)^{-1}X^\prime y.$$
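The component-wise derivative above can be checked numerically (a sketch on random data, not from the answer itself): computing $2\sum_j X_{ji}(X\beta-y)_j$ entry by entry reproduces the matrix expression $2(X'X\beta - X'y)$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 2
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)

# Matrix form of the gradient of sum_j (y - X beta)_j^2
grad_matrix = 2 * (X.T @ X @ beta - X.T @ y)

# Component-wise, per the derivative above:
# d/d beta_i = 2 * sum_j X_{ji} (X beta - y)_j
grad_comp = np.array([2 * np.sum(X[:, i] * (X @ beta - y)) for i in range(p)])

print(np.allclose(grad_matrix, grad_comp))  # True
```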

  • I'm confused: how does finding $\beta$ show that it equals ${\beta_0 \choose \beta_1}$? – bob Commented Jun 30, 2019 at 23:04
  • @bob The easiest option is to double-check that $X^{\prime}X\left(\begin{array}{c} \beta_{0}\\ \beta_{1} \end{array}\right)=X^{\prime}y$. – J.G. Commented Jul 1, 2019 at 5:28
