
Let $Z_1,\ldots,Z_n$ be independent standard normal random variables. There are many (lengthy) proofs out there showing that

$$ \sum_{i=1}^n \left(Z_i - \frac{1}{n}\sum_{j=1}^n Z_j \right)^2 \sim \chi^2_{n-1} $$

Many of these proofs are quite long, and some use induction (e.g., Casella and Berger, Statistical Inference). I am wondering whether there is an easy proof of this result.
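
Not a proof, of course, but here is a quick Monte Carlo sanity check of the claim (the sample size $n$, the number of replications, and the seed below are arbitrary choices):

```python
# Monte Carlo sanity check of sum_i (Z_i - Zbar)^2 ~ chi^2_{n-1}; not a proof.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 5, 100_000
Z = rng.standard_normal((reps, n))

# Sum of squared deviations from the sample mean, one value per replication
S = ((Z - Z.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

# Compare the simulated values with the chi^2_{n-1} distribution
print(stats.kstest(S, stats.chi2(n - 1).cdf))
```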

  • For an intuitive geometric (coordinate-free) approach, look at Section 1.2 of the excellent text The Coordinate-Free Approach to Linear Models by Michael J. Wichura (the technical details are filled in by Theorem 8.2), where the author compares the traditional matrix proof (given in whuber's answer) with his projection approach and argues that the geometric approach is more natural and less obscure. Personally, I find this proof insightful and succinct.
    – Zhanxiong
    Commented Dec 17, 2017 at 4:25

3 Answers


For $k=1, 2, \ldots, n-1$, define

$$X_k = (Z_1 + Z_2 + \cdots + Z_k - kZ_{k+1})/\sqrt{k+k^2}.$$

The $X_k$, being linear transformations of multinormally distributed random variables $Z_i$, also have a multinormal distribution. Note that

  1. The variance-covariance matrix of $(X_1, X_2, \ldots, X_{n-1})$ is the $(n-1)\times(n-1)$ identity matrix.

  2. $X_1^2 + X_2^2 + \cdots + X_{n-1}^2 = \sum_{i=1}^n (Z_i-\bar Z)^2.$

$(1)$, which is easy to check, directly implies $(2)$ upon observing all the $X_k$ are uncorrelated with $\bar Z.$ The calculations all come down to the fact that $1+1+\cdots+1 - k = 0$, where there are $k$ ones.

Together these show that $\sum_{i=1}^n(Z_i-\bar Z)^2$ has the distribution of the sum of squares of $n-1$ uncorrelated (hence, being jointly Normal, independent) standard Normal variables. By definition, this is the $\chi^2(n-1)$ distribution, QED.
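
A numerical illustration, separate from the argument above: the sketch below checks observation $(1)$ approximately and identity $(2)$ exactly for the $X_k$ defined at the top (the values of $n$, the number of replications, and the seed are arbitrary).

```python
# Numerical check of observations (1) and (2) for the X_k defined above; not a proof.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 6, 200_000
Z = rng.standard_normal((reps, n))

# X_k = (Z_1 + ... + Z_k - k Z_{k+1}) / sqrt(k + k^2),  k = 1, ..., n-1
X = np.column_stack([
    (Z[:, :k].sum(axis=1) - k * Z[:, k]) / np.sqrt(k + k**2)
    for k in range(1, n)
])

# (1) the sample covariance of (X_1, ..., X_{n-1}) is close to the identity matrix
print(np.abs(np.cov(X, rowvar=False) - np.eye(n - 1)).max())   # ~ 0

# (2) the identity sum_k X_k^2 = sum_i (Z_i - Zbar)^2 holds for every sample
lhs = (X ** 2).sum(axis=1)
rhs = ((Z - Z.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
print(np.abs(lhs - rhs).max())                                 # ~ 0 (floating-point error)
```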

References

  1. For an explanation of where the construction of $X_k$ comes from, see the beginning of my answer at How to perform isometric log-ratio transformation concerning Helmert matrices.

  2. This is a simplification of the general demonstration given in ocram's answer at Why is RSS distributed chi square times n-p. That answer asserts "there exists a matrix" to construct the $X_k$; here, I exhibit such a matrix.

  • This construction has a simple geometric interpretation. (1) The variables $Z_i$ follow an $n$-dimensional spherically symmetric distribution (so we can rotate it any way we like). (2) $\overline{Z}$ is found as the solution to the linear problem $Z_i = \overline{Z} + \epsilon_i$, which is effectively a projection of the vector $\mathbf{Z}$ onto $\mathbf{1}$. (3) If we rotate the coordinate space so that one of the coordinates coincides with this projection vector $\mathbf{1}$, then the remainder is an $(n-1)$-dimensional multinormal distribution representing the residual space. Commented Nov 13, 2017 at 22:17
  • You show that the $X_i$'s are uncorrelated with each other. But as far as I understand, to say that a sum of squared standard normal variables is $\chi^2$, we need independence, which is a much stronger requirement than being uncorrelated? EDIT: oh wait, if we know that the variables are jointly normally distributed, then uncorrelated implies independent.
    – user56834
    Commented Nov 14, 2017 at 6:09
  • Also, I don't understand how you go from the fact that the $X_i$'s are uncorrelated with $\bar Z$ (which I do understand), to (2). Could you elaborate?
    – user56834
    Commented Nov 14, 2017 at 6:14
  • @Programmer Sorry; I didn't mean to imply it's a logical deduction: (1) and (2) are two separate observations. (2) is merely a (straightforward) algebraic identity.
    – whuber
    Commented Nov 14, 2017 at 14:39
  • Programmer, note the link to the other answer that whuber gave ( stats.stackexchange.com/questions/259208/… ). The $X_k$ are constructed from a matrix $H$ with orthonormal rows, so you can evaluate $\sum_k X_k^2$ in a more abstract (less fallible) way: writing $X = HZ$, we get $X \cdot X = (HZ) \cdot (HZ) = (HZ)^\top (HZ) = Z^\top (H^\top H) Z = Z^\top I Z = Z \cdot Z$ (note that $H$ has to be extended by the normalized all-ones row to make it $n \times n$, so that $X$ gains the extra component $\sqrt{n}\,\bar Z$). Commented Nov 17, 2017 at 17:16

Note that the $Z_i$'s are i.i.d. standard normal, i.e. $N(0,1)$ with $\mu=0$ and $\sigma=1$.

Then $Z_i^2\sim \chi^2_{(1)}$

Then \begin{align}\sum_{i=1}^n Z_i^2&=\sum_{i=1}^n(Z_i-\bar{Z}+\bar{Z})^2=\sum_{i=1}^n(Z_i-\bar{Z})^2+n\bar{Z}^2\\&=\sum_{i=1}^n(Z_i-\bar{Z})^2+\left[\frac{\sqrt{n}(\bar{Z}-0)}{1}\right ]^2 \tag{1} \end{align}

Note that the left hand side of (1), $$\sum_{i=1}^n Z_i^2\sim\chi^2_{(n)}$$ and that the second term on the right hand side $$\left[\frac{\sqrt{n}(\bar{Z}-0)}{1}\right ]^2 \sim\chi^2_{(1)}.$$

Furthermore, $\operatorname{Cov}(Z_i-\bar Z,\bar Z)=0$ for each $i$, so that, by joint normality, the vector $(Z_1-\bar Z,\ldots,Z_n-\bar Z)$ is independent of $\bar Z$. Therefore the last two terms in (1) (functions of $Z_i-\bar Z$ and of $\bar Z$, respectively) are also independent. Their mgfs are therefore related to the mgf of the left-hand side of (1) through $$ M_n(t) = M_{n-1}(t)M_1(t), $$ where $M_n(t)=(1-2t)^{-n/2}$ and $M_1(t)=(1-2t)^{-1/2}$. The mgf of $\sum_{i=1}^n(Z_i-\bar{Z})^2$ is therefore $M_{n-1}(t)=M_n(t)/M_1(t)=(1-2t)^{-(n-1)/2}$. Thus, $\sum_{i=1}^n(Z_i-\bar{Z})^2$ is chi-square with $n-1$ degrees of freedom.
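
An empirical illustration (not a proof) of identity (1), the independence claim, and the resulting mgf; the choices of $n$, the number of replications, and the evaluation point $t$ below are arbitrary.

```python
# Empirical illustration of identity (1), the independence claim, and the mgf argument.
import numpy as np

rng = np.random.default_rng(2)
n, reps = 5, 200_000
Z = rng.standard_normal((reps, n))
Zbar = Z.mean(axis=1)

ss_within = ((Z - Zbar[:, None]) ** 2).sum(axis=1)   # sum_i (Z_i - Zbar)^2
ss_mean = n * Zbar ** 2                              # [sqrt(n)(Zbar - 0)/1]^2 ~ chi^2_1
total = (Z ** 2).sum(axis=1)                         # sum_i Z_i^2 ~ chi^2_n

print(np.abs(ss_within + ss_mean - total).max())     # identity (1): ~ 0
print(np.corrcoef(ss_within, ss_mean)[0, 1])         # ~ 0, consistent with independence

# Empirical mgf of sum_i (Z_i - Zbar)^2 at t = 0.2 versus (1 - 2t)^{-(n-1)/2}
t = 0.2
print(np.exp(t * ss_within).mean(), (1 - 2 * t) ** (-(n - 1) / 2))
```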

  • The last "Therefore" is too careless.
    – Zhanxiong
    Commented Nov 7, 2017 at 3:09
  • The independence can be seen from the fact that the standard deviation is independent of $\bar{X}$.
    – Deep North
    Commented Nov 7, 2017 at 3:35
  • "Standard deviation independent of $\bar{X}$"? Maybe what you wanted to say is that $\sum Z_i^2$ is independent of $\bar{Z}$. Unfortunately, this is not true. What really holds is that $\sum (Z_i - \bar{Z})^2$ is independent of $\bar{Z}$, which is itself part of the proof we need to complete (rather than something we may use when showing this proposition).
    – Zhanxiong
    Commented Nov 7, 2017 at 3:54
  • I think I used Cochran’s Theorem.
    – Deep North
    Commented Nov 7, 2017 at 4:26
  • @DeepNorth I filled in some missing pieces in your proof. Commented Nov 7, 2017 at 16:45

This classical question brings me back to three different snapshots of my own student life :).

When I was a college junior, my professor proved this fundamental theorem (I believe it is the first big theorem I learned in my statistics career, which is why I remember this proof so clearly) as follows:

Define the transformation \begin{align*} \mathbf{Y} := \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{n}} & \frac{1}{\sqrt{n}} & \cdots & \frac{1}{\sqrt{n}} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{bmatrix} =: O\mathbf{Z}, \tag{1}\label{1} \end{align*} where the entries $a_{ij}$, $2 \leq i \leq n$, $1 \leq j \leq n$, are chosen so that the matrix $O$ is orthogonal. It then follows that \begin{align*} \sum_{i = 1}^n(Z_i - \bar{Z})^2 = \sum_{i = 1}^nZ_i^2 - n\bar{Z}^2 = \sum_{i = 1}^nY_i^2 - Y_1^2 = Y_2^2 + \cdots + Y_n^2. \tag{2}\label{2} \end{align*} The second equality holds because an orthogonal transformation preserves the vector norm and because the first row of $O$ gives $Y_1 = \sqrt{n}\bar{Z}$. The result now follows because $Y_2^2 + \cdots + Y_n^2$ is the sum of $n - 1$ squared i.i.d. $N(0, 1)$ random variables, a consequence of $\mathbf{Y} \sim N_n(0, I_{(n)})$, which in turn follows from $\mathbf{Z} \sim N_n(0, I_{(n)})$ by assumption.
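
As a concrete footnote (my own illustration, not the professor's construction): the rows $a_{ij}$ are not unique, and one way to produce such an $O$ numerically is to orthonormalize, e.g. via a QR decomposition, any basis whose first vector is $(1,\ldots,1)^\top/\sqrt{n}$. The sketch below does this and verifies identity $\eqref{2}$.

```python
# Build an orthogonal O whose first row is (1/sqrt(n), ..., 1/sqrt(n)) and verify (2).
import numpy as np

def orthogonal_with_constant_first_row(n):
    """Return an n x n orthogonal matrix whose first row is the constant row 1/sqrt(n)."""
    A = np.eye(n)
    A[:, 0] = 1.0 / np.sqrt(n)       # first column of A is e / sqrt(n)
    Q, _ = np.linalg.qr(A)           # orthonormalize the columns of A
    O = Q.T
    return O if O[0, 0] > 0 else -O  # fix the sign so the first row is +1/sqrt(n)

rng = np.random.default_rng(3)
n = 6
O = orthogonal_with_constant_first_row(n)
Z = rng.standard_normal(n)
Y = O @ Z

print(np.allclose(O @ O.T, np.eye(n)))                                # O is orthogonal
print(np.allclose(Y[0], np.sqrt(n) * Z.mean()))                       # Y_1 = sqrt(n) * Zbar
print(np.allclose((Y[1:] ** 2).sum(), ((Z - Z.mean()) ** 2).sum()))   # identity (2)
```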

While this proof is indeed easy to follow and very short (I was amazed as a college student), as I grew up I came to feel that transformation $\eqref{1}$ is rather contrived and mysterious (my professor never explained what motivated him to write down transformation $\eqref{1}$). By that time my linear algebra skills had also improved, so I came up with the following proof by myself; it essentially reverses the logical order of $\eqref{1}$ and $\eqref{2}$ and seems more natural (although it does summon slightly more advanced machinery):

In matrix form, $\sum_{i = 1}^n(Z_i - \bar{Z})^2 = \mathbf{Z}^\top P\mathbf{Z}$, where $P = I_{(n)} - n^{-1}\mathbf{e}\mathbf{e}^\top$ (with $\mathbf{e}$ the all-ones column vector) is a symmetric idempotent matrix of rank $n - 1$. Therefore, there exists an orthogonal matrix $O$ of order $n$ such that $P = O^\top\operatorname{diag}(I_{(n - 1)}, 0)O$. Denoting $O\mathbf{Z}$ by $\mathbf{Y} \sim N_n(0, I_{(n)})$, it follows that \begin{align} \mathbf{Z}^\top P\mathbf{Z} = (O\mathbf{Z})^\top \operatorname{diag}(I_{(n - 1)}, 0)(O\mathbf{Z}) = \mathbf{Y}^\top\operatorname{diag}(I_{(n - 1)}, 0)\mathbf{Y} = \sum_{i = 1}^{n - 1}Y_i^2 \sim \chi^2_{n - 1}. \end{align}
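
A quick numerical check (not needed for the proof) of the facts about $P$ used here; $n$ is arbitrary, and note that np.linalg.eigh returns eigenvalues in ascending order, so the single zero eigenvalue is listed first rather than last.

```python
# Check that P = I - ee^T / n is symmetric, idempotent, of rank n - 1,
# with eigenvalues consisting of n - 1 ones and a single zero.
import numpy as np

n = 6
e = np.ones((n, 1))
P = np.eye(n) - (e @ e.T) / n

print(np.allclose(P, P.T), np.allclose(P @ P, P))   # symmetric and idempotent
print(np.linalg.matrix_rank(P))                     # n - 1
print(np.round(np.linalg.eigh(P)[0], 10))           # eigenvalues: one 0 and n - 1 ones
```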

In this second proof, transformation $\eqref{1}$ is rediscovered (in an implicit way) guided by linear algebra theory (the canonical form of a symmetric idempotent matrix). Well, although it may not look as "easy and simple" as the first one, it satisfied me.

A few years later, when I studied the textbook The Coordinate-Free Approach to Linear Models for a PhD-level linear models course, the author's geometric treatment was an eye-opener:

Identify $\sum_{i = 1}^n(Z_i - \bar{Z})^2 = (\mathbf{Z} - \bar{Z}\mathbf{e})^\top(\mathbf{Z} - \bar{Z}\mathbf{e})$ and view $\mathbf{Z}$ as a vector in the inner product space $(\mathbb{R}^n, \langle ., .\rangle)$, where $\langle\mathbf{x}, \mathbf{y}\rangle = \mathbf{x}^\top\mathbf{y}$. Let $M$ denote the 1-dimensional subspace spanned by the vector $\mathbf{e}$, then $\mathbf{Z} - \bar{Z}\mathbf{e} = P_{M^\perp}\mathbf{Z}$, where "$P_S\mathbf{x}$" stands for the orthogonal projection of $\mathbf{x}$ onto subspace $S$, whence $$\sum_{i = 1}^n(Z_i - \bar{Z})^2 = \|P_{M^\perp}\mathbf{Z}\|^2 \sim \chi_{\dim(M^\perp)}^2, $$ which is $\chi^2_{n - 1}$ as $\dim(M^\perp) = n - 1$.
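
To make the projection concrete, here is a small numerical sketch (my own addition, with arbitrary $n$ and seed): with $\langle\mathbf{x}, \mathbf{y}\rangle = \mathbf{x}^\top\mathbf{y}$, the orthogonal projector onto $M = \operatorname{span}\{\mathbf{e}\}$ is $\mathbf{e}(\mathbf{e}^\top\mathbf{e})^{-1}\mathbf{e}^\top$ and $P_{M^\perp} = I - P_M$; the code verifies $\mathbf{Z} - \bar{Z}\mathbf{e} = P_{M^\perp}\mathbf{Z}$ and that $\dim(M^\perp) = \operatorname{tr}(P_{M^\perp}) = n - 1$.

```python
# Verify Z - Zbar*e = P_{M-perp} Z and dim(M-perp) = trace(P_{M-perp}) = n - 1.
import numpy as np

rng = np.random.default_rng(4)
n = 6
e = np.ones((n, 1))

P_M = e @ np.linalg.inv(e.T @ e) @ e.T   # orthogonal projector onto M = span{e}
P_Mperp = np.eye(n) - P_M                # orthogonal projector onto M-perp

Z = rng.standard_normal(n)
print(np.allclose(P_Mperp @ Z, Z - Z.mean() * np.ones(n)))   # equals Z - Zbar * e
print(np.trace(P_Mperp))                                     # n - 1 = dim(M-perp)
```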

From a rigorous mathematical standpoint, the last "$\sim$" step requires elaboration (it is supplied by Theorem 8.2 of the same reference), which lengthens the full proof a little. Nevertheless, it is the geometric perspective (i.e., treating the sample as a point in an inner product space and introducing the concept of orthogonal projection, which demystifies transformation $\eqref{1}$ by endowing it with a tangible geometric meaning) that is worth learning and useful for tackling more difficult problems (e.g., the proof of Cochran's theorem and inference problems arising in linear regression models).
