$\begingroup$

For an $n$-dimensional random vector $\mathbf{x}$, the $n\times n$ correlation matrix $\mathbf{R}$ is defined as (see https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix)

\begin{equation} \mathbf{R} = {E}\big[(\mathbf{x}-E(\mathbf{x}))(\mathbf{x}-E(\mathbf{x}))^T\big]\tag{1a} \end{equation}

where $E(\cdot)$ is the expectation operator. If $E(\mathbf{x})=0$, the correlation matrix $\mathbf{R}$ reduces to

\begin{equation} \mathbf{R} = {E}\big[\mathbf{x}\mathbf{x}^T\big]\tag{1b} \end{equation}

An estimate of $\mathbf{R}$, call it $\mathbf{R_{xx}}$, can be computed by collecting $N$ independent $n$-dimensional sample vectors $\mathbf{x}_i$ (http://perso-math.univ-mlv.fr/users/banach/workshop2010/talks/Vershynin.pdf):

\begin{equation} \mathbf{R_{xx}} = \frac{1}{(N-1)}\sum_{i=1}^{N} \mathbf{x}_i\mathbf{x}_i^T \tag{2} \end{equation}

My questions are:

  1. What is $rank(\mathbf{R})$?
  2. What is $rank(\mathbf{R_{xx}})$ when $N \gg n$?

From (1b), $rank(\mathbf{R})$ should be 1. For (2), I searched for "rank of sum of rank-1 matrices" and found the post "Rank of sum of rank-1 matrices", which essentially says that the rank of a sum of rank-1 matrices can be as high as $n$ for independent vectors. These two conclusions conflict, and I cannot see what I am missing.

$\endgroup$

2 Answers

$\begingroup$

$rank(\mathbf{R})$ equals the number of linearly independent random variables in $\mathbf{x}$. If $\mathbf{R}$ is full rank ($rank(\mathbf{R}) = n$), all components of $\mathbf{x}$ are linearly independent. If $rank(\mathbf{R}) = k \lt n$, there are only $k$ linearly independent random variables in $\mathbf{x}$, and the other $n-k$ can be constructed as linear combinations of them.

Your equation (1b) doesn't lead to $rank(\mathbf{R}) = 1$: each realization $\mathbf{x}\mathbf{x}^T$ has rank 1, but the expectation averages over all realizations. Under certain conditions (for example, $\mathbf{x}_i$ i.i.d. normal), your equation (2) approaches $\mathbf{R}$ as $N$ grows, and $rank(\mathbf{R_{xx}})$ approaches $rank(\mathbf{R})$.
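
As a quick numerical sanity check, here is a minimal sketch, assuming NumPy and zero-mean Gaussian samples. The helper name `sample_correlation`, the dimension $n=4$, and the dependency $x_4 = x_1 - 2x_2$ are illustrative choices of mine, not something from your question. It computes equation (2) for several values of $N$ and reports the numerical rank.

```python
import numpy as np

def sample_correlation(X):
    """Equation (2): X has shape (N, n), one zero-mean sample vector per row."""
    N = X.shape[0]
    return X.T @ X / (N - 1)

rng = np.random.default_rng(0)
n = 4

# Case 1: independent components, so the true R is the identity (rank n).
ranks_full = {}
for N in (2, 3, 4, 1000):
    X = rng.standard_normal((N, n))
    ranks_full[N] = int(np.linalg.matrix_rank(sample_correlation(X)))

# Case 2: the 4th component is a linear combination of the first two,
# so the true R has rank n - 1 = 3 no matter how many samples are used.
Z = rng.standard_normal((1000, n - 1))
X_dep = np.column_stack([Z, Z[:, 0] - 2.0 * Z[:, 1]])
rank_dependent = int(np.linalg.matrix_rank(sample_correlation(X_dep)))

print("independent components:", ranks_full)
print("one linear dependency, N=1000:", rank_dependent)
```

With independent components the rank of the estimate should come out as $\min(N, n)$: a sum of $N$ rank-1 terms cannot exceed rank $N$, and it saturates at $n$. With the built-in linear dependency the rank stays at $n-1$ even for large $N$.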

$\endgroup$
  • $\begingroup$ I consider $\mathbf{x}$ as a column vector $\mathbf{x}=[x_1,x_2,\cdots,x_n]^T$. When you say "number of independent random variables in $\mathbf{x}$" you imply $x_i$ being independent? $\endgroup$
    – NAASI
    Commented Jan 11, 2017 at 20:39
  • $\begingroup$ When you wrote down equation (1), it means that $\mathbf{x}$ is a column vector of random variables. Each component $\mathbf{x}_i$ of $\mathbf{x}$ is a single random variable. $\mathbf{R}$ is the correlation matrix of these random variables. If $\mathbf{R}$ is not full rank, that means there are one or more linear dependencies among the $\mathbf{x}_i$'s. $\endgroup$
    – Guangliang
    Commented Jan 11, 2017 at 20:51
$\begingroup$

To answer your question, you need to make assumptions on the statistics of $\mathbf{x}$. So far, you have not said anything about them. In general, unless different random variables $x_n$ in your vector $\mathbf x$ are linearly dependent, $$\mathbf{R} = {\mathbb{E}}\{\mathbf{x}\mathbf{x}^{\rm T}\}$$ will turn out to have full rank. That's because the expectation performs an ensemble average so it's like averaging all possible realizations of $\mathbf x \mathbf x^{\rm T}$. Unless they are somehow dependent, averaging many rank one matrices provides a full rank matrix.
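
To make this concrete, here is a small worked example under an assumption I am adding purely for illustration: $n=2$, with $x_1$ and $x_2$ independent, zero-mean and unit-variance. Every single realization gives $$\mathbf{x}\mathbf{x}^{\rm T} = \begin{bmatrix} x_1^2 & x_1 x_2 \\ x_1 x_2 & x_2^2 \end{bmatrix}, \qquad \det\left(\mathbf{x}\mathbf{x}^{\rm T}\right) = x_1^2 x_2^2 - (x_1 x_2)^2 = 0,$$ so each realization has rank at most one. Taking the expectation entry by entry, however, gives $$\mathbf{R} = \begin{bmatrix} \mathbb{E}\{x_1^2\} & \mathbb{E}\{x_1 x_2\} \\ \mathbb{E}\{x_1 x_2\} & \mathbb{E}\{x_2^2\} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$ which has rank two. If instead $x_2 = a\,x_1$ for some constant $a$, then $$\mathbf{R} = \begin{bmatrix} 1 & a \\ a & a^2 \end{bmatrix}, \qquad \det(\mathbf{R}) = a^2 - a^2 = 0,$$ and the rank drops to one. The expectation of rank-one matrices is therefore rank one only when the components are linearly dependent.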

If your stochastic process is additionally stationary and ergodic, you can replace the ensemble average by an average over time, using subsequent realizations $\mathbf x_i$ via $\sum \mathbf x_i \mathbf x_i^{\rm T}$. Under the correct statistical assumptions you then have $$\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{i=1}^N \mathbf x_i \mathbf x_i^{\rm T} = \mathbf R.$$ You can then expect that for $N \gg n$, your sample estimate is full rank. This is hard to prove rigorously, since you could always be unlucky with the $N$ samples you drew, but that is very unlikely. In fact, it is possible to bound the probability that your sample covariance matrix is rank deficient, and this bound decreases exponentially with $N$ as soon as $N\geq n$.

$\endgroup$
