
I am trying to test whether the covariance matrix of the maximum likelihood estimates for a Gaussian general linear model approaches the inverse Fisher information matrix (times $1/n$, $n$ being the sample size).

However, I am not sure how to compare two matrices. One approach would be to compare the eigenvalues. However, I find in my runs of the code that the eigenvalues of my covariance matrix vary quite wildly. I have tried using the spectral norm (i.e. the largest eigenvalue), but it is often the case that I have a single very large eigenvalue (say, of order unity) while the others are much smaller (of order $10^{-15}$). The eigenvalues of the Fisher information matrix tend to be quite consistent (about $10^{-6}$ in the same example, with the same number of samples, observations and covariates for comparison).

So how would one compare two matrices, and relatively easily? Is there a statistical test for this?

My thoughts are that:

If we work in a given basis, then the features defining a matrix are:

  1. Its rank

  2. The eigenvectors (and thus the subspace spanned by them)

  3. The eigenvalues

The simplest case for comparison would be if the rank and eigenvectors were the same, so that only the eigenvalues had to be compared. Otherwise, depending on what one means by 'how different the matrices are', one would need a metric that accounts for the difference in the eigenvectors (perhaps a measure of the overlap of the subspaces they span? But this does not capture all of the information!) and in the eigenvalues, and perhaps also whether corresponding eigenvalues belong to the same eigenvectors.

So I see that this is not straightforward and probably depends on what you want to do with the information. But in that case, how does one meaningfully compare matrices? I suppose in this case of MLE, we really mean an element-wise comparison.
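For concreteness, this is the kind of element-wise comparison I have in mind (a rough sketch using NumPy; `cov_mle` and `fisher_inv` are just placeholder names for the two matrices I want to compare):

```python
import numpy as np

def compare_elementwise(cov_mle, fisher_inv, rtol=0.1):
    """Summarise element-wise differences between two same-shaped matrices."""
    abs_diff = np.abs(cov_mle - fisher_inv)
    # guard against division by zero in the relative differences
    rel_diff = abs_diff / np.maximum(np.abs(fisher_inv), np.finfo(float).tiny)
    return {
        "max abs difference": abs_diff.max(),
        "max relative difference": rel_diff.max(),
        "Frobenius norm of difference": np.linalg.norm(cov_mle - fisher_inv, "fro"),
        "close element-wise": np.allclose(cov_mle, fisher_inv, rtol=rtol),
    }
```

But this only summarises raw discrepancies; it does not tell me whether a discrepancy is 'small' in any statistical sense.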

  • Have you thought about measuring the dissimilarity between the two matrices using some matrix norm of their difference ($\|A-B\|$)? More details about your problem would be needed to think about a statistical test. For example, how are you obtaining these matrices? Where does any kind of randomness/variability come into play that would let us think about how to define a null distribution?
    – user20160
    Commented Jul 18, 2019 at 16:57
  • See stats.stackexchange.com/a/469966/919 for one recently proposed approach.
    – whuber
    Commented Dec 30, 2020 at 14:20

2 Answers


My favorite tool for comparing covariances comes from Förstner W., Moonen B. A Metric for Covariance Matrices. In: Grafarend E.W., Krumm F.W., Schwarze V.S. (eds) Geodesy-The Challenge of the 3rd Millennium. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-05296-9_31

They derive a function, $d$, which measures distance on the space of all real, symmetric, positive definite matrices of any size $(n \times n)$.

$d$'s properties include:

  • $d(A,B) \ge 0$
  • $d(A,B) = 0 \Longleftrightarrow A = B$
  • $d(A,B) = d(B,A) = d(A^{-1},B^{-1})$
  • $d(A,B) \le d(A,C) + d(C,B)$ for any $C$ in the same space (the triangle inequality)
  • $d(A,B) = d(XAX^T, XBX^T)$ for any $X$ in $GL(n,\mathbb{R})$, so the value of $d$ is invariant with respect to affine transformations of the coordinate system

The distance $d$ may be calculated as the square root of the sum of the squares of the natural logarithms of the generalized eigenvalues of $A$ and $B$:

$$d(A,B)=\sqrt{\sum_{i=1}^n\ln^2 \lambda_i(A, B)}$$

The generalized eigenvalue problem is: given matrices $A$ and $B$, find all scalars $\lambda$ such that $\det(A - \lambda B) = 0$. The usual eigenvalue problem is the case $B = I$, the identity matrix. When $B$ is invertible, this is equivalent to finding the eigenvalues of the matrix $AB^{-1}$. (Or rather, it should be equivalent, but in numerical computation the answers are frequently not quite the same, and one obtains, say, $d(B,B) = 10^{-12}$ rather than the exact zero it ought to be.)
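For reference, here is a minimal sketch (assuming NumPy/SciPy; the function name is just illustrative) of how this distance can be computed via the generalized symmetric-definite eigenproblem:

```python
import numpy as np
from scipy.linalg import eigh  # solves the generalized symmetric-definite eigenproblem

def covariance_distance(A, B):
    """Forstner-Moonen distance between symmetric positive definite matrices A and B."""
    # eigh(A, B) returns the generalized eigenvalues lambda_i with det(A - lambda_i B) = 0
    lam = eigh(A, B, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))
```

As a quick check, `covariance_distance(C, C)` should come out numerically close to zero for any positive definite `C`, and the value should be unchanged when both arguments are transformed as $XAX^T$ and $XBX^T$.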

The paper contains a short survey of other possible comparison computations and why they were rejected, and some incomplete attempts to construct an overly complicated proof for $n=2$, but then switches gears, and finds the answers pretty much just waiting there in Kobayashi & Nomizu, Foundations of Differential Geometry (1963).


The answer depends on where these matrices come from. Comparing an estimated covariance matrix with a reference covariance matrix is commonly done in latent variable modeling. For example, in Structural Equation Modeling (SEM) the loss function for maximum-likelihood-estimated models is:

$$F_{ML} = \log|\Sigma(\theta)| + \operatorname{tr}\!\left[\Sigma(\theta)^{-1}S\right] - \log|S| - p$$

Minimizing $F_{ML}$ minimizes the discrepancy between the model-implied covariance matrix $\Sigma(\theta)$, which serves as the estimate of the population covariance matrix $\Sigma$, and the observed sample covariance matrix $S$. Model fit statistics provide summaries of the discrepancies between these two matrices. David Kenny provides brief summaries of these statistics and cites the work needed to break them down further. Many model fit statistics exist that compare these matrices in different ways; you may be able to conform your situation to these.
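As a rough illustration (a sketch assuming NumPy, not code drawn from any SEM package), the discrepancy itself can be evaluated directly for any pair of positive definite matrices:

```python
import numpy as np

def f_ml(sigma_model, S):
    """ML discrepancy F_ML between a model-implied covariance and a sample covariance."""
    p = S.shape[0]
    _, logdet_model = np.linalg.slogdet(sigma_model)         # log|Sigma(theta)|
    _, logdet_S = np.linalg.slogdet(S)                        # log|S|
    trace_term = np.trace(np.linalg.solve(sigma_model, S))    # tr[Sigma(theta)^{-1} S]
    return logdet_model + trace_term - logdet_S - p
```

The value is zero when the two matrices coincide and grows as they diverge, which is what the fit statistics built on top of it summarise.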

To answer your question completely, more information would be needed regarding where they come from. However, this literature may give you a good starting point on exploring what options exist.

