
I have to compute $v^TH^{-1}v$ where $H$ is a very big matrix (which is a Hessian, thus symmetric and positive-definite). It occurred to me that I can do an approximation by assuming $v$ is an eigenvector of $H$ and then $v^TH^{-1}v\approx\frac{|v|^4}{v^THv}$, bypassing the matrix inversion. This uses the fact that if $v$ has eigenvalue $\lambda$ for $H$ it has eigenvalue $1/\lambda$ for $H^{-1}$, then: $v^TH^{-1}v=\frac{|v|^2}{\lambda}=\frac{|v|^2}{(v/|v|)^TH(v/|v|)}=\frac{|v|^4}{v^THv}$.
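For concreteness, here is a minimal numerical sketch of the idea (assuming NumPy, with a small random symmetric positive-definite matrix standing in for the Hessian), comparing the estimate against the exact value computed with a linear solve:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Random symmetric positive-definite H (stand-in for the Hessian).
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)

v = rng.standard_normal(n)

# Exact value of v^T H^{-1} v, via a solve rather than an explicit inverse.
exact = v @ np.linalg.solve(H, v)

# Proposed estimate |v|^4 / (v^T H v).
estimate = np.dot(v, v) ** 2 / (v @ H @ v)

print(exact, estimate)
assert estimate <= exact  # for SPD H the estimate never exceeds the exact value
```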

Is this approximation reasonable? If so, does it have a name? When is it good or bad compared to assuming a diagonal $H$?

  • You might estimate the error by considering $v\approx u+\Delta$ with $u$ being the closest eigenvector.
    – user619894
    Commented Jun 22, 2022 at 5:53

2 Answers


You can write $v=\sum_{i} a_i v_i$ where the $v_i$ are eigenvectors of $H$ (and in particular orthonormal) with corresponding eigenvalues $\lambda_i$. We know that $Hv_i=\lambda_i v_i$ and $H^{-1}v_i= H^{-1} \frac{H v_i}{\lambda_i}=\frac 1 {\lambda_i} H^{-1}H v_i=\frac{1}{\lambda_i} v_i$. We also know that $v_i^T v_j=1(i=j)$. Putting all that together \begin{align*} v^T H^{-1} v &=\sum_j\sum_i a_i a_j v_j^TH^{-1} v_i\\ &=\sum_j\sum_i \frac{a_i a_j}{\lambda_i} v_j^T v_i\\ &=\sum_i \frac{a_i^2}{\lambda_i}\\ \end{align*} which may help with your estimation. In particular, if you keep only one eigenvector, you will keep exactly one term of this sum; you may want to keep the one for which $\frac{a_i^2}{\lambda_i}$ is largest (this is in the spirit of what @user619894 is suggesting, but puts a value on the notion of being "close"). The more terms you include, the better the approximation gets.

  • Thanks for your response! I think this formula alone is sufficient when I approximate things with one of the eigenvectors, but I don't have access to them. Instead, the approximation comes from assuming that $v$ is an eigenvector, which it usually isn't. Then, to do error analysis, wouldn't I also need some decomposition of $|v|^4/(v^THv)$ as a function of $a_i,\lambda_i$? Thanks!
    – etal
    Commented Jun 22, 2022 at 18:21
  • Ok, I derived the rest :) Would you mind adding this to your answer and then I'll accept it? Let $Exact=\sum_i a_i^2/\lambda_i$. Let $Approx=\frac{|v|^4}{v^THv} = \frac{(\sum a_i^2)^2}{\sum \lambda_ia_i^2}$. Then $\frac{Exact}{Approx} = \frac{(\sum_i a_i^2/\lambda_i)(\sum_i \lambda_ia_i^2)}{(\sum_i a_i^2)^2} = \frac{\sum_i a_i^2/\lambda_i}{\sum a_i^2}\frac{\sum_i \lambda_ia_i^2}{\sum_i a_i^2} = \frac{\text{Weighted average of }\lambda_i \text{ by } a_i^2}{\text{Weighted harmonic average of }\lambda_i\text{ by }a_i^2}$. This shows it's always an underestimate and is better for close $\lambda_i$.
    – etal
    Commented Jun 22, 2022 at 19:07
  • Your analysis is correct, but I don't think this represents a definitive answer to your question; if you don't mind, I prefer to leave the opportunity for a better answer than yours and mine (even though one will most likely never come).
    – P. Quinton
    Commented Jun 23, 2022 at 6:51
  • Ok, no worries. I will write the answer myself citing yours then; I do think it is close to the answer I was looking for (is it reasonable? Yes, in the sense that it is well-correlated with the true answer and it never over-estimates. When is it poor? When the vector has components along directions of widely different eigenvalues, as defined by the difference between AM and HM, which is well understood). I will still not accept it in case something else comes, particularly on the comparison to assuming a diagonal Hessian.
    – etal
    Commented Jun 24, 2022 at 0:07

[Building on a comment by @user619894 and especially on @P.Quinton's answer].

TLDR: it is a reasonable estimate in the sense that it correlates with the true answer in a well-understood way, and it never over-estimates the result. Its error will be large when the vector has significant components along directions with widely different eigenvalues.

Derivation: Because $H$ is symmetric and positive-definite, we know that it diagonalizes in an orthonormal basis $v_i$ with eigenvalues $\lambda_i>0$; i.e. $Hv_i=\lambda_i v_i$, $H^{-1}v_i=\frac{1}{\lambda_i}v_i$, and $v_i^Tv_j=1(i=j)$.

Let's express $v=\sum_ia_iv_i$. Then the true answer is: $$\text{Answer} = v^TH^{-1}v = \sum_{i,j}a_ia_jv_j^TH^{-1}v_i = \sum_{i,j} \frac{a_ia_j}{\lambda_i} v_j^Tv_i = \sum_i \frac{a_i^2}{\lambda_i}.$$ Of course, we don't know the true answer because we know neither the $\lambda_i$ nor the $a_i$. But we can write the estimate as a function of both and see how they differ:

$$\text{Estimate} = \frac{|v|^4}{v^THv} = \frac{\left(\sum_ia_i^2\right)^2}{\sum_ia_i^2\lambda_i}.$$

Now we can look at the ratio: $$\frac{\text{Estimate}}{\text{Answer}} = \frac{\left(\sum_ia_i^2\right)^2}{\left(\sum_ia_i^2\lambda_i\right)\left(\sum_i \frac{a_i^2}{\lambda_i}\right)} = \frac{\sum_i a_i^2}{\sum_ia_i^2\lambda_i}\cdot\frac{\sum_i a_i^2}{\sum_i a_i^2\frac{1}{\lambda_i}} = \frac{\text{ Harmonic average of $\lambda_i$ weighted by }a_i^2}{\text{Arithmetic average of $\lambda_i$ weighted by }a_i^2}.$$

It is well known that for $\lambda_i>0$ the harmonic mean is always less than or equal to the arithmetic mean. Therefore, the estimate will always be an under-estimate. Furthermore, the approximation gets worse the more spread out the $\lambda_i$ are (among those with a significant component $a_i$ in $v$). Intuitively, the inverse cares more about the smallest eigenvalues.
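The ratio identity above can be checked numerically (a sketch assuming NumPy): the estimate-to-answer ratio matches the weighted HM/AM of the eigenvalues, and is therefore at most 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)        # symmetric positive-definite
v = rng.standard_normal(n)

lam, V = np.linalg.eigh(H)
a2 = (V.T @ v) ** 2            # weights a_i^2

answer = np.sum(a2 / lam)                      # exact v^T H^{-1} v
estimate = np.sum(a2) ** 2 / np.sum(a2 * lam)  # |v|^4 / (v^T H v)

am = np.sum(a2 * lam) / np.sum(a2)   # arithmetic mean of lambda_i, weighted by a_i^2
hm = np.sum(a2) / np.sum(a2 / lam)   # harmonic mean of lambda_i, weighted by a_i^2

assert np.isclose(estimate / answer, hm / am)
assert estimate <= answer            # HM <= AM, so always an under-estimate
```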

