
The Frobenius norm of an $m \times n$ matrix $F$ is defined as

$$\| F \|_F^2 := \sum_{i=1}^m\sum_{j=1}^n |f_{i,j}|^2$$

If I have $FG$, where $G$ is a $n \times p$ matrix, can we say the following?

$$\| F G \|_F^2 = \|F\|_F^2 \|G\|_F^2$$

Also, what does the Frobenius norm mean? Is it analogous to the magnitude of a vector, but for matrices?

  • In general, the Frobenius norm need not be multiplicative; it is only submultiplicative. Think of the Cauchy-Schwarz inequality.
    – user251257
    Commented Aug 11, 2015 at 15:49
  • Concerning the "meaning" of the norm: the set of $m \times n$ matrices is a vector space, which, up to a choice of notation, is just $\mathbb{R}^p$ for $p = mn$. The Frobenius norm is simply the natural length of the vector formed by stringing out the entries of the matrix into an $mn$-vector. Furthermore, the norm makes the set of matrices a normed linear space, which means you get all the excellent theorems that go with that structure; in particular, a nice theory of power series, etc. Commented Aug 12, 2015 at 1:57
  • If we had equality in all cases, then we'd have a normed algebra, but that is too greedy except in a few special cases, such as the matrices corresponding to the reals, complex numbers, or quaternions. In other cases, the existence of zero divisors in the algebra necessarily either spoils the multiplicativity of the norm or gives a multiplicative "norm" which isn't really a norm. Commented Aug 12, 2015 at 2:01
  • Thanks for the nice answers! Is it possible to calculate a correction factor to recover equality?
    – shouldsee
    Commented Apr 18, 2022 at 13:42

4 Answers


Actually, only the inequality $$\|FG\|^2_F \leqslant \|F\|^2_F\,\|G\|^2_F$$ holds in general. The proof is as follows. \begin{align} \|FG\|^2_F&=\sum\limits_{i=1}^{m}\sum\limits_{j=1}^{p}\left|\sum\limits_{k=1}^nf_{ik}g_{kj}\right|^2 \\ &\leqslant\sum\limits_{i=1}^{m}\sum\limits_{j=1}^{p}\left(\sum\limits_{k=1}^n|f_{ik}|^2\sum\limits_{k=1}^n|g_{kj}|^2\right)\tag{Cauchy-Schwarz} \\ &=\sum\limits_{i=1}^{m}\sum\limits_{j=1}^{p}\left(\sum\limits_{k,l=1}^n|f_{ik}|^2|g_{lj}|^2\right) \\ &=\sum\limits_{i=1}^{m}\sum\limits_{k=1}^{n}|f_{ik}|^2\sum\limits_{l=1}^{n}\sum\limits_{j=1}^{p}|g_{lj}|^2 \\ &=\|F\|^2_F\|G\|^2_F \end{align} The Frobenius norm treats the matrix as one long vector of its entries, so it is the matrix analogue of the Euclidean ($\ell_2$) vector norm.
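As a quick numerical sanity check of this inequality (an illustration only, not part of the proof; NumPy, the seed, and the matrix sizes are my arbitrary choices):

```python
# Sanity check (illustration only): the Frobenius norm is submultiplicative,
# ||F G||_F <= ||F||_F * ||G||_F.  Sizes m, n, p are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 4, 5, 3
for _ in range(100):
    F = rng.standard_normal((m, n))
    G = rng.standard_normal((n, p))
    lhs = np.linalg.norm(F @ G, "fro")
    rhs = np.linalg.norm(F, "fro") * np.linalg.norm(G, "fro")
    assert lhs <= rhs + 1e-12  # small tolerance for floating-point error
```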

  • I wonder, in general, for which matrices equality is attained. Commented Aug 12, 2015 at 0:55
  • When Cauchy-Schwarz holds with equality. Commented Aug 12, 2015 at 1:01
  • Yes; how can we otherwise characterize such matrices? (Your answer is appropriate to the question asked, I think, as the OP wants a general identity.) Commented Aug 12, 2015 at 1:25
  • I could ask a stand-alone question so this one does not get further cluttered. Commented Aug 12, 2015 at 1:50
  • The equality discussion of Cauchy-Schwarz yields that we have equality if and only if $F = x \otimes y^t$ and $G = y \otimes z^t$ for vectors $x, y, z$. In particular, $F$ and $G$ have rank one.
    – Kofi
    Commented Sep 4, 2019 at 23:20
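The rank-one equality case described in the comment above can be checked numerically (a sketch with arbitrary vectors; NumPy and the seed are my choices):

```python
# Equality case: for rank-one F = x y^T and G = y z^T (shared middle
# vector y), Cauchy-Schwarz is tight, so ||F G||_F = ||F||_F * ||G||_F.
import numpy as np

rng = np.random.default_rng(1)
x, y, z = rng.standard_normal(4), rng.standard_normal(5), rng.standard_normal(3)
F = np.outer(x, y)   # rank-one 4 x 5 matrix
G = np.outer(y, z)   # rank-one 5 x 3 matrix
lhs = np.linalg.norm(F @ G, "fro")
rhs = np.linalg.norm(F, "fro") * np.linalg.norm(G, "fro")
assert np.isclose(lhs, rhs)  # equality, up to floating-point rounding
```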

Let $A$ be $m \times r$ and $B$ be $r \times n$. A better bound here is $$ \| A B \|_F \le \|A\| \|B\|_F \quad (*) $$ where $\|A\|$ is the $\ell_2$ operator norm: $$ \|A\| = \max_{\|x\|_2\, \le\, 1} \|A x\|_2, $$ which also equals the largest singular value of $A$. From this definition, $\|A x\|_2 \le \|A\| \|x\|_2$ for any vector $x \in \mathbb R^r$.

Since $\|A\| \le \|A\|_F$, inequality (*) is sharper than the submultiplicative inequality for the Frobenius norm.

To see the inequality, let $B = [b_1 \mid b_2 \mid \cdots \mid b_n]$ be the column decomposition of $B$. Then, $A B = [Ab_1 \mid A b_2 \mid \dots \mid Ab_n]$ is the column decomposition of $AB$. It follows that \begin{align*} \| A B \|_F^2 = \sum_{j=1}^n \|A b_j\|^2 \le \|A\|^2 \sum_j \|b_j\|^2 = \|A\|^2 \|B\|_F^2. \end{align*}
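A numerical illustration of the bound (*) and of the comparison $\|A\| \le \|A\|_F$ (the shapes and seed are arbitrary choices):

```python
# Check of the sharper bound (*): ||A B||_F <= ||A|| * ||B||_F, where
# ||A|| = sigma_max(A) is the operator (spectral) norm, together with
# the comparison ||A|| <= ||A||_F.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))
op = np.linalg.norm(A, 2)              # largest singular value of A
fro_AB = np.linalg.norm(A @ B, "fro")
assert fro_AB <= op * np.linalg.norm(B, "fro") + 1e-12
assert op <= np.linalg.norm(A, "fro") + 1e-12
```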


EDIT, in response to the question in the comments, "Is there a lower bound for the Frobenius norm of the product of two matrices?": In general, no, except for the obvious lower bound of zero. Consider the following two matrices: \begin{align} A = \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} -b & 0 \\ a & 0 \end{pmatrix}. \end{align} Then $\|A\|_F = \|B\|_F = \sqrt{a^2 + b^2}$, while $\|AB\|_F = 0$.

What if the two matrices are symmetric? Consider \begin{align} A = \begin{pmatrix} a & b \\ b & a \end{pmatrix}, \quad B = \begin{pmatrix} -b & a \\ a & -b \end{pmatrix}, \quad A B = \begin{pmatrix} 0 & a^2-b^2 \\ a^2-b^2 & 0 \end{pmatrix}. \end{align} Then, $\|A\|_F^2 = \|B\|_F^2 = 2(a^2 + b^2)$ while $\|AB\|_F^2 = 2(a^2 - b^2)^2$ which can be made arbitrarily smaller than either of $\|A\|_F^2$ or $\|B\|_F^2$. For example, take $a=b$.
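The two counterexamples above can be verified numerically (taking $a = b = 1$, an arbitrary instance):

```python
# Both counterexamples with a = b = 1: the factors have nonzero
# Frobenius norm, yet the product vanishes, so no nontrivial lower
# bound in terms of ||A||_F and ||B||_F alone is possible.
import numpy as np

a, b = 1.0, 1.0
A = np.array([[a, b], [0.0, 0.0]])
B = np.array([[-b, 0.0], [a, 0.0]])
assert np.linalg.norm(A, "fro") > 0 and np.linalg.norm(B, "fro") > 0
assert np.linalg.norm(A @ B, "fro") == 0.0

As = np.array([[a, b], [b, a]])        # symmetric example
Bs = np.array([[-b, a], [a, -b]])
assert np.isclose(np.linalg.norm(As @ Bs, "fro"), 0.0)  # since a == b
```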

  • Nice answer, thanks! Is there a (perhaps sharp or non-trivial) lower bound for the Frobenius norm of the product of two matrices? Commented Mar 6, 2020 at 10:34
  • @Mathmath, no. See my edit to the response.
    – passerby51
    Commented Mar 6, 2020 at 18:06

In case anyone is curious, there is also a lower bound, of a form similar to the upper bound in @passerby51's answer; this result is used in an ICLR paper and may be useful. Let $\mathbf{A}$ be $m \times r$ and $\mathbf{B}$ be $r\times n$. The bounds are $$\sigma_{\min }(\mathbf{A})\|\mathbf{B}\|_{F} \leq \|\mathbf{A B}\|_{F} \leq \sigma_{\max }(\mathbf{A})\|\mathbf{B}\|_{F},$$ where $\sigma_\min$ and $\sigma_\max$ denote the minimum and maximum singular values of $\mathbf{A}$, respectively.

To prove it, we start from the definition of the Frobenius norm: $$ \|\mathbf{A B}\|_{F}^{2}=\operatorname{trace}\left(\mathbf{A B} \mathbf{B}^{\top} \mathbf{A}^{\top}\right)=\operatorname{trace}\left(\mathbf{A}^{\top} \mathbf{A B} \mathbf{B}^{\top}\right). $$ Using the inequalities for the matrix trace (see the reference below), namely $ \lambda_{\min }(A) \operatorname{tr}(B) \leq \operatorname{tr}(A B) \leq \lambda_{\max }(A) \operatorname{tr}(B)$ for symmetric $A$ and positive semidefinite $B$, we have $$ \lambda_{\min }\left(\mathbf{A}^{\top} \mathbf{A}\right) \operatorname{trace}\left(\mathbf{B} \mathbf{B}^{\top}\right) \leq \operatorname{trace}\left(\mathbf{A}^{\top} \mathbf{A B} \mathbf{B}^{\top}\right) \leq \lambda_{\max }\left(\mathbf{A}^{\top} \mathbf{A}\right) \operatorname{trace}\left(\mathbf{B} \mathbf{B}^{\top}\right), $$ where $\lambda_\min$ and $\lambda_\max$ denote the minimum and maximum eigenvalues, respectively. This implies $$ \sigma_{\min }^2(\mathbf{A})\|\mathbf{B}\|_{F}^2 \leq \|\mathbf{A B}\|_{F}^2 \leq \sigma_{\max }^2(\mathbf{A})\|\mathbf{B}\|_{F}^2. $$ Taking square roots, which is valid since all quantities are non-negative, completes the proof.
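A numerical check of the two-sided bound (an illustration with arbitrary shapes; note that $\mathbf{A}$ is taken tall, $m \ge r$, so that $\sigma_{\min}(\mathbf{A})$ is the square root of $\lambda_{\min}(\mathbf{A}^\top\mathbf{A})$; for $m < r$ the lower bound is trivially zero):

```python
# Check of sigma_min(A) ||B||_F <= ||A B||_F <= sigma_max(A) ||B||_F.
# A is tall (m >= r) so sigma_min(A) = sqrt(lambda_min(A^T A)) > 0
# almost surely for random A.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 4))        # m x r, tall
B = rng.standard_normal((4, 3))        # r x n
s = np.linalg.svd(A, compute_uv=False) # singular values, descending
fro_B = np.linalg.norm(B, "fro")
fro_AB = np.linalg.norm(A @ B, "fro")
assert s[-1] * fro_B <= fro_AB + 1e-12  # lower bound, sigma_min
assert fro_AB <= s[0] * fro_B + 1e-12   # upper bound, sigma_max
```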

Reference:

Y. Fang, et al., Inequalities for the trace of matrix product.

  • This is a nice addition, with the trick of reducing the problem to the PSD case, which is easier to handle.
    – passerby51
    Commented Mar 8, 2023 at 17:59

We will prove that $$ \| FG \|_F^2 \leq \| F \|_F^2 \cdot \| G \|_F^2.$$ We have $$\| FG \|_F^2 = \mathsf{Tr}(FG G^TF^T) = \mathsf{Tr} (F^TFGG^T)$$ by the cyclic property of the trace. To prove the claim, it suffices to show that $$\mathsf{Tr}(F^TFGG^T) \leq \mathsf{Tr}(F^TF)\, \mathsf{Tr}(GG^T).$$ Observe that $F^TF$ and $GG^T$ are symmetric positive semidefinite matrices. Hence they are diagonalizable and can be written as $$ F^TF = U\Sigma_F U^\dagger, \qquad GG^T = V\Sigma_G V^\dagger,$$ where $U,V$ are unitary matrices and $\Sigma_F,\Sigma_G$ are diagonal matrices whose diagonal entries are the (non-negative) eigenvalues of $F^TF$ and $GG^T$, respectively. Again using the cyclic property of the trace, we can write the left-hand side as $$\mathsf{Tr}(F^TFGG^T) = \mathsf{Tr}(U\Sigma_F U^\dagger V\Sigma_G V^\dagger) = \mathsf{Tr}(V^\dagger U\Sigma_F U^\dagger V\Sigma_G ).$$

It is easy to show the following properties of diagonal matrices: Let $D$ be a diagonal matrix with non-negative diagonal entries.

1) Under any unitary transformation of $D$, the resulting matrix has non-negative diagonal entries.

2) If $M$ is a square matrix with non-negative diagonal entries, then $\mathsf{Tr}(MD)\leq \mathsf{Tr}(M) \cdot \mathsf{Tr} (D)$.

The above properties directly imply that $$\mathsf{Tr}(V^\dagger U\Sigma_F U^\dagger V\Sigma_G ) \leq \mathsf{Tr}(V^\dagger U\Sigma_F U^\dagger V) \cdot \mathsf{Tr}( \Sigma_G ) =\mathsf{Tr}( \Sigma_F ) \cdot \mathsf{Tr}( \Sigma_G ) = \mathsf{Tr}(F^TF) \mathsf{Tr}(GG^T),$$ where the last two equalities follow from the fact that trace is preserved under unitary transformation.
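The trace identities and the final bound used in this argument can be checked numerically (an illustration with arbitrary shapes and seed):

```python
# Check of the trace identities and the bound used above:
# ||F G||_F^2 = Tr(F G G^T F^T) = Tr(F^T F G G^T) <= Tr(F^T F) Tr(G G^T).
import numpy as np

rng = np.random.default_rng(4)
F = rng.standard_normal((4, 5))
G = rng.standard_normal((5, 3))
lhs = np.linalg.norm(F @ G, "fro") ** 2
t1 = np.trace(F @ G @ G.T @ F.T)   # cyclic property, form 1
t2 = np.trace(F.T @ F @ G @ G.T)   # cyclic property, form 2
assert np.isclose(lhs, t1) and np.isclose(t1, t2)
assert t2 <= np.trace(F.T @ F) * np.trace(G @ G.T) + 1e-9
```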

