
The Shannon entropy is the expected value of the negative log of a list of probabilities $ \{ x_1 , \dots , x_d\} $, i.e. $$ H(x)= -\sum\limits_{i=1}^d x_i \log x_i . $$ There are of course lots of nice interpretations of the Shannon entropy. What about the variance of $ -\log x_i $, $$ \sigma^2 (-\log x)=\sum\limits_{i=1}^d x_i (\log x_i )^2-\left( \sum\limits_{i=1}^d x_i \log x_i \right)^2 ? $$ Does this have any meaning, and has it been used in the literature?
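For concreteness, here is a minimal numerical sketch of both quantities (assuming NumPy and natural logarithms; the names `entropy` and `varentropy` are just illustrative labels):

```python
import numpy as np

def entropy(p):
    """H(p) = -sum_i p_i log p_i, with zero-probability terms dropped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def varentropy(p):
    """Var(-log p_i) = sum_i p_i (log p_i)^2 - (sum_i p_i log p_i)^2."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    logs = np.log(p)
    return np.sum(p * logs**2) - np.sum(p * logs)**2

p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p), varentropy(p))
# For a uniform distribution the surprise -log x_i is constant,
# so varentropy is exactly 0 while the entropy is log d.
```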


2 Answers


$\log (1/x_i)$ is sometimes known as the 'surprise' (e.g. in units of bits) of drawing the symbol with probability $x_i$. Since the symbol is drawn at random, the surprise is itself a random variable, and it carries all the operational meaning that any random variable does: the entropy is the average surprise, and higher moments of information are simply higher moments of the surprise.
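As a concrete two-point example of these moments: for a Bernoulli distribution $\{p, 1-p\}$, the mean surprise is the binary entropy, and a short computation gives $$ \sigma^2(-\log x) = p(1-p)\left( \log \frac{p}{1-p} \right)^2 , $$ which is zero at the uniform point $p=1/2$ (where the surprise is constant) and tends to zero in the deterministic limits $p \to 0, 1$.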

There is indeed a literature on using the variance of information measures (not of the surprise in this case, but of a divergence). Two good places to get started on the concept, called 'dispersion', are http://people.lids.mit.edu/yp/homepage/data/gauss_isit.pdf and http://arxiv.org/pdf/1109.6310v2.pdf

The application is clear: if you only know the expected value of a random variable, you know it only to first order. When you need tighter bounds, you need higher moments.
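To make "tighter bounds" concrete, one standard move (a generic Chebyshev bound, not specific to the papers above) is to apply Chebyshev's inequality to the surprise itself: $$ \Pr\Big( \big| -\log p(X) - H(X) \big| \ge t \Big) \le \frac{\sigma^2(-\log p(X))}{t^2} , $$ a concentration statement that the mean alone cannot give. The dispersion results linked above are sharper, Gaussian-type refinements of this kind of second-moment reasoning.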


More generally, the following paper discusses higher moments of information (though there does not seem to have been much follow-up work on it):

H. Jürgensen, D. E. Matthews, "Entropy and Higher Moments of Information", Journal of Universal Computer Science, vol. 16, no. 5 (2010).

Link here: http://www.jucs.org/jucs_16_5/entropy_and_higher_moments/jucs_16_05_0749_0794_juergensen.pdf

