Microstate interpretation of Shannon entropy

Question

Let ${A}$ be a finite non-empty set of some cardinality ${|A|}$, and let ${X}$ be a random variable taking values in ${A}$. Define the Shannon entropy ${{\bf H}(X)}$ to be the quantity $\displaystyle {\bf H}(X) := \sum_{x \in A} {\bf P}(X = x) \log \frac{1}{{\bf P}(X=x)}$, with the convention that ${0 \log \frac{1}{0}=0}$. Let ${\varepsilon > 0}$ and ${n}$ be a natural number. Let ${X_1,\dots,X_n}$ be ${n}$ iid copies of ${X}$, thus ${\vec X := (X_1,\dots,X_n)}$ is a random variable taking values in ${A^n}$, and the distribution ${\mu_{\vec X}}$ is a probability measure on ${A^n}$. Let ${\Omega \subset A^n}$ denote the set

$\displaystyle \Omega := \{ \vec x \in A^n: \exp(-(1+\varepsilon) n {\bf H}(X)) \leq \mu_{\vec X}(\{\vec x\}) \leq \exp(-(1-\varepsilon) n {\bf H}(X)) \}$.

Show that if ${n}$ is sufficiently large, then $\displaystyle {\bf P}( \vec X \in \Omega) \geq 1-\varepsilon$ and $\displaystyle \exp((1-2\varepsilon) n {\bf H}(X)) \leq |\Omega| \leq \exp((1+2\varepsilon) n {\bf H}(X))$. (Hint: use the weak law of large numbers to understand the number of times each element ${x}$ of ${A}$ occurs in ${\vec X}$.)

Question: Applying the weak law of large numbers on the sequence of random variables $\displaystyle \log \frac{1}{{\bf P}(X_i)}$, one gets $\displaystyle {\bf P}(|\log (\prod_{i=1}^n\frac{1}{{\bf P}(X_i)}) - n{\bf H}(X)| > n\varepsilon {\bf H}(X)) \to 0$ as $n \to \infty$, giving the first claim. For the second claim however, I'm not sure how the given hint relates to the size of the set $\Omega$.

stochasticboy321 · Accepted Answer · 2024-05-28 18:08:44Z

0

By definition, for any $\vec{x} \in \Omega,$ $\mu(\vec{x}) \ge \exp( - (1+\varepsilon) n H(x)).$ So, $$ |\Omega| \exp( - (1+\varepsilon) n H) \le \sum_{\vec x \in \Omega} \mu(\vec x) = \mu(\Omega) \le 1,$$ since $\mu$ is a probability measure. This implies that $|\Omega| \le \exp( (1+\varepsilon)n H)$.

On the other side, for large enough $n$, $\mu(\Omega) \ge 1-\varepsilon$ and the upper bound on $\mu(\vec x)$ for $\vec x \in \Omega$ gives $$ |\Omega| \ge (1-\varepsilon) \exp( (1-\varepsilon) nH).$$ So you're done if it holds for any $\varepsilon$ that for large enough $n$, $1-\varepsilon \ge \exp( - n \varepsilon H)$, which holds since the latter term goes to zero with $n$, while the former is fixed.

answered May 28 at 18:08

stochasticboy321

9,2981 gold badge12 silver badges28 bronze badges

$\begingroup$ Thank you, the author did state that the second claim follows directly from the first and the definition of $\Omega$, clearly one needs only to note the fact that $\displaystyle 1 - \varepsilon \leq \sum_{{\vec x} \in \Omega} \mu_{{\vec X}} = {\bf P}({\vec X} \in \Omega) \leq 1$. $\endgroup$
– shark
Commented May 28 at 20:52

Add a comment |

Stack Exchange Network

Microstate interpretation of Shannon entropy

1 Answer 1

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
probability-theory
solution-verification
information-theory
.

Hot Network Questions

Microstate interpretation of Shannon entropy

1 Answer 1

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged probability-theorysolution-verificationinformation-theory.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
probability-theory
solution-verification
information-theory
.