Man, this is so sloppy. I assume that there are some $K$ classes, so $i \in [1:K]$. For $(n_1,\dots, n_K)$ such that $\sum n_i = N,$ define $W_N(n_1, \dots, n_K),$ as above, and define $H_N(n_1,\dots, n_K) = \log W_N$. Then $H(p)$ may be defined as $$ H(p) = \lim_{\substack{N \to \infty\\ (n_1,\dots, n_K)/N \to p}} \frac{1}{N}H_N(n_1,\dots, n_K). $$
These are all definitions and they depend on different things. It is absurd to use the same symbol $H$ for all of them.
The broad idea behind such a definition is to imagine collecting $N$ iid samples from a law $p$. With extremely high probability, the count obtained for each class will be $Np_i + o(N),$ and the probability will be roughly equidistributed over such configurations. $W_N$ counts the number of configurations with a given set of multiplicities in the samples. Due to the equidistribution and the extremely high probability I was talking about, if you take $n_i = Np_i + o(N),$ the log of this is roughly the entropy of $N$ independent draws from the source $p$. Due to additivity of entropy over independent sources, this itself is roughly $N$ times the entropy of $p$. Normalising by $N$ and taking the limit then gives you the entropy of the source itself.
Strictly speaking, it is not obvious that normalising and taking the limit gives the entropy, or that this definition leads to the correct entropy of $p$ (as defined directly by $-\sum p_i \log p_i$), since we need to deal with all the 'roughly's and 'extremely high's above. Usually this is shown via a law of large numbers.
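A quick numerical sanity check of this limit (my own sketch, not part of the argument): taking $W_N$ to be the multinomial coefficient $N!/(n_1!\cdots n_K!)$ as in the computation below, and evaluating log-factorials via `math.lgamma`, one can watch $\frac{1}{N}\log W_N$ approach $-\sum p_i \log p_i$ as $N$ grows:

```python
import math

def log_W(counts):
    # log of the multinomial coefficient N! / (n_1! ... n_K!),
    # computed stably via log-gamma: log n! = lgamma(n + 1).
    N = sum(counts)
    return math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)

p = [0.5, 0.3, 0.2]
H_true = -sum(q * math.log(q) for q in p)  # -sum p_i log p_i

for N in (100, 10_000, 1_000_000):
    counts = [round(N * q) for q in p]   # n_i = N p_i + o(N)
    counts[0] += N - sum(counts)         # enforce sum n_i = N exactly
    print(N, log_W(counts) / N, H_true)
```

The gap closes at rate roughly $(\log N)/N$, consistent with the Stirling bookkeeping below.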
As for the actual question, this amounts to showing that as $\min n_i \to \infty,$ $$\frac{1}{N}\log W_N = -\sum \frac{n_i}{N} \log\frac{n_i}{N} + o(1).$$ For this, use Stirling's approximation to find that $\log n_i! = n_i \log n_i - n_i + O(\log n_i),$ and recall that $\sum n_i = N$.
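To see the size of the Stirling error concretely, here is a small illustration of my own: the residual $\log n! - (n\log n - n)$ grows like $\tfrac12 \log(2\pi n)$, i.e. logarithmically, which is harmless once we divide by $N$:

```python
import math

# Compare log(n!) with the leading Stirling terms n*log(n) - n.
# The residual is (1/2)*log(2*pi*n) + o(1), i.e. O(log n), which
# vanishes after dividing by N in the entropy computation.
for n in (10, 1_000, 100_000):
    exact = math.lgamma(n + 1)       # log n!
    leading = n * math.log(n) - n    # Stirling's leading terms
    residual = exact - leading
    print(n, residual, 0.5 * math.log(2 * math.pi * n))
```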
So, $$\frac{1}{N}H_N = \frac{1}{N}\Big( \log N! - \sum \log n_i!\Big) = \log N - 1 - \sum \frac{n_i}{N} \log n_i + \sum \frac{n_i}{N} + O(K \log N)/N.$$
If $K \log N = o(N),$ then the error term is $o(1)$ (this is not an issue: implicitly, the question assumes $K = O(1)$). Further, since $\sum n_i = N,$ we have $1 = \sum n_i/N$ and $\log N = \sum (n_i/N) \log N$. You end up with $$ \frac{1}{N}H_N = - \sum \frac{n_i}{N} \log n_i + \sum \frac{n_i}{N} \log N + o(1)\\ = - \sum \frac{n_i}{N} \log \frac{n_i}{N} + o(1).$$
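As a final check of this identity (again my own sketch, with log-factorials via `math.lgamma`), note that it holds for arbitrary counts with all $n_i$ large, not only for counts drawn near some fixed $p$:

```python
import math

def normalized_log_W(counts):
    # (1/N) * log of the multinomial coefficient N! / (n_1! ... n_K!)
    N = sum(counts)
    logW = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)
    return logW / N

def empirical_entropy(counts):
    # -sum (n_i/N) log (n_i/N)
    N = sum(counts)
    return -sum((n / N) * math.log(n / N) for n in counts)

counts = [123_456, 654_321, 222_223]   # arbitrary, min n_i large
print(normalized_log_W(counts), empirical_entropy(counts))
```

The two quantities agree up to an $O((\log N)/N)$ error, as the derivation above predicts.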