Maximum Entropy Continuous Distribution

Question

In Pattern Recognition and Machine Learning, Section 1.6, the author derives the distribution that maximises the differential entropy:

$$H(\textbf{x})=-\int p(\textbf{x}) \ln (p(\textbf{x})) d\textbf{x}$$

To do so, the author imposes three constraints:

$$\int_{-\infty}^{\infty} p(x) dx = 1$$ $$\int_{-\infty}^{\infty} xp(x) dx = \mu$$ $$\int_{-\infty}^{\infty} (x-\mu)^2p(x) dx = \sigma^2$$

This results in the Lagrangian functional:

$$F(p)=-\int_{-\infty}^{\infty} p(x) \ln(p(x)) dx + \lambda_1(\int_{-\infty}^{\infty} p(x) dx - 1) + \lambda_2 (\int_{-\infty}^{\infty} x p(x) dx - \mu) + \lambda_3(\int_{-\infty}^{\infty} (x-\mu)^2 p(x) dx - \sigma^2)$$

Taking the derivative of this functional using the calculus of variations and setting it equal to zero gives:

$$p(x)=\exp(-1+\lambda_1+\lambda_2 x + \lambda_3 (x-\mu)^2)$$
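
For reference, the variational step can be sketched as follows, treating the integrand of $F$ pointwise in $p(x)$ (the notation $\delta F/\delta p(x)$ for the functional derivative is an assumption here, not taken from the book):

$$\frac{\delta F}{\delta p(x)}=-\ln(p(x))-1+\lambda_1+\lambda_2 x+\lambda_3 (x-\mu)^2=0 \quad\Longrightarrow\quad \ln(p(x))=-1+\lambda_1+\lambda_2 x+\lambda_3 (x-\mu)^2$$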

The author states that you can find the Lagrange multipliers by back substitution of this result into the three constraint equations, leading to the conclusion that $p(x)$ is a normal density.

I'm wondering how to derive this last step, specifically how to find the Lagrange multipliers. If we substitute back into the constraints we get three integral equations with three unknowns. How would I go about solving these equations?

score 0 · Accepted Answer · 2020-10-18 10:25:30Z

Assume that $\mu=0$ and $\sigma=1$, and let $z:=\sqrt{\pi}e^{-1+\lambda_1}e^{-\lambda_2^2/(4\lambda_3)}$. Then, assuming that $\lambda_3<0$ (otherwise the integrals diverge), the three constraints become $$ I_1:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} e^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z}{(-\lambda_3)^{1/2}}=1, $$ $$ I_2:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} xe^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z\lambda_2}{2(-\lambda_3)^{3/2}}=0, \quad\text{and} $$ $$ I_3:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} x^2e^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z\lambda_2^2}{4(-\lambda_3)^{5/2}}+\frac{z}{2(-\lambda_3)^{3/2}}=1. $$ Substituting $z=(-\lambda_3)^{1/2}$ from the first equation into the other two, we get $$ \frac{\lambda_2}{-2\lambda_3}=0\quad\text{and}\quad \frac{\lambda_2^2}{4\lambda_3^2}+\frac{1}{-2\lambda_3}=1, $$ so that $\lambda_2=0$ and $\lambda_3=-1/2$. Finally, with these values the relation $z=(-\lambda_3)^{1/2}$ reads $\sqrt{\pi}e^{-1+\lambda_1}=1/\sqrt{2}$, so $\lambda_1=1-\ln \sqrt{2\pi}$.

Therefore, $$ p(x)=e^{-\ln \sqrt{2\pi}-x^2/2}=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}. $$
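
As a sanity check, the multipliers found above can be plugged back into the three constraints numerically. The sketch below is only an illustration: the helper names `maxent_density` and `simpson`, the truncation of the real line to $[-12, 12]$, and the grid size are all assumptions made for this example.

```python
import math

def maxent_density(x, lam1, lam2, lam3):
    """Stationary form p(x) = exp(-1 + lam1 + lam2*x + lam3*x^2) with mu = 0."""
    return math.exp(-1.0 + lam1 + lam2 * x + lam3 * x * x)

def simpson(f, lo=-12.0, hi=12.0, n=20000):
    """Composite Simpson's rule on [lo, hi]; n must be even.
    The interval is an assumed truncation of (-inf, inf)."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# Multipliers from the answer, for mu = 0 and sigma = 1
lam1 = 1.0 - math.log(math.sqrt(2.0 * math.pi))
lam2 = 0.0
lam3 = -0.5

def p(x):
    return maxent_density(x, lam1, lam2, lam3)

norm = simpson(p)                       # constraint 1: should be close to 1
mean = simpson(lambda x: x * p(x))      # constraint 2: should be close to 0
var = simpson(lambda x: x * x * p(x))   # constraint 3: should be close to 1
print(norm, mean, var)
```

The three printed numbers should agree with $1$, $0$, and $1$ up to quadrature error.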

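For the general case, let $y=(x-\mu)/\sigma$ and write $q(y):=\sigma\,p(\mu+\sigma y)$ for the density of $y$, which has mean $0$ and variance $1$. Then $$ -\int p(x)\ln(p(x))\,dx=-\int q(y)\ln(q(y))\, dy+\ln\sigma, $$ so the two entropies differ only by the constant $\ln\sigma$, and maximising over $p$ is equivalent to maximising over $q$, which is the case solved above.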
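
Written out explicitly (a sketch, obtained by matching the stationary form $p(x)=\exp(-1+\lambda_1+\lambda_2 x+\lambda_3 (x-\mu)^2)$ from the question against the normal density with mean $\mu$ and variance $\sigma^2$), the general multipliers are

$$ \lambda_2=0,\qquad \lambda_3=-\frac{1}{2\sigma^2},\qquad \lambda_1=1-\frac{1}{2}\ln(2\pi\sigma^2),\qquad p(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right). $$

Setting $\mu=0$ and $\sigma=1$ recovers $\lambda_3=-1/2$ and $\lambda_1=1-\ln\sqrt{2\pi}$.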

Evaluation of $I_1$, $I_2$, and $I_3$:

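First, recall that for $c>0$, $$ \int_{-\infty}^\infty e^{-cx^2}\,dx=\sqrt{\frac{\pi}{c}}, $$ and notice that $$ bx-cx^2=-c\left(\frac{b}{2c}-x\right)^2+\frac{b^2}{4c}. $$ Thus, letting $\lambda_1=a$, $\lambda_2=b$, and $\lambda_3=-c$, $$ I_1=e^{-1+a}e^{b^2/(4c)}\int_{-\infty}^\infty e^{-c(b/(2c)-x)^2}\,dx=e^{-1+a}e^{b^2/(4c)}\times \sqrt{\frac{\pi}{c}}. $$

As for the second integral, notice that by symmetry about $x=b/(2c)$, $$ \int_{-\infty}^\infty \left(x-\frac{b}{2c}\right)e^{-c(b/(2c)-x)^2}\,dx=0, $$ and so $I_2=I_1b/(2c)$.

Finally, differentiating under the integral sign with respect to $c$, $$ \frac{d}{dc}\int_{-\infty}^\infty e^{-c(b/(2c)-x)^2}\,dx =\int_{-\infty}^\infty \left(\frac{b^2}{4c^2}-x^2\right)e^{-c(b/(2c)-x)^2}\,dx, $$ while the left-hand side equals $\frac{d}{dc}\sqrt{\pi/c}$ because the integral does not depend on $b$. Therefore, $$ I_3=I_1\frac{b^2}{4c^2}-e^{-1+a}e^{b^2/(4c)}\times\frac{d}{dc}\sqrt{\frac{\pi}{c}}. $$ Since $\frac{d}{dc}\sqrt{\pi/c}=-\frac{\sqrt{\pi}}{2c^{3/2}}$, this is $$ I_3=I_1\frac{b^2}{4c^2}+e^{-1+a}e^{b^2/(4c)}\frac{\sqrt{\pi}}{2c^{3/2}}, $$ which matches the expression for $I_3$ in terms of $z$ given above.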
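
The closed forms can also be compared against direct numerical integration. This is a minimal sketch under stated assumptions: the function names, the test values of $a$, $b$, $c$, and the integration window of 12 standard widths around $b/(2c)$ are illustrative choices, not part of the derivation.

```python
import math

def simpson(f, lo, hi, n=40000):
    """Composite Simpson's rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def closed_forms(a, b, c):
    """I1, I2, I3 from the completed-square formulas (requires c > 0)."""
    pref = math.exp(-1.0 + a) * math.exp(b * b / (4.0 * c))
    i1 = pref * math.sqrt(math.pi / c)
    i2 = i1 * b / (2.0 * c)
    # Uses d/dc sqrt(pi/c) = -sqrt(pi) / (2 c^(3/2))
    i3 = i1 * b * b / (4.0 * c * c) + pref * math.sqrt(math.pi) / (2.0 * c ** 1.5)
    return i1, i2, i3

def numeric(a, b, c):
    """The same three integrals by quadrature on a truncated window."""
    def w(x):
        return math.exp(-1.0 + a + b * x - c * x * x)
    centre = b / (2.0 * c)
    half_width = 12.0 / math.sqrt(c)  # integrand is negligible beyond this
    lo, hi = centre - half_width, centre + half_width
    return (
        simpson(w, lo, hi),
        simpson(lambda x: x * w(x), lo, hi),
        simpson(lambda x: x * x * w(x), lo, hi),
    )

a, b, c = 0.3, 0.7, 1.2  # arbitrary test values with c > 0
exact = closed_forms(a, b, c)
approx = numeric(a, b, c)
for name, e, n_val in zip(("I1", "I2", "I3"), exact, approx):
    print(name, e, n_val)
```

Each pair of printed values should agree to many digits. Note that real-valued quadrature never involves erfi here, because $\lambda_3=-c<0$ keeps the integrand decaying.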

@tail_recursion I added the limits of integration for clarity. — user140541, Commented Oct 17, 2020 at 10:42
Would be useful if you could add some more detail on how to do the integrals. I'm getting limits involving the imaginary error function erfi, where the argument is going to $\pm \infty$ so the limits don't exist. — tail_recursion, Commented Oct 17, 2020 at 11:11
I'm still not sure what you did there. I'm not clear how you got from the second step to the last step. I was however able to do the first integral using a formula given here; en.wikipedia.org/wiki/Gaussian_integral — tail_recursion, Commented Oct 18, 2020 at 5:37
