
In a test I had to derive the posterior of the multinomial distribution with the conjugate Dirichlet prior. I used the common relation $$p(\mu \mid X; \alpha) \propto p(X \mid \mu)\, p(\mu \mid \alpha).$$ I did, however, assume that $X$ is a single random variable, not a data set. This led me to conclude that the posterior can be written as a Dirichlet with parameter $\alpha^* = \alpha + x$, where $\alpha$ and $x$ are of dimension $K$ (the number of classes). On the Wikipedia entry for conjugate priors, which gives the posterior parameterizations, all distributions are stated for a sample of $n$ data points, hence $\alpha^* = \alpha + \sum_{i=1}^{n} x_i$. Is my solution still correct if I want to show that the posterior is a Dirichlet, and what is its parameter? More generally, I am unsure whether posterior distributions are only defined for $n$ data points $X$ (as the Wikipedia entry implies) or can be derived for a single $X$ as well.
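For concreteness, a sketch of the single-observation calculation I have in mind (treating $x = (x_1, \dots, x_K)$ as the observed count vector):
$$p(\mu \mid x; \alpha) \propto \prod_{k=1}^{K} \mu_k^{x_k} \prod_{k=1}^{K} \mu_k^{\alpha_k - 1} = \prod_{k=1}^{K} \mu_k^{\alpha_k + x_k - 1},$$
which is the kernel of a $\operatorname{Dir}(\alpha + x)$ density.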

  • Setting other things aside: if something is defined for $n$ points, what exactly is the problem with $n=1$? Your question is not really clear, but you can apply Bayes' theorem to a single point or to multiple points, the same as you could use least-squares estimation to find the best parameter given a single point (but you won't learn anything revealing...).
    – Tim
    Commented Apr 25, 2017 at 21:20
  • @Tim It seems that for showing it is a conjugate prior, it is enough to do this with $n=1$?
    – tomka
    Commented Apr 25, 2017 at 21:33
  • Why shouldn't it?
    – Tim
    Commented Apr 25, 2017 at 21:53
  • Check stats.stackexchange.com/questions/237037/…
    – Tim
    Commented Apr 26, 2017 at 7:19
  • You can do it all-at-once or sequentially; it will be the same.
    – Tim
    Commented Apr 26, 2017 at 8:08

1 Answer


To show that the Dirichlet is a conjugate prior for the multinomial, it is indeed sufficient to use a single observation $X$. For estimation purposes, however, an all-at-once procedure would factor the likelihood over $n$ independent samples, yielding the Wikipedia parameterization, while a sequential procedure would apply the single-observation updating step repeatedly, once per sample. In the sequential updating, the Dirichlet hyper-parameter changes from the initial $\alpha$ by adding the observed counts $x_i$ one at a time. The two procedures are equivalent, so the parameter of the posterior you write down simply depends on whether you condition on one observation or on all $n$.
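A minimal numerical sketch of this equivalence (the array names and toy data here are my own illustration, not part of the question):

```python
import numpy as np

rng = np.random.default_rng(0)

K = 3
alpha = np.array([1.0, 2.0, 0.5])  # Dirichlet prior hyper-parameters (illustrative values)

# Five multinomial observations, each a length-K count vector.
X = rng.multinomial(n=10, pvals=[0.2, 0.5, 0.3], size=5)

# All-at-once update: add the summed counts to the prior.
alpha_batch = alpha + X.sum(axis=0)

# Sequential update: add one observation's counts at a time.
alpha_seq = alpha.copy()
for x in X:
    alpha_seq = alpha_seq + x

# Both routes give the same posterior Dirichlet parameters.
assert np.allclose(alpha_batch, alpha_seq)
```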

