$\begingroup$

I was reading this notebook from the PyMC3 documentation about Dirichlet process mixtures. In the last figure, the estimated density drops almost to zero at a particular value, even though the histogram shows that many data points fall close to that value. I am wondering:

  1. Why does this near-zero value appear? Densities are nonnegative, so if the expected posterior density is close to zero at that point, then nearly all of the sampled densities must be close to zero there, right? Why would this happen if many data points are close to that value?
  2. Is this an MCMC artifact? The author of the notebook checked that they are using enough components for the Dirichlet process, but maybe there is some other source of bias?
  3. Is this model misspecification? I doubt it, given how flexible Dirichlet processes are supposed to be, but the result seems so odd that I am no longer sure.
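
To make the reasoning in point 1 concrete, here is a minimal sketch of what I mean. It is not the notebook's code: the draws below come from a made-up truncated stick-breaking mixture of normals (the names `stick_breaking`, `mixture_density`, and every parameter value are my own assumptions), standing in for posterior samples of the density at one point `x0`. Because each sampled density is nonnegative, Markov's inequality bounds how many draws can sit well above the average, so a near-zero average forces almost all draws to be near zero.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def stick_breaking(alpha, K, rng):
    """Truncated stick-breaking weights with Beta(1, alpha) breaks (hypothetical helper)."""
    weights, remaining = [], 1.0
    for _ in range(K):
        b = rng.betavariate(1.0, alpha)
        weights.append(b * remaining)
        remaining *= 1.0 - b
    return weights


def mixture_density(x, weights, mus, sigmas):
    """Density of a weighted mixture of normals at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))


rng = random.Random(0)
K, n_draws, x0 = 30, 2000, 0.0  # illustrative values, not taken from the notebook

# Stand-in for "sampled densities at x0": one mixture per simulated draw.
draws = []
for _ in range(n_draws):
    w = stick_breaking(alpha=2.0, K=K, rng=rng)
    mus = [rng.gauss(0.0, 3.0) for _ in range(K)]
    sigmas = [rng.uniform(0.2, 1.0) for _ in range(K)]
    draws.append(mixture_density(x0, w, mus, sigmas))

mean_density = sum(draws) / n_draws

# Markov's inequality on nonnegative draws: the fraction exceeding
# c * mean can be at most 1 / c, whatever the distribution of the draws.
c = 5.0
frac_above = sum(d > c * mean_density for d in draws) / n_draws
print(f"mean density at x0: {mean_density:.4f}")
print(f"fraction of draws above {c:.0f}x the mean: {frac_above:.3f} (bound: {1.0 / c:.3f})")
```

With real posterior samples one would replace `draws` by the sampled density values at the problematic point; the bound holds regardless of how the draws were generated.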

The overall incompatibility between the histogram and the Dirichlet process mixture posterior is very odd to me, but that zero seems to be the bigger issue.

$\endgroup$
