$\begingroup$

How to approach the following two cases is clear; I am mentioning them to set up my question.

(Case 1): For data that appears to be a Gaussian distribution, we can assume the distribution is Gaussian and estimate its parameters with maximum likelihood estimation.
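For concreteness, here is a minimal sketch of (Case 1) on synthetic data (the true parameters `loc=2.0`, `scale=1.5` are illustrative choices, not from the question): for a Gaussian, the MLEs are simply the sample mean and the (biased) sample variance.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic Gaussian sample

# Maximum likelihood estimates for a Gaussian:
# sample mean and biased sample variance (divide by n, not n-1)
mu_hat = data.mean()
sigma2_hat = ((data - mu_hat) ** 2).mean()
```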

(Case 2): When we can't make any assumptions about the distribution of the data, we can employ a non-parametric technique such as kernel density estimation.
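And a minimal sketch of (Case 2) using `scipy.stats.gaussian_kde` (the standard-normal test data is an illustrative choice): no parametric family is assumed; the bandwidth is selected automatically by Scott's rule.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
data = rng.standard_normal(500)  # synthetic sample, no family assumed by KDE

kde = gaussian_kde(data)  # bandwidth chosen by Scott's rule by default
density_at_zero = kde(np.array([0.0]))[0]  # estimated density at x = 0
```

For a standard normal the true density at 0 is about 0.399, so the estimate should land near that value.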

Now, what about the case when the distribution appears to be a mixture of two Gaussians? Here we can't assume the distribution comes from a simple family, as in the case of a single Gaussian. But we are still 'far better off' than in (Case 2), because we can make assumptions about the shape of the distribution, so we don't have to deal with infinite dimensions; instead we need just 5 parameters to specify the distribution: two means, two variances, and one mixing weight.
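To make the 5-parameter description concrete, here is a sketch of the mixture density itself (parameterised here by standard deviations rather than variances; the helper name `mixture_pdf` and the example parameter values are my own, not from the question):

```python
import numpy as np

def mixture_pdf(x, pi, mu1, sigma1, mu2, sigma2):
    """Two-component Gaussian mixture density: the five parameters
    are one mixing weight, two means, and two scales."""
    def phi(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return pi * phi(x, mu1, sigma1) + (1 - pi) * phi(x, mu2, sigma2)

# Sanity check: the density should integrate to 1 over a wide grid
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
integral = mixture_pdf(x, 0.3, -2.0, 1.0, 3.0, 0.5).sum() * dx
```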

So what technique can we use in the bimodal case? I've seen many papers that use kernel density estimation (KDE) techniques for estimation in the bimodal case... but I'm not sure if they are just doing that because they want to experiment with their ideas on a simple known distribution. Or are non-parametric methods actually the standard way to perform estimation for the bimodal case?

$\endgroup$
  • $\begingroup$ The "standard" way to estimate the parameters of any distribution is maximum likelihood. For a mixture of normals this is usually done via the EM algorithm. It's possible that the papers you refer to weren't actually concerned with estimates of parameters (just the general shape of the density), or that they did not want to assume the data came from a specific family of distributions. $\endgroup$
    – Chris Haug
    Commented Aug 12, 2020 at 10:50
  • $\begingroup$ For example, the paper "Deconvolution with supersmooth distributions" by Fan (1992). He uses a bimodal distribution in Simulation 2 (page 162/163). Do you think he does this just to support his theoretical results? And that in practice, nonparametric methods are generally never used for the bimodal case? $\endgroup$ Commented Aug 12, 2020 at 11:11
  • $\begingroup$ @ChrisHaug On page 26 of 'The Nature of Statistical Learning Theory' by Vapnik, the author states "that using the maximum likelihood method it is impossible to estimate the parameters of a density that is a mixture of normal densities" and shows that this is because the objective function does not have a maximum. So does this contradict your comment, or have I misinterpreted something? $\endgroup$ Commented Aug 14, 2020 at 17:55
  • $\begingroup$ I think what the author is saying is true in a technical sense, but those problematic solutions (infinite density spike at an observation) are not useful so should be avoided (we want the maximum, non-problematic solution). The EM algorithm does approximate maximization of the likelihood, and is a standard way to fit a mixture of Gaussians (see e.g. "Elements of statistical learning", section 8.5). $\endgroup$
    – Chris Haug
    Commented Aug 14, 2020 at 19:02
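The EM approach mentioned in the comments can be sketched in a few lines of numpy. This is a minimal illustration, not a robust implementation: the synthetic data, initial values, and iteration count are my own choices, and a well-separated initialization is used so the iteration avoids the degenerate density-spike solutions the Vapnik quote refers to.

```python
import numpy as np

rng = np.random.default_rng(2)
# Bimodal synthetic data: 30% from N(-2, 1) and 70% from N(3, 0.5^2)
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 700)])

# Initial guesses for the five parameters (mixing weight, two means, two scales)
pi = 0.5
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])

for _ in range(200):
    # E-step: responsibility of component 0 for each point
    # (the 1/sqrt(2*pi) normalizing constant cancels in the ratio)
    p0 = pi * np.exp(-0.5 * ((data - mu[0]) / sigma[0]) ** 2) / sigma[0]
    p1 = (1 - pi) * np.exp(-0.5 * ((data - mu[1]) / sigma[1]) ** 2) / sigma[1]
    r = p0 / (p0 + p1)
    # M-step: responsibility-weighted re-estimates of the five parameters
    pi = r.mean()
    mu = np.array([(r * data).sum() / r.sum(),
                   ((1 - r) * data).sum() / (1 - r).sum()])
    sigma = np.sqrt(np.array([
        (r * (data - mu[0]) ** 2).sum() / r.sum(),
        ((1 - r) * (data - mu[1]) ** 2).sum() / (1 - r).sum()]))
```

With the modes this well separated, the estimates converge close to the true values (weight near 0.3, means near -2 and 3).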
