How to approach the following two cases is clear; I mention them only to set up my question.
(Case 1): For data that appears to follow a Gaussian distribution, we can assume the distribution is Gaussian and estimate its parameters by maximum likelihood estimation.
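For concreteness, here is a minimal sketch of (Case 1) on synthetic data (the data and the numbers are purely illustrative). For a Gaussian, the maximum likelihood estimates have a closed form: the sample mean and the (biased) sample variance.

```python
import numpy as np

# Synthetic stand-in for data we assume to be Gaussian.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)

# MLE for a Gaussian: sample mean and biased sample variance.
mu_hat = data.mean()
sigma2_hat = data.var()  # ddof=0 gives the ML estimate

print(mu_hat, sigma2_hat)
```

With 1000 samples these estimates land close to the true parameters (mean 2.0, variance 2.25).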
(Case 2): When we can't make any assumptions about the distribution of the data, we can employ a non-parametric technique such as kernel density estimation.
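And a sketch of (Case 2), a Gaussian-kernel KDE written out directly in NumPy so the estimator is visible (the exponential data and Scott's-rule bandwidth are just illustrative choices, not the only ones):

```python
import numpy as np

def kde(x_eval, data, bandwidth):
    """Gaussian-kernel density estimate evaluated at the points x_eval."""
    # Scaled pairwise differences: shape (len(x_eval), len(data)).
    u = (x_eval[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    # Average the kernels over the data points, rescaled by the bandwidth.
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(1)
# Data where we make no parametric assumption about the distribution.
data = rng.exponential(scale=1.0, size=500)

# Scott's rule of thumb for the bandwidth (one common heuristic).
h = data.std() * len(data) ** (-1 / 5)

grid = np.linspace(0.0, 5.0, 101)
density = kde(grid, data, h)
```

Note that the "parameter" here is effectively the whole function: we never commit to a finite-dimensional family, only to a kernel and a bandwidth.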
Now, what about the case where the distribution appears to be a mixture of two Gaussians? Here we can't assume the distribution comes from a simple family as in the single-Gaussian case. But we are still far better off than in (Case 2), because we can make assumptions about the shape of the component distributions: instead of an infinite-dimensional estimation problem, we only need five parameters to specify the distribution (two means, two variances, and one mixing weight).
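To make the five-parameter view concrete, here is a sketch of fitting those parameters with the EM algorithm on synthetic bimodal data (the component values, the initialization, and the fixed 200 iterations are all illustrative assumptions; in practice one would check convergence of the log-likelihood):

```python
import numpy as np

rng = np.random.default_rng(2)
# Illustrative bimodal sample: 60% from N(-2, 1), 40% from N(3, 0.25).
data = np.concatenate([rng.normal(-2.0, 1.0, 600),
                       rng.normal(3.0, 0.5, 400)])

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# The five parameters: two means, two variances, one mixing weight.
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
pi = 0.5  # weight of component 0

for _ in range(200):
    # E-step: responsibility of component 0 for each data point.
    p0 = pi * normal_pdf(data, mu[0], var[0])
    p1 = (1 - pi) * normal_pdf(data, mu[1], var[1])
    r0 = p0 / (p0 + p1)
    # M-step: re-estimate the five parameters from the responsibilities.
    pi = r0.mean()
    mu[0] = (r0 * data).sum() / r0.sum()
    mu[1] = ((1 - r0) * data).sum() / (1 - r0).sum()
    var[0] = (r0 * (data - mu[0]) ** 2).sum() / r0.sum()
    var[1] = ((1 - r0) * (data - mu[1]) ** 2).sum() / (1 - r0).sum()
```

On well-separated components like these, EM recovers something close to the generating parameters; with overlapping components, initialization and local optima become a real concern.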
So what technique can we use in the bimodal case? I've seen many papers that use kernel density estimation (KDE) in the bimodal case, but I'm not sure whether they do so only because they want to experiment with their ideas on a simple, known distribution, or whether non-parametric methods are actually the standard way to perform estimation in the bimodal case.