I'm working through Bishop's Pattern Recognition and Machine Learning (a great read so far!), and on page 67 he says:
"One limitation of the parametric approach is that it assumes a specific functional form for the distribution which may turn out to be inappropriate for a particular application"
Why might the assumed functional form of a distribution be inappropriate for a particular application? An illustrative example would also be much appreciated.
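
To make the question concrete, here is a minimal sketch of one possible reading of the quote. The setup is an assumption, not an example taken from the book: the data are drawn from two well-separated clusters (at -3 and +3, both values chosen arbitrarily), but the model is a single Gaussian fitted by maximum likelihood. The helper names (`fit_gaussian`, `gaussian_mass`, `bimodal_samples`) are hypothetical and exist only for this illustration.

```python
import math
import random
import statistics


def fit_gaussian(samples):
    """Maximum-likelihood fit of a single Gaussian: sample mean and population std."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu)
    return mu, sigma


def gaussian_mass(mu, sigma, lo, hi):
    """Probability that a Gaussian(mu, sigma) variable falls in [lo, hi]."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)


def bimodal_samples(n, seed=0):
    """Assumed data source: an equal mixture of two narrow Gaussians at -3 and +3."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice([-3.0, 3.0]), 0.5) for _ in range(n)]


samples = bimodal_samples(10_000)
mu, sigma = fit_gaussian(samples)

# Compare what the fitted model claims about the region between the
# two clusters with what the data actually show there.
predicted = gaussian_mass(mu, sigma, -1.0, 1.0)
observed = sum(-1.0 <= x <= 1.0 for x in samples) / len(samples)

print(f"fitted mean={mu:.2f}, std={sigma:.2f}")
print(f"model mass in [-1, 1]:    {predicted:.3f}")
print(f"observed fraction there:  {observed:.3f}")
```

If this reading is right, the fitted Gaussian puts its peak near zero, in the gap between the clusters where almost no data fall. No choice of mean and variance can fix that, because a single Gaussian has only one mode. Is this the kind of limitation the quote refers to?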