21
$\begingroup$

Reading about the true meaning of a 95% confidence ellipse, I tend to come across two explanations:

  1. The ellipse that contains 95% of the data
  2. Not the above, but the ellipse that explains the variance of the data. I am not sure I understand this correctly, but it seems to mean that if a new data point comes in, there is a 95% chance that the new variance will stay within the ellipse.

Can you shed some light?

$\endgroup$

2 Answers

23
$\begingroup$

Actually, neither explanation is correct.

A confidence ellipse has to do with unobserved population parameters, like the true population mean of your bivariate distribution. A 95% confidence ellipse for this mean is really an algorithm with the following property: if you were to replicate your sampling from the underlying distribution many times and each time calculate a confidence ellipse, then 95% of the ellipses so constructed would contain the underlying mean. (Note that each sample would of course yield a different ellipse.)
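
The replication property can be checked with a small simulation. This is only a sketch under assumptions of my own: bivariate normal data, a made-up true mean and covariance, and the large-sample chi-square approximation for the ellipse (the helper names are hypothetical, not from any library).

```python
import math
import random

# Exact 95% quantile of a chi-square distribution with 2 degrees of freedom.
CHI2_2_95 = -2 * math.log(0.05)

def sample_bivariate_normal(rng, n, mean, sd_x, sd_y, rho):
    """Draw n points from a bivariate normal with correlation rho."""
    pts = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean[0] + sd_x * z1
        y = mean[1] + sd_y * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        pts.append((x, y))
    return pts

def mean_and_cov(pts):
    """Sample mean and unbiased sample covariance (sxx, sxy, syy)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    det = sxx * syy - sxy ** 2
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def coverage(n=200, reps=1000, seed=0):
    """Fraction of replicated ellipses that contain the true mean."""
    rng = random.Random(seed)
    true_mean = (1.0, -2.0)  # assumed for illustration
    hits = 0
    for _ in range(reps):
        pts = sample_bivariate_normal(rng, n, true_mean, 2.0, 1.0, 0.6)
        center, cov = mean_and_cov(pts)
        # The true mean lies inside this sample's ellipse when
        # n * d^2(true_mean; sample mean, sample covariance) <= quantile.
        if n * mahalanobis_sq(true_mean, center, cov) <= CHI2_2_95:
            hits += 1
    return hits / reps

print(coverage())
```

The printed fraction should land close to 0.95. It is a statement about the ellipses across replications, not about the data points in any single sample.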

Thus, a confidence ellipse will usually not contain 95% of the observations. In fact, as the number of observations increases, the mean will usually be better and better estimated, leading to smaller and smaller confidence ellipses, which in turn contain a smaller and smaller proportion of the actual data. (Unfortunately, some people calculate the smallest ellipse that contains 95% of their data, reminiscent of a quantile, which by itself is quite OK... but then go on to call this "quantile ellipse" a "confidence ellipse", which, as you see, leads to confusion.)
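
To see the difference between the two ellipses numerically, here is a rough sketch, again under my own assumptions (standard bivariate normal data with independent coordinates, chi-square approximation, hypothetical function name). For each sample size it reports the fraction of observations inside the confidence ellipse for the mean and inside the "quantile ellipse" built from the same sample.

```python
import math
import random

# Exact 95% quantile of a chi-square distribution with 2 degrees of freedom.
CHI2_2_95 = -2 * math.log(0.05)

def fractions_inside(n, seed=0):
    """Return (share of data in confidence ellipse, share in quantile ellipse)."""
    rng = random.Random(seed)
    pts = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    # Sample mean and unbiased sample covariance.
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    det = sxx * syy - sxy ** 2

    def d2(p):
        # Squared Mahalanobis distance from the sample mean.
        dx, dy = p[0] - mx, p[1] - my
        return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

    # Confidence ellipse for the mean uses covariance S / n,
    # so the distance is scaled up by n.
    in_conf = sum(n * d2(p) <= CHI2_2_95 for p in pts) / n
    # Quantile (data) ellipse uses covariance S itself.
    in_data = sum(d2(p) <= CHI2_2_95 for p in pts) / n
    return in_conf, in_data

for n in (20, 200, 2000):
    in_conf, in_data = fractions_inside(n)
    print(n, round(in_conf, 3), round(in_data, 3))
```

The first fraction should shrink toward zero as n grows, while the second should stay near 0.95.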

The variance of the underlying population relates to the confidence ellipse. High variance will mean that the data are all over the place, so the mean is not well estimated, so the confidence ellipse will be larger than if the variance were smaller.
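
To make the dependence on variance and sample size concrete, here is the usual large-sample form of the region, assuming approximately normal data. The notation is mine, not the question's: $\bar{x}$ is the sample mean, $S$ the sample covariance matrix, $n$ the sample size, and $\chi^2_{2,0.95} \approx 5.99$ the 95% quantile of a chi-square distribution with 2 degrees of freedom.

$$\left\{\mu : n\,(\bar{x}-\mu)^\top S^{-1} (\bar{x}-\mu) \le \chi^2_{2,0.95}\right\}, \qquad \text{area} = \frac{\pi\,\chi^2_{2,0.95}\,\sqrt{\det S}}{n}.$$

The area grows with the spread of the data through $\sqrt{\det S}$ and shrinks like $1/n$. For small $n$, the exact region under normality uses a Hotelling $T^2$ quantile in place of the chi-square one.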

Of course, we can calculate confidence ellipses also for any other population parameter we may wish to estimate. Or we could look at other confidence regions than ellipses, especially if we don't know the estimated parameter to be (asymptotically) normally distributed.

The one-dimensional analogue of the confidence ellipse is the confidence interval, and browsing through previous questions in that tag is helpful. Our current top-voted question in that tag is particularly nice: "Why does a 95% CI not imply a 95% chance of containing the mean?" Most of the discussion there holds just as well for higher-dimensional analogues of the one-dimensional confidence interval.

$\endgroup$
1
$\begingroup$

It depends on the field in which the concept is applied. What was said above is true in statistics, but when statistics is applied to other subjects, usage differs somewhat. In biomechanics, for example, we use the term confidence ellipse (though there is debate about whether it should be called a prediction ellipse) for a technique that measures centre-of-pressure displacement when a subject stands on a force platform. The ellipse drawn around the two axes (major and minor) is then supposed to contain 95% of the data points that represent the centre-of-pressure displacement over the course of a trial.

$\endgroup$
