
We have to measure the period of an oscillation. We are to take the time for 50 oscillations, repeated multiple times.

I know that I will have a $\Delta t = 0.1 \, \mathrm s$ because of my reaction time. If I now measure, say, 40, 41 and 39 seconds in three runs, I will also have a standard deviation of $1 \, \mathrm s$.

What is the total error then? Do I add them up, like so?

$$\sqrt{1^2 + 0.1^2}$$

Or is it just the 1 and I discard the (systematic?) error of my reaction time?

I wonder: if I measure a huge number of times, the standard deviation should become tiny compared to my reaction time. Is the lower bound of the total error then 0, or is it my reaction time of $0.1 \, \mathrm s$?
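In code, the combination I have in mind would look like this (just a sketch with my numbers from above):

```python
import math

# My numbers from above: the spread of the three runs and my reaction time.
sigma_stat = 1.0   # s, standard deviation of 40, 41, 39 s
sigma_syst = 0.1   # s, reaction-time uncertainty

total = math.sqrt(sigma_stat**2 + sigma_syst**2)
print(round(total, 3))   # about 1.005 s, dominated by the spread
```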


3 Answers


Yes, the only sensible formula for the total error is the sum in quadrature, $$ \Delta X_{\rm total} = \sqrt { \Delta X_{\rm syst}^2 + \Delta X_{\rm stat}^2 } $$ The key assumption behind the validity of this formula is that the two sources of error are independent, i.e. uncorrelated: $$ \langle \Delta X_{\rm syst} \Delta X_{\rm stat} \rangle = 0$$ Because of that, we have $$\langle \Delta X_{\rm total}^2 \rangle = \langle (\Delta X_{\rm syst} +\Delta X_{\rm stat} )^2 \rangle = \sigma_{\rm stat}^2 + \sigma_{\rm syst}^2$$ The cross term $2ab$ from $(a+b)^2=a^2+2ab+b^2$ drops out because of the independence quoted in the previous displayed equation. The last displayed equation is a full proof of your formula.
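As a sanity check of the quadrature rule, and of the point that no Gaussianity is needed, here is a small Monte Carlo with independent, deliberately non-Gaussian error sources (the two distributions are arbitrary choices for illustration):

```python
import random
import statistics

random.seed(0)

# Monte Carlo check that independent errors add in quadrature, using
# deliberately non-Gaussian distributions: a uniform box and a fair
# coin flip. The particular distributions are arbitrary choices.
N = 100_000
syst = [random.uniform(-0.5, 0.5) for _ in range(N)]    # variance 1/12
stat = [random.choice((-1.0, 1.0)) for _ in range(N)]   # variance 1
total = [a + b for a, b in zip(syst, stat)]

var_total = statistics.pvariance(total)
print(var_total)   # close to 1 + 1/12 ~ 1.083, with no normality anywhere
```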

I want to emphasize that the Pythagorean formula doesn't depend on any normality of the distributions. It's just simple linear algebra used in computing the expectation value of a bilinear expression in which the mixed terms contribute zero because of the independence above. If someone tells you that you have to assume the central limit theorem or Gaussianity of the distribution, she is just wrong.

Of course, if one wants to convert the information about the error margin to $p$-values, i.e. confidence levels, one needs to know the shape of the distribution, i.e. to assume it is Gaussian. If both the systematic and the statistical error are distributed via the Gaussian distribution, so is the total error. But if we don't talk about $p$-values, we don't need to assume anything whatsoever about the Gaussianity.

However, it's very useful to separate the systematic and statistical error because if you repeat some measurement with the same equipment, the statistical error adds in quadrature but the systematic error adds linearly.

This statement means that the statistical errors from independent "runs" of the same experiment are uncorrelated with each other $$ \langle \Delta X_{\rm stat1} \Delta X_{\rm stat2} \rangle = 0$$ and they're still uncorrelated with all the systematic error, too. However, the systematic errors are linked to the device which is still the same, so the systematic errors from 2 repeated "runs" are perfectly correlated: $$ \langle \Delta X_{\rm syst1} \Delta X_{\rm syst2} \rangle = \sigma(\Delta X_{\rm syst1})\sigma(\Delta X_{\rm syst2}) \neq 0$$ On a $2D$ plane, the distribution function would be concentrated near the "diagonal" tilted line $\Delta X_{\rm syst1} = k \Delta X_{\rm syst2} $. Be careful, under some conditions, the result above would need a minus sign. This linearity makes a difference. In particular, the statistical errors for "intensive quantities" may be reduced by repeating the experiment while the systematic errors can't.
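The different behavior under repetition can be illustrated with a toy simulation. This is only a sketch: the magnitudes of the two errors, and the Gaussian shapes, are assumptions made for the demonstration.

```python
import random
import statistics

random.seed(1)

# Toy model of repeating a measurement n times with the same device:
# the statistical error is redrawn for every run, while the systematic
# offset is drawn once and shared by all runs (perfect correlation).
SIGMA_STAT, SIGMA_SYST = 1.0, 0.1   # made-up magnitudes for illustration

def averaged_measurement(n_runs):
    syst = random.gauss(0.0, SIGMA_SYST)   # one offset for the whole device
    runs = [random.gauss(0.0, SIGMA_STAT) + syst for _ in range(n_runs)]
    return sum(runs) / n_runs

spreads = {}
for n in (1, 25, 400):
    trials = [averaged_measurement(n) for _ in range(2_000)]
    spreads[n] = statistics.pstdev(trials)
    print(n, round(spreads[n], 3))
# The spread of the average falls like 1/sqrt(n) at first, but it
# saturates near SIGMA_SYST: averaging cannot remove the systematic part.
```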

Imagine that the LHC measures the decay rate of a particle as $\Gamma=CP$ where $C$ is a fixed constant without error and $P$ is the percentage of their events (collisions) that have a certain property. Let's make two runs with $n_1$ and $n_2$ events, respectively. They're expected to give the same number $n$ and the total number is $N=2n$.

However, the first run has an error $\Delta n_1$ with both a statistical and a systematic component, and the same for $\Delta n_2$. What's the total number of collisions? We measured $n_1+n_2$ collisions but this result has an error margin (more precisely, I will be talking about the error margin of $\Gamma$ with the right coefficient). For the error we have $$ \Delta N = \Delta n_1+\Delta n_2 = \Delta n_{\rm 1stat}+\Delta n_{\rm 1syst}+\Delta n_{\rm 2stat}+\Delta n_{\rm 2syst}$$ What is the expectation value of its square? $$ \langle (\Delta N)^2\rangle = (\Delta n_{\rm 1syst}+\Delta n_{\rm 2syst})^2 + (\Delta n_{\rm 1stat})^2 + (\Delta n_{\rm 2stat})^2$$ Note that the statistical errors from the two runs were first squared and then added; the systematic errors were first added and then squared. As a result, the systematic contribution to the error of the decay rate won't change when you make another, second run. The statistical error will drop by a factor of $1/\sqrt{2}$. Because the different parts of the total error behave differently, it's good to know the errors separately.
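Plugging illustrative numbers into the two-run formula (the per-run margins below are hypothetical, with the statistical one taken as $\sqrt{n}$ as for Poisson counts):

```python
import math

# Illustrative numbers for the two-run error formula; the per-run
# margins are hypothetical (stat = sqrt(n), as for Poisson counts).
n_per_run = 10_000
stat = math.sqrt(n_per_run)       # 100.0, per-run statistical margin
syst = 50.0                       # per-run systematic margin, same device

# One run: relative errors
rel_stat_1 = stat / n_per_run
rel_syst_1 = syst / n_per_run

# Two runs: statistical parts add in quadrature, systematic parts linearly
rel_stat_2 = math.sqrt(2 * stat**2) / (2 * n_per_run)
rel_syst_2 = (2 * syst) / (2 * n_per_run)

print(rel_stat_2 / rel_stat_1)    # 1/sqrt(2): statistical error improves
print(rel_syst_2 / rel_syst_1)    # 1.0: systematic error does not
```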

But if you only use an apparatus or setup once and then you destroy it, there's no reason to remember the separation, and the right total error margin is obtained by adding them in quadrature. That's what many high-energy experimental teams have done, and the reason is not that they're sloppy about statistics. The Pythagorean calculation is perfectly valid and may be used by those who know what they're doing. Just the "beginners in statistics" at school are discouraged from combining these things in quadrature because they could add the errors incorrectly if they consider many measurements with the same device.

But adding the systematic and statistical error margins linearly would always be wrong because they're always independent of one another. It would produce a larger numerical value of the error margin than the Pythagorean formula, and a larger error is found "OK" by some people because it makes the experimenters sound more cautious or "more conservative". But it's still a wrong result, anyway. If someone found 5-sigma evidence for an effect using the Pythagorean formula for the error margin, and you denied her 5-sigma evidence because you calculated an overstated error margin (probably by the simple sum of the systematic and statistical error margins), getting just 3 sigma, then you would be a denier of a valid experimental proof of an effect. That is bad whether or not you can also claim to be "conservative" or "cautious". ;-)

There's only one right formula in science and for a single statistical and single systematic error, it's given by your Pythagorean formula.

  • This analysis also assumes random (Gaussian distributed) errors. It's a good analysis for many situations, and the only practical thing to do at the introductory level. But calling this the "only one right formula in science" overstates things.
    – garyp
    Commented Apr 16, 2017 at 13:37
  • No, garyp, the formulae for the addition of variances or standard deviations etc. hold for all distributions, not just the normal one.
    Commented Apr 17, 2017 at 9:18

I think you're working from an incorrect picture of statistics here, mixing up the inputs and the outputs. You are recording the result of a measurement, and the spread of these measured values (we'll say they're normally distributed) is theoretically a consequence of the variation from all the different sources.

That is, every time you do it, the length of the string might be a little different, the air temperature might be a little different. Of course, all of these are fairly small and I'm just listing them for the sake of argument. The point is that the ultimate standard deviation of the measured value $\sigma$ should be the result of all individual sources (we will index by $i$), under the assumption that all sources of variation are also normally distributed.

$$\sigma^2 = \sum_{i=1}^{N}{\sigma_i^2}$$

When we account for individual sources of variation in an experiment, we apply some model that formalizes our expectation about the consistency of the experiment. Your particular model is that the length of the string (for instance) changes very little from trial to trial compared to the error introduced by your stopwatch timing. Unless we introduce other errors, this is claiming $N=1$, and if the standard deviation of your reaction timing contributes $0.1 \, \mathrm s$ to the standard deviation of the measurement, then theoretically the measurement should have that standard deviation as well.
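To make the single-source model concrete, here is a small simulation; the $40 \, \mathrm s$ true time and the Gaussian shape of the jitter are assumptions made for illustration:

```python
import random
import statistics

random.seed(2)

# The single-source (N = 1) model made concrete: the only variation in
# each recorded time is Gaussian reaction-time jitter of 0.1 s around
# the true 50-oscillation time (40 s here; both values are assumptions).
TRUE_TIME, REACTION_SD = 40.0, 0.1
times = [random.gauss(TRUE_TIME, REACTION_SD) for _ in range(10_000)]

print(round(statistics.stdev(times), 3))   # close to 0.1 s, the lone source
```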

If this conflicts with the statistics of the time you actually recorded, then the possible ways to account for this include:

  • Your reaction time isn't as good as you thought it was
  • There are other sources of experimental error

I would favor the latter, although it could be a combination of both of them.


Errors are given so as to assign a probability of deviation from the true value; they are the uncertainty in the measurement. There is some confusion about systematic errors, i.e. errors that do not come from statistical fluctuations but from the method of measurement used: "your measurement error".

Systematic errors are the dominant ones when the statistical ones become very small, as will be the case if you make very many measurements and your reaction time is left as the main uncertainty.
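As a rough sketch with the numbers from the question (a spread of $1 \, \mathrm s$ per run and a $0.1 \, \mathrm s$ reaction time), the statistical error of the mean falls below the systematic floor only after about a hundred runs:

```python
import math

# The question's numbers: 1 s spread per run, 0.1 s reaction time.
sigma_stat, sigma_syst = 1.0, 0.1

# Smallest number of runs for which the statistical error of the mean,
# sigma_stat / sqrt(n), no longer exceeds the systematic floor.
n = 1
while sigma_stat / math.sqrt(n) > sigma_syst:
    n += 1
print(n)   # 100 runs

# Even then the total error has not vanished; it is floored by the systematics:
total = math.sqrt(sigma_stat**2 / n + sigma_syst**2)
print(round(total, 3))   # ~0.141 s here, approaching 0.1 s as n grows
```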

The best way, in my opinion, is to quote both errors and not to combine them in quadrature. If the final value is the result of a fit, as was the case for the mass and width of the Z boson for example, then the systematics are taken care of during the fit and a single error is quoted.

At present, people tend to add all errors in quadrature, citing the central limit theorem, which essentially argues that systematic errors also come from random normal distributions. One should point out that this is not always the case, particularly for scale errors, as was amply demonstrated by the premature superluminal-neutrino announcement. If they had included in their errors a ±70 nanosecond systematic (from a bad connection), they would not have thought that anything exceptional had been measured.

  • Apologies, Anna, I had to downvote you because the last paragraph is just plain wrong. One doesn't need any Gaussianity of the distributions and/or the central limit theorem to use the Pythagorean formula; the only assumption one needs to prove it is the independence of the statistical and systematic errors. Your wrong formulae could have prevented OPERA from announcing a 6-sigma discovery, but the wrongness of the discovery had nothing to do with your proposed alternative rules of statistics, which are exactly as wrong as OPERA's loose fiber cable, which was the true reason of their too-high speed.
    Commented Apr 9, 2012 at 14:31
  • @LubošMotl I expected you to, because we have argued about this before. I am not proposing a formula, I am proposing to keep track of systematic errors separately from statistical, not to add them. When one knows that there is a systematic uncertainty of a given value that is larger than the statistical one, that information should not be hidden in quadrature, imo.
    – anna v
    Commented Apr 9, 2012 at 14:39
  • Dear Anna, one still needs to evaluate the confidence levels etc., and to do so, one needs to know the correct total error. So you're not solving anything by saying that people shouldn't talk about the total error. They need to. Your OPERA example is a great example of that. Moreover, the last sentence of your last comment misses the point, too. The situation in which it's very important to use the sum in quadrature, and not e.g. the total sum or, on the contrary, the greater number among the two, is when the systematic and statistical errors are of the same order. Then factors of $1/\sqrt2$ arise.
    Commented Apr 9, 2012 at 14:43
  • When you're in the opposite situation, in which either the systematic error is much greater than the statistical one or vice versa, then it is obvious what the total error approximately is. All formulae agree; it's the greater error among the two, which is also equal to the hypotenuse or the sum within the approximation that the smaller error is much smaller, anyway. That's when there's surely no dilemma about the right magnitude of the total error. The OPERA claim was an example of when the 2 errors were comparable, almost equal: $7$ and $7$ ns gave $10$ ns. They were wrong; not because of bad maths.
    Commented Apr 9, 2012 at 14:46
