26
$\begingroup$

Is it possible to calculate the ordinary average of a sequence of numbers when I don't know the whole sequence, but each time I get a new number I know the total count of numbers and the average of all the numbers before the new one?

For example, for 2, 3, 10 the average is of course 5.

But in the last step of the calculation I only have access to the previous average of 2 and 3 (2.5), the next number (10), and the count of numbers (3).

If this is possible, how?

$\endgroup$

3 Answers

43
$\begingroup$

Yes, and you can derive it from the expression for the average. Let the average of the first $n$ numbers be $\mu_n$. The formula for it is

$$\mu_n = \frac{1}{n} \sum_{i=1}^n x_i$$

Then you can derive

$$n \mu_n = \sum_{i=1}^nx_i = x_n + \sum_{i=1}^{n-1} x_i = x_n + (n-1)\mu_{n-1}$$

and hence, dividing by $n$,

$$\mu_n = \frac{(n-1) \mu_{n-1} + x_n}{n}$$

i.e. to calculate the new average after the $n$th number, you multiply the old average by $n-1$, add the new number, and divide the total by $n$.

In your example, you have the old average of 2.5 and the third number is 10. So you multiply 2.5 by 2 (to get 5), add 10 (to get 15) and divide by 3 (to get 5, which is the correct average).
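As a minimal sketch of this update in Python (the function name `update_mean` and its argument names are illustrative choices, not something from the question):

```python
def update_mean(prev_mean, new_value, n):
    """Return the mean of n numbers, given the mean of the first n - 1
    numbers and the n-th number: mu_n = ((n - 1) * mu_{n-1} + x_n) / n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return ((n - 1) * prev_mean + new_value) / n


# The example from the question: previous mean 2.5, new number 10, count 3.
print(update_mean(2.5, 10, 3))  # 5.0
```

For the very first number, `n` is 1 and the previous mean is multiplied by zero, so any placeholder value can be passed for it.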

Note that this is functionally equivalent to keeping a running sum of all the numbers you've seen so far and dividing by $n$ whenever you want the average. From an implementation point of view, though, it may be better to compute the average as you go using the formula I gave above. For example, if the running sum ever gets larger than roughly $10^{308}$, it may be too large to represent as a standard double-precision floating point number, even though the average can be represented.

$\endgroup$
4
  • $\begingroup$ As you're computing the previous sum $(n-1)\mu_{n-1}$ as part of the formula, I think this wouldn't help if the sum gets too large. $\endgroup$
    – danijar
    Commented Jun 23, 2016 at 1:12
  • 3
    $\begingroup$ @danijar Completely true - a better approach is to keep running sums of the $x_n$ using a method that is robust to rounding error (e.g. Kahan summation) and a separate running count of $n$, and divide whenever you need the mean. That way, your error is bounded by the accuracy of your floating point type. $\endgroup$ Commented Jun 23, 2016 at 7:44
  • 1
    $\begingroup$ If you are likely to end up with numbers larger than $10^{308}$ then you either need to scale down your inputs, or use a more capacious floating point type. $\endgroup$ Commented Jun 23, 2016 at 7:46
  • $\begingroup$ In case anyone is interested, I used this concise answer to derive a JavaScript solution as part of another answer, stackoverflow.com/a/74020136/1086398, though as noted above floating point limits could be an issue in some scenarios. $\endgroup$
    – adsy
    Commented Dec 23, 2023 at 20:07
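The suggestion in the comments above (a rounding-robust running sum plus a separate count, dividing only when the mean is needed) could be sketched roughly as follows. This is an illustrative Python version of Kahan summation; the class name `KahanMean` and its attribute names are assumptions made for the example.

```python
class KahanMean:
    """Running mean from a Kahan-compensated sum and a separate count."""

    def __init__(self):
        self.total = 0.0         # compensated running sum
        self.compensation = 0.0  # estimate of the low-order bits lost so far
        self.count = 0

    def add(self, x):
        y = x - self.compensation
        t = self.total + y
        # (t - total) is what was actually added; subtracting y
        # leaves the rounding error of this step.
        self.compensation = (t - self.total) - y
        self.total = t
        self.count += 1

    def mean(self):
        if self.count == 0:
            raise ValueError("no values seen yet")
        return self.total / self.count


km = KahanMean()
for x in (2, 3, 10):
    km.add(x)
print(km.mean())  # 5.0
```

Note that compensation only reduces accumulated rounding error. It does not stop the sum itself from overflowing, which is why the later comment suggests scaling the inputs or using a wider floating point type.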
3
$\begingroup$

A very simple thought process leads to the same formula for the running average. If you have $N$ previous measurements (which could of course all be different), their average is exactly what you would get if every measurement were equal to that average value. So the running average after the $(N+1)$th measurement equals $N$ times the previously computed average, plus the $(N+1)$th measurement, all divided by $N+1$. I know that this is the same as the formula posted in the other answer, but no derivation with sums or more obscure mathematical thinking is needed (OK, maybe not really obscure).

$\endgroup$
2
$\begingroup$

What you are asking for is commonly called sequential estimation. A general approach is described in [Robbins, H. and S. Monro (1951). A stochastic approximation method. Annals of Mathematical Statistics 22, 400–407.]

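To add to the derivation of Chris Taylor, I personally like the following rewriting because it is quite intuitive and easy to remember. It follows from $\mu_n = \frac{(n-1)\mu_{n-1} + x_n}{n}$ by writing $(n-1)\mu_{n-1} = n\mu_{n-1} - \mu_{n-1}$: $$\mu_n = \mu_{n-1} + \frac{1}{n}(x_n - \mu_{n-1}) $$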

# algorithmically: sequential average computation
avg += (x_n - avg)/n
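
A slightly fuller sketch of that one-line update in Python, assuming the values arrive one at a time (the class name `RunningMean` is made up for this example). It also illustrates the overflow point raised earlier in the thread: for two inputs near $10^{308}$ the plain sum overflows to infinity, while this form never builds the sum.

```python
import math


class RunningMean:
    """Incremental mean: mu_n = mu_{n-1} + (x_n - mu_{n-1}) / n."""

    def __init__(self):
        self.n = 0
        self.avg = 0.0

    def add(self, x):
        self.n += 1
        self.avg += (x - self.avg) / self.n
        return self.avg


rm = RunningMean()
for x in (2, 3, 10):
    rm.add(x)
print(rm.avg)  # 5.0

big = RunningMean()
for x in (1e308, 1e308):
    big.add(x)
print(big.avg)                    # 1e+308
print(math.isinf(1e308 + 1e308))  # True: the plain running sum overflows
```

This is not a complete safeguard: the difference `x - self.avg` can itself overflow when the two have opposite signs and very large magnitudes.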
$\endgroup$
