1
$\begingroup$

I'm writing some code which calculates some averages. Obviously, the traditional way to calculate any average is to add up all the values, and then divide by the number of values.

However, in the mechanism I'm working on, I find it much easier to update the average one value at a time: add each new value to the running average, then divide by two (since two numbers are being added each time). But I'm not sure how accurate that would be.

Can I calculate averages this way? Or is it not reliable?


NOTE: I began writing this question originally, and while coming up with an example, found my answer. So I added an answer with my question at the same time.

$\endgroup$
10
  • $\begingroup$ Unfortunately, it's clear that you wrote the question after exploring/grasping/finding the answer. So the question strikes many of us as disingenuous, just so you can answer it. That might explain the downvotes. $\endgroup$
    – amWhy
    Commented Jan 14, 2017 at 22:04
  • 2
    $\begingroup$ @amWhy Actually first of all, I discovered my answer while writing the question. Second of all, actually Q/A style is strongly encouraged on all Stack Exchange sites. That's why they have the option to ask a question and answer it at the same time. See here: meta.math.stackexchange.com/questions/11832/… $\endgroup$ Commented Jan 14, 2017 at 22:05
  • $\begingroup$ Well, I'm glad that writing the question was a trigger for an "aha!" moment. (I have done a lot of that in my life...trying to express, in writing, where I was stuck/clueless only to find that "aha!" moment.) I just noticed that the question and answer were posted almost simultaneously. And, FWIW, there is no consensus, at least on this site, regarding the approval of posting a question, and immediately posting an answer. In any case, please don't blame me for being the bearer of what you might find unpleasant. $\endgroup$
    – amWhy
    Commented Jan 14, 2017 at 22:10
  • $\begingroup$ For memory purposes, you might find it useful to store the number of entries and the running average. Thus if $a_n$ is the average of the first $n$ terms, we get $a_{n+1}=\frac 1{n+1}\times \left(na_n+S_{n+1}\right)$ where, of course, $S_i$ denotes the $i^{th}$ term in your data. $\endgroup$
    – lulu
    Commented Jan 14, 2017 at 22:11
  • 1
    $\begingroup$ @amWhy All Stack Exchange sites have a checkbox at the bottom of the question page to answer your own question at the same time as asking. They were literally posted at precisely the same time as each other. Other SE sites are perfectly okay with doing so. This is the first SE site which someone has said I shouldn't. $\endgroup$ Commented Jan 14, 2017 at 22:11

2 Answers

0
$\begingroup$

Here is a reference for an accurate algorithm for summing (and therefore averaging) a series of floating-point numbers:

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

If you also want to compute the variance, look here:

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

A good reference is this:

https://people.eecs.berkeley.edu/~wkahan/Math128/MeanVar.pdf
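
As a rough illustration of the first link, here is a minimal Python sketch of compensated (Kahan) summation used to compute a mean. The function name `kahan_mean` is made up for this example, and the code is only a sketch of the idea, not a tuned implementation:

```python
def kahan_mean(values):
    """Mean of an iterable of numbers using Kahan compensated summation."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    count = 0
    for x in values:
        y = x - compensation            # apply the correction from the previous step
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
        count += 1
    if count == 0:
        raise ValueError("mean of an empty sequence is undefined")
    return total / count


print(kahan_mean([1, 2, 3, 4, 5]))
```

In Python specifically, the standard library's `math.fsum` is another way to get an accurately rounded sum before dividing by the count.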

$\endgroup$
0
$\begingroup$

(I'm answering my question Q/A style)

Imagine this set of numbers:

1, 2, 3, 4, 5

The traditional method is:

((1 + 2 + 3 + 4 + 5) / 5) = 3

What you are proposing is:

((1 + 2) / 2) = 1.5
((1.5 + 3) / 2) = 2.25
((2.25 + 4) / 2) = 3.125
((3.125 + 5) / 2) = 4.0625

So no, this proposed method of calculating an average does not work.

3 <> 4.0625

Not to mention, even if it did work, it would be much slower anyway.
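
To see why the result is skewed, it may help to unroll the repeated halving. Every division by two halves the weight of everything averaged so far, so the most recent value always counts for half of the result:

$$\frac{1}{16}\cdot 1 + \frac{1}{16}\cdot 2 + \frac{1}{8}\cdot 3 + \frac{1}{4}\cdot 4 + \frac{1}{2}\cdot 5 = 4.0625$$

A true average would instead give each of the five values the same weight, $\frac{1}{5}$.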


What you could do instead is to continue adding the values together, and elsewhere keep track of the number of values. Each time you add a value, also increment the number of values added. Then, at any given point, you are able to calculate the average...

(1 + 2) = 3       C = 2     (3 / 2) = 1.5
(3 + 3) = 6       C = 3     (6 / 3) = 2
(6 + 4) = 10      C = 4     (10 / 4) = 2.5
(10 + 5) = 15     C = 5     (15 / 5) = 3

In any case, the moral of the story is that you still need to add all the values together, and then divide by the number of values. You can keep a sum of all values, but you also need to keep a count of the values which have been added to that sum.
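
The bookkeeping above could be sketched in Python roughly as follows. The names `RunningMean` and `update_mean` are invented for this example. The second function is an algebraically equivalent rearrangement of "(old average times (n-1) + new number) / n" that avoids storing the full sum:

```python
class RunningMean:
    """Keeps a sum and a count so the mean is available at any point."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        """Add one value and return the mean of everything added so far."""
        self.total += value
        self.count += 1
        return self.total / self.count


def update_mean(old_mean, n, new_value):
    """Incremental update; n is the number of values *including* new_value."""
    return old_mean + (new_value - old_mean) / n


rm = RunningMean()
for v in [1, 2, 3, 4, 5]:
    print(rm.add(v))  # 1.0, 1.5, 2.0, 2.5, 3.0
```

Either form gives the same averages as the table above; the incremental one only needs the previous mean and the count.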

$\endgroup$
3
  • 3
    $\begingroup$ No, you don't need to wait. Do it this way: (1+2)/2=1.5, then (1.5$\cdot$2+3)/3=2. Or in general, (old average times (n-1) + new number)/n, where n is the counting variable. But this is certainly not efficient. $\endgroup$
    – SAJW
    Commented Jan 14, 2017 at 23:16
  • $\begingroup$ @Socrates Indeed, edited to be more clear and generally based. $\endgroup$ Commented Jan 15, 2017 at 3:15
  • $\begingroup$ @SAJW Good solution. This also solves the issue where we might exceed the storage capacity of the variable (for example, a UINT going past 65535). $\endgroup$
    – klonq
    Commented Feb 22, 2022 at 9:21
