17
$\begingroup$

I've been using the formula for the arithmetic mean all my life, but I'm not sure why it works.

My current intuition is this one:

The arithmetic mean is a number that when multiplied by the number of elements, gives you the sum of all the elements. Because of this fact, it can't be more than the maximum nor less than the minimum, and it should be located somewhat around the center.

But I was wondering if there are other intuitions out there? Why does this formula work? If in passing you could talk about the weighted average as well, that would be nice too.

Thanks

$\endgroup$
16
  • 30
    $\begingroup$ What does "Why does it work?" mean here...? $\endgroup$
    – Pedro
    Commented Sep 7, 2014 at 2:11
  • 2
    $\begingroup$ Your question is hard to understand. "I guess it means proof." I don't understand what you mean by this. Proof of what? What works? $\endgroup$
    – Pedro
    Commented Sep 7, 2014 at 2:20
  • 6
    $\begingroup$ Suppose you're the person that came up with this formula. What thought process occurred? $\endgroup$
    – DLV
    Commented Sep 7, 2014 at 2:22
  • 3
    $\begingroup$ What do you mean by "work"? What are you wondering why it accomplishes? The best guess I can come up with is that you're wondering why multiplying it by the number of elements gets you the sum of the elements, but I don't think that's it, because that's blindingly obvious from the formula. $\endgroup$ Commented Sep 7, 2014 at 5:51
  • 5
    $\begingroup$ @David: I think you are missing the point...the arithmetic mean is a statistical tool, it is not something that can be "proved". $\endgroup$
    – fretty
    Commented Sep 7, 2014 at 8:23

6 Answers

42
$\begingroup$

The simplest way to explain arithmetic mean is in terms of "equal sharing":

Abe has 12 cookies, Brianna has 8 cookies, and Chuck has 7 cookies. If they were to redistribute them so that they all have the same amount, how many would each get?

Obviously the way you answer this question is to find the total amount of cookies ($12 + 8 + 7 = 27$) and then divide the cookies among the three people ($27/3 = 9$). That's precisely what the computation of arithmetic mean does.

Is that what you're looking for?


Edited to add:

Here's another viewpoint that might help. We would like to find some number $N$ that is in the "middle" of the set $ \{12, 8, 7 \}$ (using the same numbers from the example above). What does "in the middle" mean? Well, one way to interpret this vague phrase is to imagine that we already had such an $N$ in hand, and we compute the three quantities $12-N, 8-N,7-N$. These three quantities tell us how far $N$ is from each of three pieces of information -- call these the "deviations".

What if we made a bad choice of $N$? For example, if each of the three deviations were positive, then that would mean that $N$ is smaller than each of the three original numbers, which we don't want. If each of the three deviations were negative, then that would mean that $N$ is larger than each of the three original numbers -- again bad. For $N$ to be in the middle, we would want some of the deviations to be positive and some of them to be negative. In fact, if we could choose $N$ so that the positive deviations exactly cancel out the negative deviations, then we will feel like we've really found the "middle".

Let's translate that now into a computation. We want to find $N$ such that $$(12-N) + (8-N) + (7-N) = 0$$ If you now think about what it would take to solve this equation, you will quickly realize that you end up adding the three numbers in your dataset together and then dividing by 3.
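
For readers who want the algebra spelled out, here is a sketch of the same cancellation argument for a general list. The symbols $x_1, \dots, x_n$ (the data) and $w_1, \dots, w_n$ (weights) are notation assumed for this illustration, not part of the cookie example. $$\sum_{k=1}^n (x_k - N) = 0 \iff \sum_{k=1}^n x_k - nN = 0 \iff N = \frac{1}{n}\sum_{k=1}^n x_k$$ With the cookie numbers this reads $27 - 3N = 0$, so $N = 9$. If the $k$-th deviation is counted $w_k$ times (with $\sum_k w_k \neq 0$), the same steps would give $$\sum_{k=1}^n w_k (x_k - N) = 0 \iff N = \frac{\sum_{k=1}^n w_k x_k}{\sum_{k=1}^n w_k}$$ which is one way to arrive at the weighted average.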

$\endgroup$
7
  • 1
    $\begingroup$ I like your answer very much, but how does "equal sharing" represent a 'central tendency' ? Thanks. $\endgroup$
    – DLV
    Commented Sep 7, 2014 at 2:40
  • 6
    $\begingroup$ The result you get from redistributing is bound to be somewhere in the middle, because those who have more end up giving their surplus to those who have less. $\endgroup$
    – mweiss
    Commented Sep 7, 2014 at 2:42
  • $\begingroup$ Wow. Thanks a lot for the edit portion, it makes tons of sense. I was wondering how this fits in with the weighted average equation. Any ideas? Thanks a lot. $\endgroup$
    – DLV
    Commented Sep 7, 2014 at 3:00
  • 4
    $\begingroup$ A weighted average is just when instead of "Abe has 12 cookies, Brianna 8, and Chuck 7", it's "In a large group, 30% of the kids have 12 cookies, 50% of the kids have 8 cookies, and 20% have 7 cookies". $\endgroup$
    – mweiss
    Commented Sep 7, 2014 at 3:31
  • $\begingroup$ Beautiful explanation in the edit. $\endgroup$
    – layman
    Commented Nov 20, 2016 at 13:42
28
$\begingroup$

Let's take a step back. Forget you ever learned about the arithmetic mean.


Let's say you have a list of numbers. A natural question is: what is the center of this list?
To answer that, you have to ask yourself: what is a "center" in the first place?
Why, for example, is 9 not the center of the numbers {1, 2, 4, 8}?

If you think about it for a while, you will realize that the center of a list of numbers is the number $\bar x$ whose total distance from all the numbers $x_k$ in the list is minimum.
So that means you want to minimize $\sum_k \lVert x_k - \bar x \rVert$.

But how do you define $\lVert x \rVert$? A natural definition is $|x|$.
When you define it like that, you get $\bar x = $ the median. Why? Try a simple example on a piece of paper to see it visually -- the left and right side penalties cancel at the median:
[Figure omitted]
Also notice that when there is an even number of elements, any value in the interval between the two middle elements is "a median". However, by taking a limit (letting the exponent tend to 1 from above), you can single out one value rather than an interval -- in this case $10/3$, the point of $[2, 4]$ where $(x-1)(x-2) = (4-x)(8-x)$.

But you can also define $\lVert x \rVert =|x|^2$. In that case, you get $\bar x = $ the arithmetic mean:
[Figure omitted]

Why is this the arithmetic mean? The formula for this should explain:
If you have $\bar x = \arg \min_x \sum_k |x_k - x|^2$, then you can set its derivative to zero:

$$\frac{d}{d\bar x}\sum_k |x_k - \bar x|^2 = 0$$ $$\sum_k 2 (x_k - \bar x) = 0$$ $$\sum_{k=1}^n x_k = n \bar x$$ $$\bar x = \frac{1}{n} \sum_{k=1}^n x_k$$

Notice this is exactly the arithmetic mean?

This is exactly why the arithmetic mean can be a poor measure of central tendency: it penalizes deviations quadratically rather than linearly, so a single far-away value pulls it strongly.
However, it's easy to compute (try the same thing for the median to see what I mean), and has the nice property that (by definition) multiplying it by $n$ gives you the total sum.
So people use it anyway, even when it's not the right choice.
But when is it the right choice?
It's the right choice when you're looking for the "average" dependent variable rather than the "average" independent variable, so to speak.
For example, if you're looking at the wealth of the average person, then you need to look at the median wealth. This is -- by definition -- useful for understanding how wealthy the average person is. But if you're trying to understand what's happening to the wealth itself rather than the people -- i.e., you want to know the average wealth of a person -- then you need to look at the mean wealth.

Now what if we go further? We've tried $\lVert x \rVert = |x|^1$ (the median) and $\lVert x \rVert = |x|^2$ (the mean).

What if we try $\lVert x \rVert = |x|^0$? If we do, we get back the mode, assuming we define $0^0$ to be $0$ (we have to take a limit here to see what happens):
[Figures omitted]

What if we try $\lVert x \rVert = |x|^\infty$? In this case we get back the midpoint -- that is, the average of the minimum and the maximum values (again, we have to take a limit to see what happens):
[Figure omitted]

It should make sense why all of these are said to measure "central tendency". :)
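
The claims above can be checked numerically. What follows is a minimal sketch, not part of the original answer: `total_penalty`, `grid_minimizer` and `mismatches` are hypothetical helper names, and the brute-force grid search only approximates the true minimizer. It uses the same list {1, 2, 4, 8} as the question at the top of this answer.

```python
def total_penalty(data, x, p):
    """Total penalty sum_k |x_k - x|**p for a candidate center x (p > 0)."""
    return sum(abs(xk - x) ** p for xk in data)


def grid_minimizer(data, p, steps=7000):
    """Candidate with the smallest penalty on an evenly spaced grid over [min, max].

    A brute-force approximation, not an exact solver.
    """
    lo, hi = min(data), max(data)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: total_penalty(data, x, p))


def mismatches(data, x):
    """The p = 0 penalty with 0**0 taken as 0: how many entries differ from x."""
    return sum(1 for xk in data if xk != x)


data = [1, 2, 4, 8]

# p = 1: every point between the two middle values has the same penalty.
print([total_penalty(data, x, 1) for x in (2, 3, 4)])  # [9, 9, 9]

# p = 2: the grid minimizer is compared with the arithmetic mean.
print(grid_minimizer(data, 2), sum(data) / len(data))

# p slightly above 1 picks out a single point of the median interval.
print(grid_minimizer(data, 1.001))

# Large p is compared with the midpoint of the minimum and maximum.
print(grid_minimizer(data, 50), (min(data) + max(data)) / 2)

# p = 0 on a list with a repeated value: the mode has the fewest mismatches.
repeated = [1, 2, 2, 8]
print(min(set(repeated), key=lambda x: mismatches(repeated, x)))  # 2
```

The grid has 7001 points, so each search is cheap. A finer grid or a proper one-dimensional optimizer would be the natural next step if more digits were needed.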

$\endgroup$
6
  • $\begingroup$ "However, it's easy to compute (try the same thing for the median to see what I mean)." The median can be computed in linear time just like the mean en.wikipedia.org/wiki/Selection_algorithm. Of course you are correct in the sense that the optimal algorithm for mean is much more obvious. $\endgroup$ Commented Sep 8, 2014 at 3:29
  • 5
    $\begingroup$ @DanBrumleve: When I said "easy" I was not referring to time complexity. I was merely making a practical point that applied just as well a century ago. The ease of computation is the same reason why other techniques such as least-squares fitting are used, too. The computer science is irrelevant here. Nevertheless, computing the median is a tougher computational problem as well. No one knows of an algorithm for computing it in a streaming fashion using sublinear space, for example, whereas you can compute the mean on-line with logarithmic space. It's indeed an easier computational problem. $\endgroup$
    – user541686
    Commented Sep 8, 2014 at 4:18
  • 2
    $\begingroup$ "Forget all you know about AM" and then all that ... $\endgroup$ Commented Sep 8, 2014 at 7:03
  • 1
    $\begingroup$ Great answer, your insights about central tendency are very interesting. $\endgroup$ Commented Jan 14, 2018 at 12:35
  • 1
    $\begingroup$ @IbrahimNajjar: Great questions, sure thing. In the first diagram, I'm plotting y = $\sum_{k=1}^4\,\frac{1}{4}|x_k - x|^d$ for $d = 1$ and different values of $x$, where $x_k \in \{1, 2, 4, 8\}$. Thus at $x = 3$ we have $y(3) = (2 + 1 + 1 + 5)/4 = 2.25$. You would get the same shape if you take out the $1/4$ factor, just scaled up. (Sorry it's confusing though - I should've mentioned the scaling.) I'm taking $\lim_{d \to 1^+}$. Regarding the closed-form equation for the median, it's quite non-obvious, but very elegant; see here. $\endgroup$
    – user541686
    Commented Apr 4, 2022 at 20:55
11
$\begingroup$

Graph of the Arithmetic, Geometric and Harmonic Means

AMGMHM

The Arithmetic and Geometric Mean within a Semi Circle

AMGM

Arithmetic Mean

In the above image, the arithmetic mean finds the midpoint of the total sum because it divides by two: there are only two values in the sum, so it takes half of their total.

In general, the arithmetic mean divides the total sum into equal parts, regardless of how different each value is. This is represented mathematically as $$ \overline{x}=\frac{1}{n}\sum_{k=1}^n x_k=\frac{1}{n}x_1+\frac{1}{n}x_2+\dots +\frac{1}{n}x_n $$ So for example, if we want to find the arithmetic mean of $\{3, 60, 900\}$, then $$ \overline{x}=\frac{1}{3}\sum_{k=1}^3 x_k=\frac{1}{3}\cdot 3+\frac{1}{3}\cdot 60+\frac{1}{3}\cdot 900=321 $$ where $321$ represents one third of the value of the sum. Also notice that in cases such as this one, the arithmetic mean can be heavily influenced by one value that is much larger or smaller than the rest. For this reason, the arithmetic mean is not considered a robust statistic.

Weighted Mean

The weighted mean for the values $\{x_1, x_2, \dots, x_n\}$ and the weights $\{w_1, w_2, \dots, w_n\}$ is expressed mathematically as $$ \overline{x}=\frac{\sum_{k=1}^n w_kx_k}{\sum_{k=1}^n w_k}=\frac{w_1x_1+w_2x_2+\cdots +w_nx_n}{w_1+w_2+\cdots +w_n} $$ So for the weighted mean of $\{3, 60, 900\}$ with weights $\left\{\frac{6}{9}, \frac{2}{9}, \frac{1}{9}\right\}$, we have $$ \overline{x}=\frac{\frac{6}{9}\cdot 3+\frac{2}{9}\cdot 60+\frac{1}{9}\cdot 900}{\frac{6}{9}+ \frac{2}{9}+ \frac{1}{9}}=2+13.\overline{3}+100=115.\overline{3} $$ Notice that the arithmetic mean is itself a weighted mean in which every value has the same weight $\frac{1}{n}$. As seen in the above example, the selected weights can have a huge impact on the result.
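
Both computations can be reproduced with a short Python sketch. `weighted_mean` is a hypothetical helper written for this illustration, and `Fraction` is used only so that the ninths stay exact instead of being rounded.

```python
from fractions import Fraction


def weighted_mean(values, weights):
    """Weighted mean: sum(w_k * x_k) / sum(w_k)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


values = [3, 60, 900]

# Equal weights 1/n reproduce the ordinary arithmetic mean.
equal = [Fraction(1, 3)] * 3
print(weighted_mean(values, equal))  # 321

# The weights 6/9, 2/9, 1/9 from the example above.
skewed = [Fraction(6, 9), Fraction(2, 9), Fraction(1, 9)]
print(weighted_mean(values, skewed))  # 346/3

# Weights need not sum to 1; only their proportions matter.
print(weighted_mean(values, [Fraction(6), Fraction(2), Fraction(1)]))  # 346/3
```

Here $346/3 = 115.\overline{3}$. Dividing by the sum of the weights is what lets unnormalized weights such as 6, 2, 1 give the same result.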

$\endgroup$
0
6
$\begingroup$

The simplest idea behind it might be an answer to the question "if everyone had an equal share, how much would they have?"

The etymology of the word 'average' suggests this sharing is the fundamental idea, since it came to have its current mathematical meaning from calculating each party's share of the loss when cargo was damaged/lost at sea. The idea of general average describes the process of distributing the value of a maritime loss in proportion to one's cargo at risk. So this involves a weighted average calculation.

So the fact that the average times the number of elements equals the total was the original motivation for its calculation.

$\endgroup$
3
$\begingroup$

Well, it's a mean, so its value is between the min and max, it's analytically very nice (differentiable, etc.), and it's reasonably easy to compute quickly and accurately (see Kahan).
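
The reference to Kahan presumably means Kahan (compensated) summation. Here is a minimal sketch of a mean computed that way; `kahan_mean` is a hypothetical name chosen for this illustration, not a library function.

```python
def kahan_mean(values):
    """Arithmetic mean using Kahan (compensated) summation.

    The running compensation recovers low-order bits that a plain
    floating-point sum would drop at each step.
    """
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    count = 0
    for x in values:
        y = x - compensation            # apply the correction from the last step
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
        count += 1
    if count == 0:
        raise ValueError("mean of an empty sequence is undefined")
    return total / count


print(kahan_mean([12, 8, 7]))  # 9.0
print(kahan_mean([0.1] * 10))
```

Python's standard library also provides `math.fsum`, which returns an accurately rounded floating-point sum and could be used in place of the hand-written loop.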

It also is the simplest of the power means ($p=1$).

Also, it is friendly and slow to anger, unlike some mean means.

$\endgroup$
-1
$\begingroup$

An average is a single measure of a collection of things, a single overview of the data we have. For example, suppose my three friends and I go looking for apples in the woods and come back with 4, 3, 4 and 5 apples. If we decide to share them equally, we take an average: we sum the quantities and divide by the four of us. Then we each have 4 apples.

An average gives a bird's-eye view of the data. A weighted average does the same, but weighs each quantity by its share of the total. For example, to find the expected value (the mean of a random variable), we multiply each outcome by its relative frequency, i.e. the number of times it occurs divided by the total number of occurrences, and add up the results. In other words, the weights in a weighted average are frequencies expressed as fractions of the total.

$\endgroup$
