Let's take a step back. Forget you ever learned about the arithmetic mean.
Let's say you have a list of numbers. A natural question is: what is the center of this list?
To answer that, you have to ask yourself: what is a "center" in the first place?
Why, for example, is 9 not the center of the numbers {1, 2, 4, 8}?
If you think about it for a while, you will realize that the center of a list of numbers is the number $\bar x$ whose total distance from all the numbers $x_k$ in the list is minimum.
So that means you want to minimize $\sum_k \lVert x_k - \bar x \rVert$.
But how do you define $\lVert x \rVert$? A natural definition is $|x|$.
When you define it like that, you get $\bar x = $ the median.
Why? Try a simple example on a piece of paper to see it visually -- the left and right side penalties cancel at the median:
![enter image description here](https://cdn.statically.io/img/i.sstatic.net/MfArW.png)
Also notice that when there is an even number of elements, any value in the interval that holds the two middle elements is "a median". However, by taking an upper limit, you can find a single value rather than an interval -- which in this case is 8/3.
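If you'd rather check this numerically than on paper, here is a minimal sketch. It assumes the list {1, 2, 4, 8} from the question above and a coarse grid of candidate centers; the function name `total_abs_distance` and the grid are just illustrative choices.

```python
def total_abs_distance(center, data):
    """Sum of |x_k - center| over every x_k in the list."""
    return sum(abs(x - center) for x in data)

data = [1, 2, 4, 8]

# Candidate centers 0.0, 0.5, ..., 9.0 (multiples of 0.5 are exact in floating point).
candidates = [i / 2 for i in range(19)]
costs = {c: total_abs_distance(c, data) for c in candidates}

best = min(costs.values())
minimizers = [c for c in candidates if costs[c] == best]

print(best)        # smallest total distance found on the grid
print(minimizers)  # every grid point that achieves it
print(costs[9.0])  # total distance if you pick 9 as the "center"
```

Every grid point between the two middle elements (2 and 4) ties for the minimum, which is the "interval of medians" you get for an even number of elements, while 9 does far worse.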
But you can also define $\lVert x \rVert =|x|^2$. In that case, you get $\bar x = $ the arithmetic mean:
![enter image description here](https://cdn.statically.io/img/i.sstatic.net/LC6BK.png)
Why is this the arithmetic mean? A short derivation explains it:
If you have $\bar x = \arg \min_x \sum_k |x_k - x|^2$, then you can set its derivative to zero:
$$\frac{d}{d\bar x}\sum_k |x_k - \bar x|^2 = 0$$
$$-\sum_k 2 (x_k - \bar x) = 0$$
$$\sum_{k=1}^n x_k = n \bar x$$
$$\bar x = \frac{1}{n} \sum_{k=1}^n x_k$$
Notice that this is exactly the arithmetic mean.
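You can sanity-check the derivation with a brute-force search. This is only a sketch: it again assumes the list {1, 2, 4, 8}, and the helper name `total_squared_distance` and the quarter-step grid are my own choices for illustration.

```python
def total_squared_distance(center, data):
    """Sum of |x_k - center|^2 over every x_k in the list."""
    return sum((x - center) ** 2 for x in data)

data = [1, 2, 4, 8]

# Candidate centers 0.0, 0.25, ..., 9.0 (multiples of 0.25 are exact in floating point).
candidates = [i / 4 for i in range(37)]

# The grid point with the smallest total squared distance.
best_center = min(candidates, key=lambda c: total_squared_distance(c, data))

# The closed-form answer from the derivation: (1/n) * sum of x_k.
mean = sum(data) / len(data)

print(best_center, mean)
```

The grid search and the formula land on the same number, and unlike the absolute-value case there is a single minimizer rather than an interval.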
This is exactly why the arithmetic mean is a poor measure of central tendency.
It penalizes deviations quadratically rather than linearly.
However, it's easy to compute (try the same thing for the median to see what I mean), and has the nice property that (by definition) multiplying it by $n$ gives you the total sum.
So people use it anyway, even when it's not the right choice.
But when is it the right choice?
It's the right choice when you're looking for the "average" dependent variable rather than the "average" independent variable, so to speak.
For example, if you're looking at the wealth of the average person, then you need to look at the median wealth. This is -- by definition -- useful for understanding how wealthy the average person is. But if you're trying to understand what's happening to the wealth itself rather than the people -- i.e., you want to know the average wealth of a person -- then you need to look at the mean wealth.
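To make the distinction concrete, here is a small sketch with made-up numbers (the `wealth` list is purely hypothetical, in arbitrary units). It uses the standard-library `statistics` module.

```python
from statistics import mean, median

# Hypothetical wealth of five people, in arbitrary units (not real data).
wealth = [1, 2, 3, 4, 90]

# "Wealth of the average person": the person in the middle of the list.
typical_person = median(wealth)

# "Average wealth of a person": the total wealth split evenly.
per_person_share = mean(wealth)

print(typical_person)    # describes the people
print(per_person_share)  # describes the wealth

# The mean times n gives back the total; the median has no such property.
print(per_person_share * len(wealth) == sum(wealth))
print(typical_person * len(wealth) == sum(wealth))
```

With one very wealthy person in the list, the median stays near what most people have, while the mean tracks where the wealth itself is.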
Now what if we go further? We've tried $\lVert x \rVert = |x|^1$ (the median) and $\lVert x \rVert = |x|^2$ (the mean).
What if we try $\lVert x \rVert = |x|^0$? If we do, we get back the mode, assuming we define $0^0$ to be $0$ (we have to take a limit here to see what happens):
![enter image description here](https://cdn.statically.io/img/i.sstatic.net/Z2iTo.png)
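Here is a rough sketch of the exponent-0 case. With $0^0$ defined as $0$, each term $|x_k - \bar x|^0$ is 0 when $x_k = \bar x$ and 1 otherwise, so the total just counts the elements that differ from $\bar x$. The list below is a hypothetical one with a repeated value (the list {1, 2, 4, 8} has no repeats, so every element would tie), and `count_mismatches` is an illustrative name. Note that Python itself evaluates `0 ** 0` as 1, so the convention has to be coded explicitly.

```python
from statistics import mode

def count_mismatches(center, data):
    """Sum of |x_k - center|^0, using the convention 0^0 = 0."""
    return sum(0 if x == center else 1 for x in data)

# Hypothetical list with one repeated value.
data = [1, 2, 2, 4, 8]

# Any center that is not in the list mismatches every element,
# so only the distinct values in the list need to be tried.
candidates = sorted(set(data))
best_center = min(candidates, key=lambda c: count_mismatches(c, data))

print(best_center, mode(data))
```

The value that minimizes the count is the one that appears most often, which is the mode.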
What if we try $\lVert x \rVert = |x|^\infty$? In this case we get back the midpoint -- that is, the average of the minimum and the maximum values (again, we have to take a limit to see what happens):
![enter image description here](https://cdn.statically.io/img/i.sstatic.net/2Q9PK.png)
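A sketch of the infinite-exponent case, under one assumption about how the limit is taken: the $p$-th root of $\sum_k |x_k - \bar x|^p$ tends to the largest single distance as $p$ grows, and taking the root does not change where the minimum is, so the code below minimizes the largest distance directly. The name `largest_distance` and the grid are illustrative choices.

```python
def largest_distance(center, data):
    """The largest |x_k - center| in the list (the large-p limit of the cost)."""
    return max(abs(x - center) for x in data)

data = [1, 2, 4, 8]

# Candidate centers 0.0, 0.5, ..., 9.0.
candidates = [i / 2 for i in range(19)]
best_center = min(candidates, key=lambda c: largest_distance(c, data))

# The average of the minimum and the maximum values.
midpoint = (min(data) + max(data)) / 2

print(best_center, midpoint)
```

Only the two extreme values matter here: the 2 and the 4 could move anywhere between 1 and 8 without changing the result.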
It should make sense why all of these are said to measure "central tendency". :)