44
$\begingroup$

I came across this text recently. I was confused, so I asked a friend, and she was not certain either. Can you explain, please? What is the author talking about here? Is the problem with the phrase "on average"?

Innumerable misconceptions about probability. For example, suppose I toss a fair coin 100 times. On every “heads”, I take one step to the north. On every “tails”, I take one step to the south. After the 100th step, how far away am I, on average, from where I started? (Most kids – and more than a few teachers – say “zero” ... which is not the right answer.)

In a way it is pointless to talk about misconceptions, when you don't explain the misconceptions...

Source: https://www.av8n.com/physics/pedagogy.htm Section 4.2 Miscellaneous Misconceptions, item number 5

$\endgroup$
18
  • 16
    $\begingroup$ The distance from where you start is the absolute difference. That is, if you are moving on the number line, and you are at $-3$, the distance from $0$ is $3$. Hence the answer is clearly not $0$ here. $\endgroup$
    – lulu
    Commented Jul 23, 2023 at 23:34
  • 17
    $\begingroup$ If the question asked for the expected position then the common misconception would be correct. I would classify the "misconception" as a misinterpretation of the problem. $\endgroup$
    – John Douma
    Commented Jul 23, 2023 at 23:39
  • 35
    $\begingroup$ @Masacroso No. We are not talking about the expected location after $n$ steps, we are talking about the average distance. At step 2, for example, you could be 2 steps away (TT or HH), or you could be at your starting location (HT or TH). All four outcomes are equally likely, hence the expected distance is $(2+2+0+0)/4 = 1$ unit from the starting location. More generally, after $n$ steps, there is a non-zero chance of being somewhere other than the start, hence a non-zero (positive) expected distance from the start. $\endgroup$
    – Xander Henderson
    Commented Jul 23, 2023 at 23:46
  • 28
    $\begingroup$ "Do I have a misconception about probability?" - probably ;-) $\endgroup$
    – Falco
    Commented Jul 24, 2023 at 10:14
  • 14
    $\begingroup$ I think "how far away am I, on average" is a colloquial phrasing that's reasonable to interpret as meaning the second thing, and not the first thing. But I don't like using it as a "gotcha" when the first possible meaning is there to confuse things. $\endgroup$ Commented Jul 24, 2023 at 20:03

10 Answers

38
$\begingroup$

Since the distance can't be negative, the average distance is strictly greater than zero as soon as at least one possible outcome doesn't land on zero.

Let's say you land 2 steps above zero on the first try and 2 steps beneath zero on the second try. Then on average you will have thrown heads and tails equally many times, but the average distance from zero is 2.

How you can see that is as follows: suppose you start with a variable $x$ which is initially 0 and every time you throw heads, you add 1, and every time you throw tails, you subtract 1.

Then the average value of $x$ after 100 throws is 0, since you expect to throw as many heads as tails on average.

But the average distance is the average value of $|x|$ after 100 throws. Throwing a lot of heads once in one "run" and throwing a lot of tails in another "run" cancels in the average value, but adds to the average distance.
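A quick simulation can make the difference between the two averages concrete. This is only a sketch: the function name `simulate`, the number of runs, and the seed are my own choices, not part of the original problem.

```python
import random

def simulate(n_tosses=100, n_runs=20_000, seed=0):
    """Return (average final x, average final |x|) over n_runs simulated walks."""
    rng = random.Random(seed)
    total_position = 0
    total_distance = 0
    for _ in range(n_runs):
        # Draw n_tosses fair bits at once: each 1-bit is a head (+1), each 0-bit a tail (-1).
        heads = bin(rng.getrandbits(n_tosses)).count("1")
        x = heads - (n_tosses - heads)
        total_position += x
        total_distance += abs(x)
    return total_position / n_runs, total_distance / n_runs

avg_position, avg_distance = simulate()
print(f"average x   = {avg_position:.3f}")  # hovers around 0
print(f"average |x| = {avg_distance:.3f}")  # clearly positive
```

Runs that end north and runs that end south cancel in the first number but both add to the second.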

$\endgroup$
19
  • 17
    $\begingroup$ But in the text, it says, "how far away am I, on average, from where I started" not "how far have I travelled", which I would translate to the expected location, in absolute value. I would take the absolute value of the average, not the average of the absolute value $\endgroup$ Commented Jul 24, 2023 at 9:42
  • 10
    $\begingroup$ @AnderBiguri: Not necessarily - whether you're two steps north, or two steps south, of your starting location, you're precisely 2 steps away from it. (If you were two steps due east of your start point, you'd still be 2 steps away, not zero!) But you're right that it's a matter of how the English description is interpreted in mathematical terms. $\endgroup$
    – psmears
    Commented Jul 24, 2023 at 9:46
  • 50
    $\begingroup$ I would add to the answer: This is not really a misconception of probabilities, this is mostly a misunderstanding of "distance" vs "location" in the question. It is a little like a trick question like "what is heavier a ton of feathers or a ton of lead" $\endgroup$
    – Falco
    Commented Jul 24, 2023 at 10:18
  • 6
    $\begingroup$ @AntiHeadshot, that's not how expected value works. It's like saying that expected value of normal distribution $\mathcal N(\mu,\sigma^2)$ can't be $\mu$ since the probability that $\mu$ occurs exactly is $0$. $\endgroup$
    – Ennar
    Commented Jul 24, 2023 at 11:27
  • 11
    $\begingroup$ @Falco, you are right that the probability of position to be $0$ is the biggest, but it's not true that the probability that distance is $0$ is the biggest. The probability that the distance is $0$ is $\frac 1{2^{100}}\binom{100}{50} \approx 0.0796$, while the probability that the distance is $2$ is $2\cdot\frac 1{2^{100}}\binom{100}{49}\approx 0.1561$. Intuitively, probability that we get $50$ heads is approximately the same as the probability that we get $51$ or $49$ heads, so the probability that the distance is $2$ is approximately double the probability that the distance will be $0$. $\endgroup$
    – Ennar
    Commented Jul 24, 2023 at 14:22
24
$\begingroup$

The average distance (in steps) from where you started is approximately $\sqrt{\frac{200}{\pi}}\approx 7.978845608$ and as the number of steps $N$ tends to infinity is asymptotic to $\sqrt{\frac{2N}{\pi}}$.

The "misconception" here is a confusion between the expectation of $S_N=\sum_{i=1}^NX_i$, with $X_i$ independent random variables taking values in $\{-1,1\}$ with equal probabilities $\frac{1}{2}$, and the expectation of $|S_N|$. I think this confusion is highly prevalent because we instinctively visualize position and its average, rather than distance, especially because here the symmetry of the random walk draws us to the simple "symmetric" value, $0=-0$. There is another closely related value, simpler to calculate and in some ways more natural: the square root of the expectation of $S_N^2$, that is, the standard deviation $\sigma$ of $S_N$, since $ES_N=0$.

There are $2^N$ possible paths after N steps, each with probability $2^{-N}$, thus as noted in other answers since all values of the random variable $|S_N|$ are nonnegative, it suffices to find one that is positive to prove that the average distance $E|S_N|>0$. You may take the path $\omega=(1,1,1,...,1,1)$, "all steps to the north": the final distance from where you started is $N$ which contributes $\frac{N}{2^N}$ to the expectation, so $E|S_N|\geq\frac{N}{2^N}>0$.

Hölder's inequality implies that the variance $\sigma^2_{S_N}=ES_N^2\geq (E|S_N|)^2$; and actually $ES_N^2=N$ (by independence of the $X_i$ and $\sigma_{X_i}^2=1$) while as $N\rightarrow\infty$, $(E|S_N|)^2\sim\frac{2N}{\pi}<N$.

The exact formula for $E|S_N|$ (whose proof you can find at https://mathworld.wolfram.com/RandomWalk1-Dimensional.html) is $\frac{(N-1)!!}{(N-2)!!}$ for $N$ even, and $\frac{N!!}{(N-1)!!}$ for $N$ odd. Both formulas have the same asymptotic, given above. As commented by Džuris, the average distance increases only when taking an odd-numbered step (see my reply for a direct proof that does not rely on the exact formula, which is not trivial to arrive at).

A decimal approximation of the value (rather than of the asymptotic estimate given at the beginning of this answer) is $E|S_{100}|\approx 7.95892373871787614981270502421704614$, and of course the standard deviation of $S_{100}$ is $10$.
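The double-factorial formula can be evaluated exactly with rational arithmetic and compared with the asymptotic estimate. The following is a minimal sketch; the helper names `double_factorial` and `expected_distance` are mine, and the convention $0!!=1$ is assumed.

```python
from fractions import Fraction
from math import pi, sqrt

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 or 2, with the convention 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def expected_distance(n):
    """Exact E|S_N| for N >= 1, using the double-factorial formula quoted above."""
    if n % 2 == 0:
        return Fraction(double_factorial(n - 1), double_factorial(n - 2))
    return Fraction(double_factorial(n), double_factorial(n - 1))

# Columns: N, exact E|S_N| as a float, asymptotic estimate sqrt(2N/pi)
for n in (1, 2, 3, 4, 99, 100):
    print(n, float(expected_distance(n)), sqrt(2 * n / pi))
```

The rows for $N=1,2$, for $N=3,4$ and for $N=99,100$ show equal exact values, which is the "only odd-numbered steps increase the average" remark in numerical form.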

$\endgroup$
7
  • 2
    $\begingroup$ Wow, so the even numbered steps don't change the expected distance. Makes sense when you think a bit, but it surprised me! $\endgroup$
    – Džuris
    Commented Jul 25, 2023 at 10:14
  • $\begingroup$ @Džuris thank you, good remark, I should have highlighted it. This follows from the explicit formula, but we can see it directly: note that if after $N$ steps you stand at $0$ then you can only increase your distance from it at the next step, but if after $N$ steps you stand away from $0$ then you have as much probability to get closer to it as to get farther from it. Now we can only come back to $0$ after an even number of steps $N$, thus $E|S_{N+1}|>E|S_N|$ only for even $N$. $\endgroup$
    – plm
    Commented Jul 25, 2023 at 11:26
  • 3
    $\begingroup$ Hi @galaxy-- , you can find the derivation of the exact formula in the link to Mathworld I gave above; it uses the (Legendre) multiplication formula for the Gamma function. Then you have a quotient of Gamma functions with a factor $\frac{2}{\sqrt{\pi}}$, whence you get the estimate either with an asymptotic expansion of Gamma functions or via Stirling's formula. I do not think there is a name for the asymptotic but it is related to "projection constants", as you may find in the references, e.g. ams.org/journals/tran/1960-095-03/S0002-9947-1960-0114110-9/…. $\endgroup$
    – plm
    Commented Jul 25, 2023 at 23:42
  • 1
    $\begingroup$ @plm: thank you very much for your detailed answer. $\endgroup$
    – galaxy--
    Commented Jul 27, 2023 at 23:33
  • 1
    $\begingroup$ The asymptotic $\sqrt{\frac{2N}{\pi}}$ is also the mean of the corresponding half-normal distribution with scale parameter $\sqrt{N}$, related to the central limit theorem applied to the signed distance $\endgroup$
    – Henry
    Commented Dec 6, 2023 at 18:24
22
$\begingroup$

I suspect it comes from the following plausible-sounding but flawed reasoning:

(1) the average final position is the starting point (correct)

(2) so the distance from the starting point to the average final position is zero (correct)

(3) and average distance equals distance to the average (incorrect! the former is actually a much more complicated thing, in this case, than the latter, and they are not equal)

(4) therefore, the average final distance is zero (incorrect)

$\endgroup$
1
  • 2
    $\begingroup$ Yes, I think this is the key point. It is a very common error to think that it doesn't matter at what point they take an average, and completely fail to notice that the answer they get is unreasonable. I think you would get a decent amount of people who know that the distance can't be negative and yet are happy with an argument that the average is 0, not realising that would imply it is always 0. $\endgroup$ Commented Jul 26, 2023 at 7:52
12
$\begingroup$

Is the problem with the phrase "on average"?

Perhaps. You might also be jumping to conclusions about what's being asked. The question says:

After the 100th step, how far away am I, on average, from where I started?

I'm sure you know that to calculate an average, you add all the samples and divide the total by the number of samples. Since the coin is fair, your intuition might tell you that since heads and tails occur with equal frequency, they should cancel each other out and the average distance from the starting point will be zero. However, if you think about all the possible outcomes that you could get from tossing the coin 100 times, the only ones that result in a distance of 0 steps from the start are the ones where there are exactly 50 heads and 50 tails. All the outcomes with more heads than tails or vice versa put you some distance greater than 0 steps from the start, so the average distance has to be more than 0 steps.

Here's a table with all the possible outcomes for just 4 coin tosses, along with their average distance from the start:

[Image: table of outcomes from 4 coin tosses]

As you can see, if you toss a coin 4 times and take 1 step north on heads and 1 step south on tails, on average you'll end up 1.5 steps from the start.
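A listing like the one described above can be regenerated by enumerating all 16 sequences. This is a sketch of my own and does not necessarily match the layout of the original table.

```python
from itertools import product

rows = []
for tosses in product("HT", repeat=4):
    # +1 step north for heads, -1 step (south) for tails
    position = sum(1 if t == "H" else -1 for t in tosses)
    rows.append(("".join(tosses), position, abs(position)))

print("tosses  position  distance")
for tosses, position, distance in rows:
    print(f"{tosses}    {position:+d}        {distance}")

average_position = sum(r[1] for r in rows) / len(rows)
average_distance = sum(r[2] for r in rows) / len(rows)
print("average position:", average_position)  # 0.0
print("average distance:", average_distance)  # 1.5
```

Two sequences end 4 steps away, eight end 2 steps away and six end at the start, so the average distance is $(2\cdot 4+8\cdot 2+6\cdot 0)/16=1.5$.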

$\endgroup$
1
  • 2
    $\begingroup$ As a non-mathematician, I appreciate the clarity and the use of a concrete example to highlight the misconception. $\endgroup$
    – screwtop
    Commented Jul 27, 2023 at 3:03
9
$\begingroup$

I think the idea here is that most people believe that since you are throwing a coin a "large" number of times ($100$ is not large, by the way), the heads and tails should "average out", so we should end up close to the initial point and the distance to the initial point should be $0$.

However, this is not how it turns out. What is true is that the ratio between the average distance to the origin and the total distance walked tends to $0$ as the number of coin tosses increases. The distance to the origin itself (which can take any even value from $0$ to $100$ in our case) does not average to $0$.
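One way to make this precise is the following sketch, writing $S_n$ for the position after $n$ tosses (notation I am adding here). For independent $\pm1$ steps $E(S_n^2)=n$, and $E|S_n|\leq\sqrt{E(S_n^2)}$, so

$$0<\frac{E|S_n|}{n}\leq\frac{\sqrt{E(S_n^2)}}{n}=\frac{1}{\sqrt n}\longrightarrow 0\quad\text{as } n\to\infty.$$

For $n=100$ this only says that the average distance is at most $10$ steps, i.e. at most a tenth of the $100$ steps walked; it does not say that the average distance is $0$.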

$\endgroup$
7
$\begingroup$

Flip two coins and walk a mile north for each heads, and a mile south for each tails.
How much is your average cab-fare home?

Phrased this way, we know the answer isn't "zero" because no cabby is going to pay you for walking south instead of north. Doing the math, a quarter of the time you end up 2 miles north, a quarter of the time you end up 2 miles south, and the other half of the time you've already made it home; altogether that amounts to 1 mile worth of cab-fare on average. That's not so hard, right?

So what's with the stupid "How far away am I, on average, from where I started?... (Zero is wrong...)" wording? If they didn't want to know 'where' I end up on average, they should have asked 'how far'... umm, I mean... well, that's just annoying now isn't it.


And that's why it was worded that way! The lesson isn't supposed to be "And now class, we define a single pedantic definition of what 'How far away?' is supposed to mean!" but rather the lesson is that everyday language can sometimes be vague and/or imprecise.

Or at least, that's the usual place where I see this example: in distance-versus-displacement pedagogy.

Connecting this example to "misunderstanding probability" is somewhat unusual, but it's not entirely incorrect either. In one way it illustrates how, just because you can crunch numbers and output a statistic doesn't necessarily mean that you have actually produced the useful data that you think you wanted.

How so? Well, we've determined the average displacement and average distance for this example: on average you end up back where you started, but on average you spend 1 mile worth of cab-fare. If we hadn't worked through the math and didn't know exactly what each of those two statistics signified and how they do/don't relate, we might have looked at those numbers and thought... "Why am I paying a cabby for 1 mile on average when I'm 0 miles from home on average?" or even worse "Why on Earth am I paying somebody to drive me 1 mile away from home!?". Both conclusions are total nonsense and we know why (having done all the math above) but these kinds of conclusions get made quite easily. Knowing that calculating average cab-fare required the average-distance statistic (rather than assuming that average-displacement was the 'same thing') was vital to understanding the whole situation.

Similarly, problems involving combinatorics, probabilities, and statistics will rely on a chain of logical steps being accounted for properly. What if we hadn't properly accounted for TH and HT separately? We would have thought our average cab ride was (2+0+2)/3=1.33-miles long! Probability problems can be notoriously tricky for new students precisely because it can be hard to properly account for every possibility when you barely know what adding versus multiplying probabilities means and you're just winging it.

$\endgroup$
6
$\begingroup$

We can model taking a step to the north or south as a discrete random variable $X_n\sim\begin{pmatrix}-1 & 1\\ \frac 12 & \frac 12\end{pmatrix}$. We can then look at the random variable $S_n = X_1+X_2+\ldots + X_n$, which will tell us our position after $n$ coin tosses.

If you look at $E(S_n)$, then it is indeed $0$ because $E(X_i) = 0$ and expected value is linear. However, if we want to look at distance from the origin, we should look at random variable $|S_n|$ instead.

Let me assume that $n$ is even, since in the original question we have $n = 100$. The case when $n$ is odd is dealt with similarly. The reason that we consider parity is that $S_n$ is the sum of $n$ odd numbers, so the parity of $S_n$ is the same as that of $n$. We can look at the probability that $S_n = 2k$, $k = -n/2, -n/2 + 1, \ldots, n/2$. This will happen exactly when there are $n/2+k$ coin tosses that result in heads and $n/2-k$ coin tosses that result in tails. Thus, the probability is $$P(S_n = 2k) = \frac 1{2^n}\binom n{n/2+k}.$$ The expected value of $S_n$ is then $$E(S_n) = \sum_{k=-n/2}^{n/2}\frac 1{2^n}\binom n{n/2+k}\cdot 2k$$ which we can confirm is $0$ by noting that $\binom n{n/2+k} = \binom n{n/2-k}$, so $2k$ and $-2k$ are equally likely.

The expected value of $|S_n|$, on the other hand is $$E(|S_n|) = \sum_{k=-n/2}^{n/2}\frac 1{2^n}\binom n{n/2+k}\cdot |2k|$$ which is strictly greater than $0$ since $\frac 1{2^n}\binom n{n/2+k}\cdot |2k|>0$ for all $k\neq 0$, and is $0$ for $k = 0$.

Plugging some values of $n$ in software, we get $E(|S_2|) = 1$ (which you can check by hand to convince yourself of the formula), $E(|S_4|) = 1.5$ and $E(|S_{100}|)\approx 7.96.$
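For instance, the sum above can be evaluated with a few lines of Python. This is a minimal sketch; the function name `expected_abs_position` is my own, and it only covers even $n$, as in the derivation.

```python
from math import comb

def expected_abs_position(n):
    """E(|S_n|) for even n: sum of |2k| * P(S_n = 2k) over k = -n/2, ..., n/2."""
    if n % 2 != 0:
        raise ValueError("this sketch only covers even n")
    half = n // 2
    # Keep the sum in exact integers and divide by 2^n once at the end.
    total = sum(comb(n, half + k) * abs(2 * k) for k in range(-half, half + 1))
    return total / 2**n

for n in (2, 4, 100):
    print(n, expected_abs_position(n))
```

Dropping the `abs` in the sum gives $E(S_n)$ instead, which comes out as $0$ for every $n$.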

I would say the point of the exercise is to understand the difference between $|E(X)|$ and $E(|X|)$, which are not the same in general. In fact $|E(X)|\leq E(|X|)$ is an instance of the triangle inequality.


The reason that you don't answer the question "how far am I from home on average" by looking at $S_n$, taking its expected value, and then taking the absolute value of that is, intuitively, the same reason no one would say the following: "Yesterday I walked two kilometers north from home and today I walked two kilometers south from home. On average, I was at home." One would instead say: "Yesterday I walked two kilometers north from home and today I walked two kilometers south from home. On average, I walked two kilometers from home."

$\endgroup$
1
  • $\begingroup$ You can replace $1/2^n$ with $p^{n/2+k} (1-p)^{n/2-k}$ if you want a generalized version of the above with a skewed coin with probability $p\in(0,1)$ for heads. $\endgroup$
    – Therkel
    Commented Jul 27, 2023 at 7:03
3
$\begingroup$

This is not rigorous, but perhaps thinking of it this way will help you grasp the problem: you might think of this as where the most people end up, versus the average distance.

Suppose that 98 out of 100 end back where they started, and 1 person was 1 step north, 1 person was 1 step south. That’s still 2% of the total being off, making the average distance be .02 steps. And despite the fair toss, out of a hundred you’re not going to end up with 98 of them back on zero.

$\endgroup$
2
$\begingroup$

The question tries to be kind to the reader by using non-technical language. "How far away am I?" is intended as "what is my positive distance from the start?", or "how far must I still walk?", or some technical equivalent involving $L_1$ norms. A non-technical reader can easily misinterpret it as "How far North am I?", so the kindness was misplaced. The reader was just confused. There are reasons why professionals use precisely-defined language, and there are many pitfalls involved in communicating with non-professionals.

$\endgroup$
0
$\begingroup$

Here is my approach.

Let $X_1,...,X_{100}$ be independent random variables that are uniformly distributed on $\{-1,1\}$.

You're looking to find the expected value of $|X_1+\dots+X_{100}|$.

Put $Y_j=\frac{1}{2}(X_j+1)$. Then $Y_j\sim \text{Bernoulli}(0.5)$ and it follows that $$\mathbb{E}(|X_1+\dots +X_{100}|)=\mathbb{E}\left(|2Y-100|\right)$$

where $$Y=Y_1+\dots+Y_{100}\sim \text{Binom}(100,0.5)$$ Indeed, by the law of the unconscious statistician, the expected value equals $$\sum_{k=0}^{100}|2k-100|{100 \choose k}(0.5)^{100}$$ which is around $7.96$.

$\endgroup$
