
Intro

Thanks for reading.

Long question incoming - I tried to make it as complete as possible to best explain where I'm coming from conceptually, as I've talked to a lot of people in my introductory probability class and we all seem to have similar confusions.

Allow me to explain:

I'm having trouble understanding how to interpret the "Expected Value" and the "Standard Deviation" of a random variable.

I thought I understood them, but after reading the following question from "The Art of Probability", I realized I still don't really understand them.

I suggest you scroll down and read the question first. Then if you want, come back up and read the sections titled "The way I used to interpret the Expected Value" and "The way I used to interpret the Standard Deviation" for more context.

Thank you!


The way I used to interpret the Expected Value:

Say we have some random variable $X$, which is associated with the outcome of some random-experiment. Different outcomes of that experiment (each with their corresponding probabilities) get mapped to numerical values by $X$.

The expected value of $X$ is the average output of $X$ after a large number of trials of the experiment have been run.

That is, if each time we run the experiment we record the output of $X$ (call that output $x_i$ for the $i$-th trial), sum those outputs over a large number $n$ of trials, and divide by $n$, we'd get $\mathrm{E}[X]$ as $n\rightarrow \infty$:

$$\mathrm{E}[X]=\lim_{n\rightarrow\infty}\frac{x_1 + x_2 + \cdots + x_n}{n}$$
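This long-run-average picture is easy to check with a quick simulation (a sketch; the fair die and the variable names are my own illustrative choices, not part of the question):

```python
import random

random.seed(0)

# Hypothetical example: X = the roll of a fair six-sided die, so E[X] = 3.5.
n = 100_000
rolls = [random.randint(1, 6) for _ in range(n)]
sample_mean = sum(rolls) / n
print(sample_mean)  # close to 3.5
```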

Alright, cool!

Now, as for the Standard Deviation.


The way I used to interpret the Standard Deviation:

In a sense, the Standard Deviation told us how "expected" the Expected Value of our random variable really was.

The Standard Deviation is the square root of the variance, and the variance is the "expected" squared distance of the output of $X$ from its expected value $\mathrm{E}[X]$.

So, roughly speaking, the Standard Deviation measures the typical distance of the outputs of $X$ from $\mathrm{E}[X]$ (strictly, it's the root-mean-square distance rather than the plain average distance, but the intuition is similar).

A small standard deviation means we actually "expect" the values of $X$ to be around $\mathrm{E}[X]$, while a large standard deviation means we "expect" the values to be further away.

Okay...so far so good. Now on to this question which confuses the heck out of me:


The Question:

In this game, there are two players and a coin. The coin gets tossed repeatedly.

One player predicts “Heads” every time, while the other predicts “Tails.”

The player that predicts correctly gets a point, while the other loses a point.

Let $X$ be a random variable corresponding to the score of the player that chooses heads every time.

In each “trial” (toss), $X$ is $+1$ if the toss is heads, and $-1$ if the toss is tails.

We have a fair coin, so each outcome of $X$ happens with probability $\frac{1}{2}$, making the expected score of that player zero: $$\mathrm{E}[X]=\frac{1}{2}(+1)+\frac{1}{2}(-1)=0$$

However, note that this does not mean that we expect $X$ to take on a value of 0! This only means that the average score of this player after a lot of games ends up being 0.

Or at least...so I thought. Keep on reading!

The standard deviation of the game is $$\sqrt{(+1)^2\cdot\frac{1}{2}+(-1)^2\cdot\frac{1}{2}}=1,$$ which means we expect the value of $X$ to deviate by $1$ from its expected value in each game.

That makes sense. After all, if it deviates by $+1$ for half the games, and $-1$ for half the games, the score of our player after a bunch of games is $0$. However, in no specific game do we actually expect our player to get a score of $0$, as that's not even an option. Thus, the standard deviation is $1$ - the score we expect one of our players to get in each game.
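As a sanity check, the single-toss mean and standard deviation can be computed directly from the probability mass function (a minimal sketch; the dictionary representation is my own):

```python
# Single-toss score: +1 with probability 1/2, -1 with probability 1/2.
pmf = {+1: 0.5, -1: 0.5}
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = variance ** 0.5
print(mean, sd)  # 0.0 1.0
```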

Alright, now we get to the part that's confusing me.

Let’s say the players decide to toss the coin $n$ times. The $i$-th toss has the random variable $X_i$ associated with it, which gives us the points that the player choosing heads gets on the $i$-th toss. He gets $+1$ if it's heads, and $-1$ if it's tails.

Let’s define a new random variable, $Y = X_1+X_2+X_3+\cdots+X_n$.

In plain English, $Y$ is the total payoff for the player choosing heads at the end of the $n$ tosses.

If $Y$ is positive, the player choosing heads won, while if it’s negative, the other player won.

Let’s first calculate the expected value of $Y$. By the linearity of expectation, it’s just the sum of the expected values of the individual $X_i$'s.

$$\mathrm{E}[Y]=\mathrm{E}[X_1] + \mathrm{E}[X_2] + ...+ \mathrm{E}[X_n]$$ $$\mathrm{E}[Y]=0$$

Note that this does not mean that after a large number of tosses, the average score (the sum of the score in each toss for all the tosses divided by the total number of tosses) is $0$. I mean, it is $0$, but that's not what $\mathrm{E}[Y]$ is telling us.

That quantity, described above, is the expected value of $X$! The average value of $X$ over many trials of $X$!

The expected value of $Y$ is the average value of $Y$ over many trials of $Y$!

In other words, the interpretation of the expected value of $Y$ is that the two players would play an entire game, with $n$ tosses, $m$ times, where $m$ is a large number.

Each play of the game, with $n$ tosses each, is considered a “trial”. Then, after a large number of “trials”, a large number of games with $n$ tosses each, the average score would be $0$.

$$\mathrm{E}[Y]=\frac{y_1 + y_2 + ... + y_m}{m}$$

In this case, it’s easy to see that if the expected value of each $X_i$ is zero (the average score over many trials), then the average value of $Y$ must be zero as well, since each $Y$ is itself a sum of $X_i$'s.
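Treating one whole game of $n$ tosses as a single trial of $Y$, a simulation over many such games shows the average of the $y$'s settling near $0$ (a sketch; the values of $n$ and $m$ and the helper name are arbitrary choices of mine):

```python
import random

random.seed(0)

n = 100      # tosses per game (one trial of Y)
m = 20_000   # number of games played

def play_game(n):
    """Total score of the heads-player after n fair tosses."""
    return sum(random.choice([+1, -1]) for _ in range(n))

ys = [play_game(n) for _ in range(m)]
avg = sum(ys) / m
print(avg)  # close to E[Y] = 0
```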

However…now comes the question: How should we INTERPRET the expected value of $Y$?

Recall that although the expected value of $X$ was $0$, a score of $0$ was never really “expected” for any given $X$…the standard deviation was $1$, meaning that we always expected the final score of any given game to be $1$ away from $0$, $1$ away from the expected value. That made sense…

Now, what's the standard deviation of $Y$?

Well, the variance of each $X_i$ is $1$, and since the tosses are independent, the variance of $Y$ is the sum of the variances of the individual $X_i$'s (variance adds over independent random variables).

$$\mathrm{Var}[Y] = 1 + 1 + 1 + .... + 1 = n$$

That means that the standard deviation of $Y$ is $\sqrt{n}$, the square root of its variance.
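The $\sqrt{n}$ claim is easy to check empirically: the sample standard deviation of many simulated values of $Y$ should sit near $\sqrt{n}$ (a sketch; the particular $n$ and sample size are arbitrary):

```python
import random

random.seed(0)

n = 400     # tosses per game, so sqrt(n) = 20
m = 5_000   # simulated games

ys = [sum(random.choice([+1, -1]) for _ in range(n)) for _ in range(m)]
mean = sum(ys) / m
sd = (sum((y - mean) ** 2 for y in ys) / m) ** 0.5
print(sd)  # close to sqrt(400) = 20
```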

But...hold on a second.

The expected value of $X$ was what we expected the average score to be after a large number of tosses. For that, we got $0$.

$Y$ corresponds to a large number of tosses. To $n$ tosses.

The standard deviation of $Y$, which is $\sqrt{n}$, is how far away we expect the final score of our player choosing heads to be from the expected value of $Y$. We expect it to be $\sqrt{n}$ away from $0$.

But...aren't those two statements contradictory?

How is it that $\mathrm{E}[X]=0$, the average score of our player after a large number of tosses is $0$, but the standard deviation of the random variable $Y$, which corresponds to a large number $(n)$ of tosses, is $\sqrt{n}$?

How can we both expect the score to be $0$ after a large number of tosses, and expect the score to be $\sqrt{n}$ away from $0$ after a large number of tosses at the same time?
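One way to see that the two statements are compatible: the typical size of $Y$ grows like $\sqrt{n}$, while the average score per toss, $Y/n$, shrinks toward $0$. A quick simulation illustrates both at once (a sketch; the particular values of $n$ are arbitrary):

```python
import random

random.seed(0)

final = {}
for n in [100, 10_000, 1_000_000]:
    y = sum(random.choice([+1, -1]) for _ in range(n))
    final[n] = y
    # |y| tends to be on the order of sqrt(n), yet y/n heads toward 0.
    print(n, y, y / n)
```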


In conclusion...

If the two players were to actually play this game $n$ times, where $n$ is really large...what should we "expect"?!

How should the "Expected Values" of these random variables, and the "Standard Deviations" of these random variables be interpreted?!

Thanks!!

    $\begingroup$ Try to find William Feller’s probability book, volume 1. He writes excellently on the axiomatic, intuitive, and statistical perspectives of probability theory. $\endgroup$ Commented Jul 29, 2019 at 20:49

3 Answers


Your point is very interesting. I would say that both the expected value and the standard deviation make sense after some $n$ throws. In that game we would expect the average score $Y/n \rightarrow 0$, because over time you would expect to have roughly the same number of heads as tails. And the standard deviation makes sense because it gives an interval of values that have a reasonable chance of happening. However, I agree that for a single coin toss the two do not mean much on their own, but I suppose that is because the expected value and standard deviation are statements about the whole distribution that gain their intuitive meaning in large sample sizes.


I don't see any contradiction. The standard deviation indicates how far from the mean we should expect the values to fall, and the expected value, counterintuitively given its name, gives the average of the values rather than a value we should actually expect to see.

  • $\begingroup$ I don't understand though! How can the score after many games be $0$ and $\sqrt{n}$ at the same time?! What should I actually expect to happen if this game was played many times? $\endgroup$ Commented Jul 29, 2019 at 20:27
  • $\begingroup$ Each sequence of games, the result is expected to be $\sqrt n$ or $-\sqrt n$. These average out over many sequences of games to 0. $\endgroup$
    – H Huang
    Commented Jul 29, 2019 at 20:29
  • $\begingroup$ But...I thought the expected value of $X$ being $0$ meant that after a sequence of games (one sequence meaning many tosses) we expected the value to be $0$... $\endgroup$ Commented Jul 29, 2019 at 20:36
  • $\begingroup$ No, it's the average of all the possible outcomes after a sequence of games. $\endgroup$
    – H Huang
    Commented Jul 29, 2019 at 23:51

Suppose we set $n = 10$.

It is possible but unlikely that one player wins all 10 flips, scoring 10 points.

There is a $(\frac 12)^{10}$ chance of this happening.

There is a $10(\frac 12)^{10}$ chance that he scores 8 points, a $45(\frac 12)^{10}$ chance that he scores 6 points, a $120(\frac 12)^{10}$ chance that he scores 4 points, etc.

And he can score negative points just as easily.
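Those binomial coefficients can be verified directly: with $k$ heads out of $10$ flips, the score is $2k-10$ (a sketch):

```python
from math import comb

# P(score = 2k - 10) = C(10, k) * (1/2)**10 for k heads out of 10 flips.
p = {2 * k - 10: comb(10, k) * 0.5 ** 10 for k in range(11)}
print(p[10], p[8], p[6], p[4])  # 1/1024, 10/1024, 45/1024, 120/1024
```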

The mean is the central tendency of the distribution. The standard deviation and variance are both measures of its spread.

