Eccentric Millionaire Probability Paradox

Question

The following is a probability paradox I've been thinking about. It involves Bayes' rule; if you're not familiar, a good starting example is an urn that has a 50% chance of containing one black ball and one white ball, and a 50% chance of containing two black balls. If you reach in at random and pull out a black ball, it becomes more likely that there were two black balls to begin with. Specifically, there is now a 1/3 chance that the urn held one black and one white ball, and a 2/3 chance that it held two black balls.
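For readers who want to check the urn numbers, here is a minimal sketch of the Bayes update using exact fractions. The dictionary names (`prior`, `likelihood_black`, `posterior`) are illustrative and not part of the original question.

```python
from fractions import Fraction

# Prior: the two urn compositions are equally likely.
prior = {"black+white": Fraction(1, 2), "black+black": Fraction(1, 2)}

# Probability of drawing a black ball from each composition.
likelihood_black = {"black+white": Fraction(1, 2), "black+black": Fraction(1)}

# Bayes' rule: posterior is proportional to prior times likelihood.
evidence = sum(prior[u] * likelihood_black[u] for u in prior)
posterior = {u: prior[u] * likelihood_black[u] / evidence for u in prior}

for urn, prob in posterior.items():
    print(urn, prob)
```

The black+white urn ends up with posterior 1/3 and the black+black urn with 2/3, matching the figures quoted above.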

The Setup

Alice and her husband Bob are kidnapped by an eccentric millionaire who performs probability experiments on people. Without them seeing the result, a coin is to be flipped.

If heads, one of the two of them (at random) will be brought to an office with a big red button that, if pushed, will transfer \$5000 to their joint bank account. The other will be left in the holding cell.
If tails, both of them will be brought to separate offices with big red buttons, each of which removes \$2000 from their joint bank account.

They can discuss strategy ahead of time, but after the experiment starts they will be kept separate from each other. What should they do?

The Plan

Alice and Bob reason their only two strategies are to push the button or not. Not pushing the button is a net zero, and pushing the button yields an expected payoff of $1/2 \cdot (5000) + 1/2 \cdot(-4000) = 500$. Not being particularly risk averse, they decide to go ahead and press the button.
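The expected payoff of the plan can also be estimated by simulation. The following is a rough Monte Carlo sketch; the function names, the trial count, and the seed are arbitrary choices made for illustration.

```python
import random

def play_once(rng):
    """One round under the 'always press' plan; returns the change in the joint account."""
    if rng.random() < 0.5:
        # Heads: exactly one spouse reaches an office and presses.
        return 5000
    # Tails: both press, and each button removes 2000.
    return -2000 - 2000

def estimate_plan_value(trials=200_000, seed=0):
    """Average payoff over many simulated rounds."""
    rng = random.Random(seed)
    return sum(play_once(rng) for _ in range(trials)) / trials

estimate = estimate_plan_value()
print(round(estimate))
```

With this many trials the estimate should land close to the exact value of 500, though the precise figure depends on the seed.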

The Paradox

The experiment starts, and Alice is summoned to an office with a big red button. She is about to confidently press the button in accordance with their strategy, when she suddenly has second thoughts. The fact that she has been brought to an office and not left in the holding cell gives her new information. Using standard Bayes analysis, the probability the coin was heads is now $1/3$, and the probability the coin was tails is $2/3$. Thus, now the expected value of their strategy is $1/3 \cdot ( 5000) + 2/3 \cdot (-4000) = -1000$.

Suddenly, not pushing the button seems like a good strategy. What's going on? What should Alice do, which analysis was flawed, and what is wrong with the flawed analysis? If it matters to your answer, assume Bob thinks in a very similar way to Alice, and would likely make the same decision as she would.
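The Bayes update in "The Paradox" can be reproduced with a small enumeration. This is only a sketch: the `worlds` and `payoff` tables are my own encoding of the setup, assuming that everyone who is brought to an office presses.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Joint probability of (coin, who is brought to an office).
worlds = {
    ("H", "Alice only"): half * half,
    ("H", "Bob only"): half * half,
    ("T", "both"): half,
}

# Payoff to the joint account if everyone brought to an office presses.
payoff = {"Alice only": 5000, "Bob only": 5000, "both": -4000}

# Condition on Alice finding herself in an office.
alice_in_office = {w: p for w, p in worlds.items() if w[1] != "Bob only"}
p_office = sum(alice_in_office.values())
p_heads_given_office = alice_in_office[("H", "Alice only")] / p_office
ev_given_office = sum(p * payoff[w[1]] for w, p in alice_in_office.items()) / p_office

print(p_office, p_heads_given_office, ev_given_office)
```

The enumeration gives a 3/4 chance that Alice is summoned, a posterior of 1/3 for heads, and a conditional expected value of -1000, so the arithmetic in the question is sound.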

Seems like an implied reverse-Monty Hall paradox: extra posterior information appears to reduce a player's equity — smci, Commented Jun 8, 2015 at 1:39
The source for this paradox is here, where I've changed the story and added my own reasoning to it. As you can see, this is strongly related to the famous Sleeping Beauty Paradox (aka, the instant flame war: proceed at your own risk). — Tyler Seacrest, Commented Jun 8, 2015 at 5:49

Community · Accepted Answer · 2020-06-17 08:22:35Z

First of all, Alice is correct in that the expected value of the strategy (from her point of view) is now -1000. It would have been best if she had not been summoned to an office, in which case the couple would have won 5000. Since she has been summoned to an office, that possibility is ruled out. Let's look at the various possibilities. I will shorten the cases to A, B, and AB, where A means Alice was picked alone, B means Bob was picked alone, and AB means both were picked. I will use the term equity to mean the expected value of a case multiplied by the probability of that case.

A summary of the original possibilities:

A : +5000 (25% chance) (+1250 equity)
B : +5000 (25% chance) (+1250 equity)
AB: -4000 (50% chance) (-2000 equity)
Total equity: +500 (as stated in "The Plan")

A summary of the remaining possibilities once Alice is selected:

A : +5000 (33% chance) (+1667 equity)
AB: -4000 (67% chance) (-2667 equity)
Total equity: -1000 (as stated in "The Paradox")

This all makes sense so far. There was a 75% chance of Alice being selected, with expected value -1000, and a 25% chance of Alice not being selected, with expected value +5000.

-1000 * 0.75 + 5000 * 0.25 = 500 equity, which is the original equity.
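The recombination above is an instance of the law of total expectation. A minimal sketch, using the A / B / AB case names from this answer (the helper `conditional_ev` is hypothetical):

```python
from fractions import Fraction

# case -> (probability, payoff when everyone summoned presses)
cases = {
    "A": (Fraction(1, 4), 5000),
    "B": (Fraction(1, 4), 5000),
    "AB": (Fraction(1, 2), -4000),
}

def conditional_ev(names):
    """Return (expected payoff given the outcome is in `names`, probability of `names`)."""
    mass = sum(cases[n][0] for n in names)
    ev = sum(cases[n][0] * cases[n][1] for n in names) / mass
    return ev, mass

ev_selected, p_selected = conditional_ev(["A", "AB"])      # Alice is in an office
ev_not_selected, p_not_selected = conditional_ev(["B"])    # Alice stays in the cell

total = ev_selected * p_selected + ev_not_selected * p_not_selected
print(ev_selected, ev_not_selected, total)
```

The two conditional values, weighted by how often each situation arises, recover the original +500.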

So far so good. But Alice should still press the button. Why? Because her equity from pressing the button is positive:

A: +5000 (33% chance) (+1667 equity)
AB: -2000 (67% chance) (-1333 equity)
Total equity: +333

Note that in the AB case, her pressing the button only loses 2000 dollars and not 4000. Bob pressing his button and losing the other 2000 dollars is "a done deal" which results in -1333 equity, but that doesn't affect whether Alice should press her button or not.

In other words, Bob is going to press the button and lose 1333 equity. If Alice presses the button, she will regain 333 equity to make the final result -1000. If she does not press the button, the result will remain -1333 which is worse.

Edit: A commenter said, why doesn't Alice just not press the button and end up with 0 instead of -1000?

Response: Because it's always correct to press the button. Suppose Bob decided to "wussy out" and not press the button. That means that instead of Bob losing 1333 equity, he will have lost 0 equity. Alice should still press the button and gain 333 equity. Her decision is independent of Bob's decision. Also, remember that the -1000 equity doesn't mean that pressing the button is going to lose money. You always have to factor in the +5000 in case B that makes the final equity +500. The decision is between pressing the button and making +500 overall considering all 4 cases and not pressing the button and making 0.

Another way to think about it

Here's another way to think about it that might clear up the "paradox" for some people. Suppose Alice and Bob were strangers instead of a couple. And the rules were that if you press the button in heads case, you alone get 5000 dollars. And in the tails case, anyone who presses the button loses 2000 dollars from their own bank account. Now it is clear that you actually want to be picked. Because if you are, your expected return from pressing the button is (1/3) * 5000 - (2/3) * 2000 = +333 dollars. It doesn't matter that the other person might lose 2000 dollars. Their best strategy overall is also to press the button. It just happens that if you get picked, you know they will be losing 2000 dollars by pressing the button.

If she doesn't press the button, Bob doesn't press as well, so they don't lose anything. Isn't it? — leoll2, Commented Jun 7, 2015 at 9:50
I think I agree with this solution. Instinctively I know it's good to always push the button no matter what because if you look at it as a closed system with no possibility of choice (they must press the button), it's a net gain. — Quark, Commented Jun 7, 2015 at 16:09
Hard to choose a best answer with so many good ones, but I'll go with this one. Thanks for the nice response JS1. — Tyler Seacrest, Commented Jun 8, 2015 at 5:09

Mike Earnest · Accepted Answer · 2015-06-07 16:58:27Z

It is rational for both parties to press the button, causing them to win $500$ on average. Here is what is wrong with the reasoning in the Paradox section.

Once Alice is called in, she reasons that the strategy will lose them money. That is ok, since when Alice is not called in, she reasons that the strategy will win them money (when she is not called in, the strategy wins them $\$5000$). So, $3/4$ of the time, Alice will be called in and expect to lose $1000$ dollars, but this is balanced out by the $1/4$ chance that she won't get called in, winning $5000$. The expected value is $(1/4)\cdot 5000-(3/4)\cdot 1000=500$, as expected.

To sum up, the flaw is that Alice only reasons that the strategy is bad when it has started going badly, but on average, it is still good.

Another way to see it: once Alice is called in, what is the expected value of her pushing the button (ignoring what Bob does, since that does not affect the money Alice wins the team)? There is a $1/3$ chance it will gain the team $5000$, and a $2/3$ chance it will lose the team $2000$ (not $4000$, we are ignoring Bob's actions here). The expected value of Alice pushing the button is $(1/3)5000-(2/3)2000\approx 333$, so she should push it.

In short, their original strategy is right, and both analyses are right, and the only thing that was wrong was mistakenly inferring that one analysis proved the strategy was wrong. — user1502, Commented Jun 8, 2015 at 2:37

bobble · Accepted Answer · 2021-02-10 23:31:40Z

I wonder if my contribution will help at all (or whether it will simply confuse more). Hopefully the former.

As good Bayesian reasoners, Alice and Bob have correctly worked out their best strategy, namely "both push".

It is helpful to think of the probability space as containing 4 possibilities. Clearly if the coin flips to "heads" there are then two possibilities "A picked" or "B picked". Only one possible outcome occurs when "tails" is picked, but dividing it into two makes all parts of the possibility space the same size (i.e. equiprobable).

So for the strategy "both press the button" we have (I would use a table here if I could):

A	B	Coin flip	Gain
office	holding cell	HEADS	5000
holding cell	office	HEADS	5000
office	office	TAILS	-4000
office	office	TAILS	-4000
			2000 (total)

Since I have divided the space into 4, the expected gain is 2000 divided by 4 equals 500.

So the first part of the OP's reasoning is right. But if A finds herself in an office she knows that one of the 4 possibilities, worth 5000, did not come off. Her expected gain is now reduced to -1000 according to the following table:

A	B	Coin flip	Gain
office	holding cell	HEADS	5000
office	office	TAILS	-4000
office	office	TAILS	-4000
			-3000 (total)

The probability space now has size 3, so the expected gain is -3000 divided by 3, -1000.

A Bayesian should not be surprised by this. Expectations are based on probabilities and, to a Bayesian, probabilities represent states of knowledge. A now knows more than she did and, as a result, knows that their gamble is less likely to have come off. Her expectation is reduced.

But should she change her actions having found herself in the office? To see that she should not, consider that Alice and Bob can agree at the outset what their strategy will be.

There are four "pure" strategies (I leave it as an exercise to see that mixed strategies - i.e. ones where Alice or Bob randomly choose what to do - are never better). Either both don't press the button, A presses, B presses or both press. All this is on the assumption they find themselves in an office. Obviously if they are in the holding cell they can do nothing.

The four strategies give the following results:

Strategy	Payoff
Both don't press	0
Only A presses	5000/4 - 2x2000/4 = 1000/4 = 250
Only B presses	5000/4 - 2x2000/4 = 1000/4 = 250
Both press	2x5000/4 - 2x4000/4 = 2000/4 = 500

Note: if only one presses there are still four (not 3) possibilities. One of them is boring, the person designated to press is in a holding cell and nothing is gained or lost.
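The payoff table for the four pure strategies can be checked by brute force over the four equiprobable possibilities. This is a sketch under the same splitting of "tails" into two identical rows; `WORLDS` and `payoff` are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Four equiprobable rows: (Alice in office?, Bob in office?, coin).
WORLDS = [
    (True, False, "H"),
    (False, True, "H"),
    (True, True, "T"),
    (True, True, "T"),
]

def payoff(alice_presses, bob_presses):
    """Expected payoff of a pure strategy: who presses if they find themselves in an office."""
    total = 0
    for a_in, b_in, coin in WORLDS:
        presses = int(a_in and alice_presses) + int(b_in and bob_presses)
        # Heads: a press wins 5000. Tails: every press costs 2000.
        total += 5000 * presses if coin == "H" else -2000 * presses
    return Fraction(total, len(WORLDS))

table = {(a, b): payoff(a, b) for a, b in product([False, True], repeat=2)}
for (a, b), value in table.items():
    print(f"Alice presses={a}, Bob presses={b}: {value}")
```

The enumeration reproduces 0, 250, 250 and 500, so "both press" is the best of the four.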

Hopefully it is clearer if the two have an advance strategy. Having A "change her mind" on finding out more information is equivalent to deciding in advance what she will do if she finds that information out. In this problem there's only one bit of information (office or not) and the only time a choice must be made is in one of those situations "office", so the strategies can be fairly simple.

Note that if the "A presses" strategy is used, then when A finds herself in the office the expectation is now 1000/3 (roughly 333) as previously explained. Although "A presses" has an expectation of 250, once A has found herself in the office she now knows more information, so the probabilities that form part of the expectation change and her new expectation is higher (333) because she now knows that the one situation (her in the holding cell) when they would certainly gain nothing has been ruled out.

Expectation values can change as you gain more information, but that doesn't mean that strategies need to.

randomUser · Accepted Answer · 2015-06-08 01:48:47Z

The question has already been answered, but I would like to pinpoint exactly why it seems paradoxical. Let $W$ be the total winnings of Alice and Bob when they follow the plan of always pushing the button, $A$ the event that Alice is chosen and $B$ the event that Bob is chosen. Then Alice correctly reasons that $E[W|A] <0$ and for the same reason $E[W|B] < 0$. As one of Alice and Bob must be chosen, it seems intuitively clear that this means that $E[W] < 0$, but in fact the reasoning given under "The Plan" shows that $E[W] > 0$. This happens because $A$ and $B$ overlap, or in other words, the naive reasoning ignores that they can both be chosen.

Here is a similar situation. Suppose you are allowed to roll a die with the payoffs in dollars given as:

 roll: 1  2  3  4  5  6

 pays: 6 -3 -2 -2 -3  6

Consider the following questions:

Would you play the game?
If you knew that the roll was $\le 4$, would you play the game?
If you knew that the roll was $\ge 3$, would you play the game?
If you were told in advance that the roll would either be $\le 3$ or $\ge 4$, would you play the game?
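For anyone who wants to check the die game numerically, here is a short sketch. It assumes a fair six-sided die; `PAYS` and `expected_pay` are illustrative names.

```python
from fractions import Fraction

# Payoff in dollars for each face, as in the table above.
PAYS = {1: 6, 2: -3, 3: -2, 4: -2, 5: -3, 6: 6}

def expected_pay(rolls):
    """Expected payoff given that the roll is uniformly one of `rolls`."""
    rolls = list(rolls)
    return Fraction(sum(PAYS[r] for r in rolls), len(rolls))

unconditional = expected_pay(range(1, 7))   # no information about the roll
at_most_4 = expected_pay(range(1, 5))       # told the roll is <= 4
at_least_3 = expected_pay(range(3, 7))      # told the roll is >= 3

print(unconditional, at_most_4, at_least_3)
```

Both conditional expectations are negative (-1/4 each) while the unconditional one is positive (1/3). The two conditioning events overlap on rolls 3 and 4, which is why their negative values do not force a negative overall expectation.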

I really like this post. To take the analogy a bit further, consider this: Suppose Alice and / or Bob may be asked if they want to play this game, and if either says yes they collectively play it once. Alice is asked if the roll is $\leq 4$, and Bob is asked if the roll is $\geq 3$. Individually it seems bad to play the game when asked, but that ignores the fact that if roll is 3 or 4, they both agree to play but only have to pay the bad result once. — Tyler Seacrest, Commented Jun 8, 2015 at 5:22

David Hammen · Accepted Answer · 2015-06-07 18:18:28Z

The fact that she has been brought to an office and not left in the holding cell gives her new information.

There is no new information here.

There was a 100% a priori chance that one or both would be brought to an office with a button. That she was the recipient of this 100% chance adds zero information. In particular, where's Bob? She doesn't know. It is still a fifty-fifty proposition whether he's been left in the holding cell or has been brought to an office with a big red button. There would be new information had Alice been left in the holding cell. In that case, she would be quite justified in getting mightily upset at Bob had he fallen prey to the erroneous thinking presented in the question and failed to push the +5K BRB.

So what should Alice do?

Simple: Alice should press the BRB as originally planned, but she should also hope that Bob will fall prey to the logic presented in the question.

Given that Alice is not in her cell, it's 1:2 Bob is still in his cell, not 1:1. And this is new information, since it's 1:3 without the knowledge of Alice. – user1502, Commented Jun 8, 2015 at 2:06
There is indeed new information. Alice knows now that the couple are more likely to lose money. The trick here is that we are made to believe that pressing the button will lose her 4,000 when it only loses 2,000 - Bob loses the other 2,000. Using the correct numbers, the original decision is still correct. – gnasher729, Commented Jun 8, 2015 at 17:07
Another point worthy of consideration is that there's a 100% chance that everyone whose decision would matter will be brought to an office with a button. If someone is incapable of changing the game's outcome (directly, or by giving information to someone who could), no information that person receives will be relevant to the game. – supercat, Commented Jun 11, 2015 at 17:14

score 2 · Accepted Answer · 2017-12-11 00:44:11Z

There have been several answers that explain a myriad of different paths to the correct answer; IMO, none of them actually provide an adequate resolution of the paradox.

First off, let me be clear that:

Alice's correct course of action is to press the button
She has correctly reasoned that her team expects to lose \$1000 if they stick to the plan.

The flaw in Alice's reasoning is that she incorrectly accounts for the information that Bob will think the same as she does.

Suppose Alice was unsure how Bob would react, and estimates that if Bob is called, he will stick to the plan with probability $p$. Then:

With $1/3$ probability, Bob is still in his cell. If Alice sticks to the plan, they win \$5000, otherwise they break even.
With $2p/3$ probability, Bob was picked and presses the button. If Alice sticks to the plan, they lose \$4000, otherwise they lose \$2000.
With $2(1-p)/3$ probability, Bob was picked and doesn't press the button. If Alice sticks to the plan, they lose \$2000, otherwise they break even.

Adding it up, Alice reasons that the expected value of pressing the button is $\$333 - \$1333 \cdot p$, and of not pressing the button is $-\$1333 \cdot p$. Thus, Alice infers that pressing the button is correct no matter how she thinks Bob will react.
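The claim that pressing wins for every value of $p$ can be checked directly. The sketch below encodes the three cases listed above; `ev_press` and `ev_hold` are hypothetical helper names, and $p$ is passed as an exact fraction.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

def ev_press(p):
    """Team's expected payoff if Alice presses, given Bob presses with probability p when picked."""
    return THIRD * 5000 + 2 * THIRD * p * (-4000) + 2 * THIRD * (1 - p) * (-2000)

def ev_hold(p):
    """Team's expected payoff if Alice does not press."""
    return THIRD * 0 + 2 * THIRD * p * (-2000) + 2 * THIRD * (1 - p) * 0

# Check a grid of values for p between 0 and 1.
gaps = [ev_press(Fraction(k, 10)) - ev_hold(Fraction(k, 10)) for k in range(11)]
print(set(gaps))
```

The gap between pressing and not pressing is the same 1000/3 (about \$333) at every grid point, because the terms involving $p$ cancel.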

It's tempting to make a similar analysis where Bob has some probability of making the same choice as Alice, and otherwise comes up with some other random line of reasoning as above, but this is wrong: Bob doesn't have information about Alice's choice, so he can't factor that into his reasoning.

While our given information that Bob thinks very similarly to Alice means that his choice will be strongly correlated with Alice's, it must still be made independently of Alice's. Mathematically, we have

P(Bob makes the same choice as Alice) = nearly 1
P(Bob pushes | Alice pushes) = P(Bob pushes)
P(Bob doesn't push | Alice doesn't push) = P(Bob doesn't push)

When Alice and Bob's choices are truly dependent, things can change. See Charlie below.

This mistake of accounting for her knowledge of Bob can be viewed as incorrectly assigning credit/blame to her team's actions: the blame for losing \$4000 if they stick to the strategy is evenly divided between Alice and Bob. With the blame allocated correctly, she reasons that pressing the button is worth $\$5000 \cdot \frac{1}{3} - \$2000 \cdot \frac{2}{3} \approx \$333$.

It may be instructive to compare everything with another variation of the game. The millionaire kidnaps Charlie and gives him a big red button that has the same effect as if Alice and/or Bob pressed their buttons. (i.e. Charlie wins \$5000 if one was picked and loses \$4000 if both were picked)

The game the millionaire plays with Charlie is that he will reveal to Charlie whether or not Alice was picked, and then Charlie has to decide whether or not to press his button.

When Charlie learns that Alice was picked, his correct decision is not to press the button.
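Charlie's decision can be worked out with the same kind of enumeration. This is a sketch of the variant as described here, with `worlds` and `charlie_ev_press` as illustrative names; not pressing is taken to be worth 0.

```python
from fractions import Fraction

# (probability, was Alice picked?, effect of Charlie's single button)
worlds = [
    (Fraction(1, 4), True, 5000),    # heads, Alice picked alone
    (Fraction(1, 4), False, 5000),   # heads, Bob picked alone
    (Fraction(1, 2), True, -4000),   # tails, both picked
]

def charlie_ev_press(alice_picked):
    """Expected payoff of Charlie pressing, given what he is told about Alice."""
    relevant = [(p, v) for p, a, v in worlds if a == alice_picked]
    mass = sum(p for p, _ in relevant)
    return sum(p * v for p, v in relevant) / mass

print(charlie_ev_press(True), charlie_ev_press(False))
```

When Alice was picked, pressing is worth -1000 to Charlie, so he should hold; when she was not, pressing is worth +5000. Unlike Alice, Charlie's one button carries the full -4000 in the tails case.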

user41805 · Accepted Answer · 2015-06-07 09:30:00Z

I would advise Alice to:

Push the button!

Reason:

Consider the probability of where Bob will be. There is a 1/2 probability that he is in the room with the red button (R) and a 1/2 chance that he is in the holding room (H). If she factors these values into her equation, there is no paradox at all and the net profit will still be \$500!

The reason for the flaw is that she only thought of herself. Because this mad millionaire is testing on BOTH of them, she will need to consider Bob too to calculate the odds.

I hope this solves your question.

This is incorrect. Given that Alice is with a big red button, Bayes's theorem tells us that there's a 2/3 chance that Bob is too and a 1/3 chance that he's still in a cell. – Rand al'Thor, Commented Jun 7, 2015 at 9:43
Forget about Alice. Just think about Bob. Where would he be? – user41805, Commented Jun 7, 2015 at 9:47
If you forget about Alice, then there's a 75% chance Bob's in the office and 25% chance Bob's in his cell. But that fact doesn't help us properly account for our knowledge of Alice. – user1502, Commented Jun 8, 2015 at 2:06

Rand al'Thor · Accepted Answer · 2015-06-07 09:57:39Z

Alice should

not push the button.

Reason:

Let $O$ and $C$ denote office and holding cell, $A$ and $B$ Alice and Bob, and $H$ and $T$ heads and tails. Then we have conditional probabilities $p(A\in O|H)=\frac{1}{2}$, $p(A\in O|T)=1$, and hence $p(A\in O)=\frac{1}{2}\cdot\frac{1}{2}+\frac{1}{2}\cdot 1=\frac{3}{4}$, so by Bayes's theorem $p(H|A\in O)=(\frac{1}{2}\cdot\frac{1}{2}) / \frac{3}{4}=\frac{1}{3}$. I.e. given that Alice is in the office, there's a $\frac{1}{3}$ probability that the coin came up heads (i.e. that Bob is still in the cell) and a $\frac{2}{3}$ probability that it came up tails (i.e. that Bob is in front of another button).

We're told that Alice and Bob both think in the same way, so either Alice presses her button and if Bob is in front of a button, he'll press it too or Alice doesn't press her button and if Bob is in front of a button, he won't press it either. In the first case, they make an expected gain of $\frac{1}{3}(5000)+\frac{2}{3}(-4000)=-1000$; in the second, they make an expected gain of $0$, which is better.

Flaw in the other argument:

"Alice and Bob reason their only two strategies are to push the button or not" - but they don't realise that the fact of one of them having a button to push gives them new information! They're seeing themselves as a single entity, not taking into account that they'll be separated and have to consider probabilities of where the other one is based only on what they know about themselves.

I'm not sure I understand your objection in "Flaw in the other argument". Are you saying that there is a possible strategy other than "Push the button if given the opportunity" and "Do not push the button"? — Julian Rosen, Commented Jun 7, 2015 at 13:53

Jeff · Accepted Answer · 2015-06-07 15:16:39Z

I think their strategy is wrong.

When they collaborate, they should designate either Alice or Bob, but not both, as a presser, and the other as a non-presser. Suppose Alice is the designated presser; Bob is no longer part of the consideration, since we know he can't have any effect on the net payout. There is then a 50% chance that Alice makes \$5000, and a 50% chance she loses \$2000, for an expected payout of $\frac{\$5000-\$2000}{2}$ = \$1500, which is better than the \$500 expectation of the "both press" situation. This doesn't resolve the paradox, but unless I made a mistake, it does show that the paradox shouldn't even be considered, because it resulted from a strategy that never should've been followed in the first place.

EDIT: Ouch, I did bad math.

There is a 25% chance she alone is selected, and a 50% chance they're both selected, reducing the expected payout of a designated presser to $\frac{\$5000-\$2000-\$2000}{4}$ = \$250. But by symmetry, shouldn't a "both press" situation result in a \$500 gain?

"by symmetry, shouldn't a "both press" situation result in a $500 gain?" Now you know why this is called a "paradox"... — Alexander, Commented Jun 8, 2015 at 12:53
The both press situation does result in a expected value of $500 per press. — Danikov, Commented Jun 10, 2015 at 1:35

Fillet · Accepted Answer · 2015-06-08 12:17:08Z

Consider the following variant:

Alice goes to her office, checks out the wiring and spots a (deliberate, who knows?) mistake. Her button has been connected in such a way that pressing the button causes Bob's button to be pressed, provided that Bob is also in his office.

In this case, accepting the 2/3 probability of Tails given that Alice is in her office, the result of pressing her button is

1/3 * 5000 - 2/3 * 2000 - 2/3 *2000 = -1000

Which is exactly the expression given in the Paradox. If it were so wired, then she should not touch the button. But in the original question, this was not the case.

Bob's button is not pressed by any action of Alice, and it is incorrect to blame Alice for Bob pressing his button when they lose. Alice and Bob have the same strategy, and Bob's button pressing is correlated with Alice's according to the strategy, but it is not caused by her pressing like in the variant above.
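The difference between the wired variant and the original game comes down to which losses Alice's press actually causes. A minimal sketch, following the expression above and assuming that in the wired variant Bob's \$2000 loss is attributed to Alice's press (the function name is my own):

```python
from fractions import Fraction

P_HEADS = Fraction(1, 3)   # Alice in office, Bob in the cell
P_TAILS = Fraction(2, 3)   # both in offices

def effect_of_alice_press(wired):
    """Expected change in the joint account caused by Alice's own press."""
    own_effect = P_HEADS * 5000 + P_TAILS * (-2000)
    # Only in the wired variant does her press also trigger Bob's button.
    triggered_effect = P_TAILS * (-2000) if wired else 0
    return own_effect + triggered_effect

print(effect_of_alice_press(wired=True), effect_of_alice_press(wired=False))
```

In the wired variant her press is worth -1000, so she should leave the button alone; in the original game it is worth +1000/3, so she should press.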

Danikov · Accepted Answer · 2015-06-09 13:02:15Z

Alice should

still press

Reasoning

She is conflating the probability of her personal involvement with the probability of the outcome. She is involved 100% of the time when they both press to lose \$4000, but she is only involved 50% of the time when one of them presses to gain \$5000. This disparity is more obvious if you split the gains and losses and make Alice and Bob strangers, which turns the problem into a prisoner's dilemma; Alice would gain more personally by not pressing while Bob presses, but as they are cooperating, both pressing works overall to their benefit.

Her faulty math can be fixed by including the possibility of Bob pressing alone:

$P($Alice's Involvement$) \cdot EV($Alice's Calculation$) + P($Bob's involvement without Alice$) \cdot EV($Bob presses alone$)$

$3/4 \cdot (-\$1000) + 1/4 \cdot \$5000 = \$500$
