55
$\begingroup$

In one of his interviews (clip link), Neil deGrasse Tyson discusses a coin toss experiment. It goes something like this:

  1. Line up 1000 people and give each a coin, to be flipped simultaneously.
  2. Ask everyone to flip; whoever gets heads continues.
  3. Whoever gets tails is out.
  4. The game continues until one person remains.

He says the "winner" should not feel too surprised or lucky because there would be another winner if we re-run the experiment! This leads him to talk about our place in the Universe.

I realised, however, that there need not be a winner at all, and that the winner should feel lucky and be surprised! (The last, say, three people could all flip tails.)

Then, I ran an experiment by writing a program with the following parameters:

  1. Bias of the coin: 0.0001 to 0.8999 (8999 values)
  2. Number of people: 10000
  3. Number of runs per bias: 1000

I plotted the probability of exactly one winner against the bias:

[Plot: probability of exactly one winner vs. coin bias]

The plot was interesting: it zig-zags for low bias (towards heads) and is smooth after about p = 0.2. (Also, there is roughly a 73% chance of a single winner for a fair coin.)

Is there an analytic expression for the function $$f(p) = (\textrm{probability of $1$ winner with a coin of bias $p$}) \textbf{?}$$

I tried doing something and got this far: $$ f(p)=P\left(\sum_{i=0}^{\text{end}} X_i=N-1\right) $$ where $X_i=\operatorname{Binomial}\left(N-\sum_{j=0}^{i-1} X_j,\, p\right)$ and $X_0=\operatorname{Binomial}(N, p)$.
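For reference, here is a minimal R sketch of one run of the experiment (a simplified stand-in for my actual program; the helper name is just illustrative):

# returns TRUE iff the game ends with exactly one player remaining
one_winner <- function(n, p) {
  while (n > 1) {
    survivors <- rbinom(1, n, p)       # players who flip heads stay in
    if (survivors == 0) return(FALSE)  # everyone flipped tails: no winner
    n <- survivors
  }
  TRUE
}
# estimate f(p) for a fair coin, 10000 people, 1000 runs:
mean(replicate(1000, one_winner(10000, 0.5)))  # ~0.72 (the "73%" above, up to simulation noise)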

$\endgroup$
6
  • 5
    $\begingroup$ This is a nice question. I took the liberty of embedding the images and tweaking the format to make it a little clearer what the question is. I encourage you to typeset your work instead of using an image for it. $\endgroup$ Commented May 24 at 18:55
  • 9
    $\begingroup$ I was very skeptical about the zig-zag behavior until I did my own simulation, but I used 10 times your replications and it didn't go away. $\endgroup$
    – heropup
    Commented May 24 at 19:21
  • 2
    $\begingroup$ "He says the "winner" should not feel too surprised or lucky" - Even though he kind of phrased it that way, I think his point was not that the winner should not feel lucky, but that people should not feel too surprised that there is a winner at all, in the sense that if you have enough trials and wait a long enough time, an event that's unlikely in the context of a single trial might have an appreciable probability in this wider context. $\endgroup$ Commented May 25 at 14:57
  • 1
    $\begingroup$ Related: Uniqueness of longest run of Bernoulli experiment and answer; also this. $\endgroup$
    – r.e.s.
    Commented May 25 at 18:19
  • 16
    $\begingroup$ @FilipMilovanović: I like Matt Parker's phrasing: "It's amazing if you win the lottery. That's incredible. It's not amazing if someone wins the lottery." $\endgroup$
    – Kevin
    Commented May 25 at 19:51

6 Answers

40
$\begingroup$

It is known (to a nonempty set of humans) that when $p=\frac12$, there is no limiting probability. Presumably the analysis can be (might have been) extended to other values of $p$. Even more surprisingly, the reason I know this is because it ends up having an application in number theory! In any case, a reference for this limit's nonexistence is Primitive roots: a survey by Li and Pomerance (see the section "The source of the oscillation" starting on page 79). As the number of coins increases to infinity, the probability of winning (when $p=\frac12$) oscillates between about $0.72134039$ and about $0.72135465$, a difference of about $1.4\times10^{-5}$.

$\endgroup$
31
$\begingroup$

There is a pretty simple formula for the probability of a unique winner, although it involves an infinite sum. To derive the formula, suppose that there are $n$ people and that you continue tossing until everyone is out, i.e., until everyone has flipped tails. Then you want the probability that at the last time before everyone was out, there was just one person left.

If $p$ is the probability that a coin-flip is heads, the probability that this happens after $k+1$ steps with just person $i$ left is the probability that person $i$ survives for $k$ steps, which is $p^k$, times the probability that they are out on the $(k+1)$st step, which is $1-p$, times the probability that none of the other $n-1$ people survive for $k$ steps, which is $1-p^k$ for each of them. Multiplying these probabilities together gives $$ p^k (1-p) (1-p^k)^{n-1} $$ and summing over the $n$ possible choices for $i$ and all $k$ gives $$ f(p)= (1-p)\sum_{k\ge 1} n p^k (1 - p^k)^{n-1}.\qquad\qquad(*) $$ (I am assuming that $n>1$, so $k=0$ is impossible.)
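As a quick numerical check (an illustrative sketch, not part of the derivation): evaluating (*) directly in R, using log1p to avoid underflow in $(1-p^k)^{n-1}$, reproduces the two oscillation extremes quoted in the answer above.

# evaluate (*) for p = 1/2 at n = 2^20 and n = 2^20.5 (rounded)
f_star <- function(p, n, kmax = 200) {
  k <- 1:kmax
  (1 - p) * sum(n * p^k * exp((n - 1) * log1p(-p^k)))
}
f_star(0.5, 2^20)           # ~0.72135, near the upper end of the oscillation
f_star(0.5, round(2^20.5))  # ~0.72134, near the lower end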

Now, the summand in (*) can be approximated by $n p^k \exp(-n p^k)$, so if $n=p^{-L-\epsilon}$ with $L\ge 0$ large and integral and $0\le \epsilon \le 1$, $f(p)$ will be about $$ (1-p) \sum_{j\ge 1-L} p^{j-\epsilon} \exp(-p^{j-\epsilon}), $$ and we can further approximate this by summing over all integers: if $L$ becomes large and $\epsilon$ approaches some $0\le \delta \le 1$, $f(p)$ will approach $$ g(\delta):=(1-p)\sum_{j\in\Bbb Z} p^{j-\delta} \exp(-p^{j-\delta}). $$ The average of this over $\delta$ has the simple formula $$ \int_0^1 g(\delta)\, d\delta = (1-p)\int_{\Bbb R} p^x \exp(-p^x)\, dx = -\frac{1-p}{\log p}, $$ which is $1/(2 \log 2)\approx 0.72134752$ for $p=\frac12$; but, as others have pointed out, $g(\delta)$ oscillates, so the large-$n$ limit of $f(p)$ does not exist.

You can expand $g$ in a Fourier series to get $$ g(\delta)=-\frac{1-p}{\log p}\left(1+2\sum_{n\ge 1} \Re\left(e^{2\pi i n \delta} \,\,\Gamma\left(1 + \frac{2\pi i n}{\log p}\right)\right) \right). $$ Since $\Gamma(1+ri)$ falls off exponentially as $|r|$ becomes large, the peak-to-peak amplitude of the largest oscillation will be $$ h(p):=-\frac{4(1-p)}{\log p} \left|\Gamma\left(1+\frac{2\pi i}{\log p}\right)\right|, $$ which, as has already been pointed out, is $\approx 1.426\cdot 10^{-5}$ for $p=1/2$. For some smaller $p$ it will be larger, although for very small $p$ it will decrease, as the value of the gamma function approaches $1$ and $|\log p|$ increases. This doesn't mean that the overall oscillation disappears, though, since other terms in the Fourier series then become significant.

To illustrate this, here are some graphs of $g(\delta)$. From top to bottom, the $p$ values are $0.9$, $0.5$, $0.2$, $0.1$, $10^{-3}$, $10^{-6}$, $10^{-12}$, and $10^{-24}$. As $p$ becomes small, $$ g(0)=(1-p)\sum_{j\in\Bbb Z} p^{j} \exp(-p^{j}) \qquad\text{approaches}\qquad p^0 \exp(-p^0) = \frac 1 e. $$

[Graphs of $g(\delta)$ for the $p$ values listed above]
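For a quick numerical check of $h(p)$, one can use the identity $|\Gamma(1+bi)|^2=\pi b/\sinh(\pi b)$ for real $b$, since base R's gamma function does not accept complex arguments. A small sketch (the function name is just illustrative):

# peak-to-peak amplitude h(p) of the dominant oscillation
h <- function(p) {
  b <- abs(2 * pi / log(p))                             # |2*pi / log p|
  -4 * (1 - p) / log(p) * sqrt(pi * b / sinh(pi * b))   # |Gamma(1 + bi)| via sinh
}
h(0.5)  # ~1.43e-05, matching the amplitude quoted above
h(0.1)  # noticeably larger, consistent with the wider swings at small p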


$\endgroup$
1
  • 2
    $\begingroup$ It's answers like this that make me remember people on this site are really freakin' smart. $\endgroup$
    – msantama
    Commented May 28 at 0:28
12
$\begingroup$

An extended comment:

I doubt there is an analytic expression, though there will be a recursion for a finite number of starting players. I also suspect there may not be a limit for a given probability of heads as the number of starting players increases.

Here are the probabilities for up to $10000$ starting players when using a fair coin $(p=0.5)$. The rounded probabilities for ending up with a single remaining player are close to $0.72135$ but the chart is not persuasive that there is a single limit.

I suspect your zig-zag observations result from this fluctuation: $10000$ starters may be close to the top of the curve for some probabilities of heads and close to the bottom for nearby probabilities of heads.

[Plot: probability of a unique winner vs. number of starting players, p = 0.5]

drawn with R:

pheads <- 0.5   # probability of heads (i.e. of surviving a round)
maxn <- 10000   # largest number of starting players
pwin <- 1       # pwin[1] = 1: with one player left, there is a winner
for (n in 2:maxn){
  # condition on the state changing (not all n players flipping heads);
  # dropping to 0 players contributes nothing, since that path has no winner
  pwin[n] <- sum(dbinom(1:(n-1), n, pheads) * pwin) / (1 - dbinom(n, n, pheads))
}
midright <- mean(pwin[(maxn/2):maxn])
rangeright <- max(pwin[(maxn/2):maxn]) - min(pwin[(maxn/2):maxn])
plot(pwin, ylim=c(midright-rangeright, midright+rangeright), type="l")

With a smaller probability of heads, say $0.1$, the range of the fluctuations is substantially wider:

[Plot: probability of a unique winner vs. number of starting players, p = 0.1]

$\endgroup$
5
    $\begingroup$ Hey, thanks! I tried simulating it for a fixed probability and a variable number of people, and found similar plots. In the case of a fixed number of people and variable probability, the behaviour was consistent after p=0.2; I expected it to fluctuate as well. It still begs the question of what makes it more likely for, say, 4000 people to have a winner, while 5000 people are less likely, and 6000 people are more likely again. $\endgroup$ Commented May 24 at 20:23
  • 2
    $\begingroup$ For $p=0.5$, $1/p=2$, and $2^{10}=1024$, $2^{11}=2048$, $2^{12}=4096$, so these values are 'high' in the first graph, and the low points are $2^{9.5}$, $2^{10.5}$, etc. For any $p$, with $n$ people at the beginning, if $\ln(n)/\ln(1/p)$ is close to an integer $k$, the probability of reaching $1$ is high, and we should reach $1$ after about $k$ steps. $\endgroup$
    – Lourrran
    Commented May 24 at 20:30
  • 4
    $\begingroup$ MysticMickey: Suppose you look at a probability of heads of $\frac1{10}$. Then with $0$ players left you lose, with $1$ player left you win, with $2$ players left you have a probability of $\frac{2}{11} \approx 0.18$ of winning. If you start with $10$ people you are quite likely to end up with $1$ player after a single round and then win; if you start with $20$ people you are quite likely to end up with $2$ players after a single round and then lose. You might see something broadly similar with $100$ and $200$ starting people but an extra round. And more generally as @Lourrran suggests. $\endgroup$
    – Henry
    Commented May 24 at 20:37
    $\begingroup$ Thanks @Lourrran and @Henry. This also explains why the probability fluctuates less as p increases! $\endgroup$ Commented May 24 at 20:52
  • 1
    $\begingroup$ These intuitions about the fluctuations seem worth including in the answer itself. $\endgroup$
    – David K
    Commented May 26 at 15:31
7
$\begingroup$

It seems like there are already great answers to this question, but I just wanted to take a shot at it.

When I watched the video clip about the coin flipping game, the first things that came to mind were the following questions:

  • In a game, what happens if all players get tails on the first flip (i.e., in round 1)? Does that mean there is no winner?
  • In some other game, what if there are 41 players left in round 5, and then in round 6 all of them flip tails? Does that also mean there is no winner?

To me, it seems that in this coin flipping game, there might not always be a winner.

I was curious and wrote an R simulation myself.

First, I defined a function to run the coin flipping game that was described in the video:

simulate_game <- function(num_players, p_heads) {
  results <- data.frame(player = paste0("player", 1:num_players))
  status <- rep("playing", num_players)
  seen_tails <- rep(FALSE, num_players)
  round <- 1
  
  # simulation: everyone flips each round; a player is out permanently
  # after their first tails
  while (sum(status == "playing") > 1) {
    coin_flips <- ifelse(runif(num_players) < p_heads, "heads", "tails")
    seen_tails <- ifelse(coin_flips == "tails", TRUE, seen_tails)
    current_round <- ifelse(status == "playing", coin_flips, "eliminated")
    status <- ifelse(seen_tails == TRUE, "eliminated", status)
    results <- cbind(results, current_round)
    colnames(results)[ncol(results)] <- paste0("round_", round)
    round <- round + 1
  }
  
  # if exactly one player is left playing, declare them the winner;
  # if none are left (all remaining players flipped tails together),
  # the game ends with no winner
  if (sum(status == "playing") == 1) {
    winner_index <- which(status == "playing")
    results[winner_index, ncol(results) + 1] <- "winner"
    results[-winner_index, ncol(results)] <- "eliminated"
    colnames(results)[ncol(results)] <- paste0("round_", round)
  }
  
  # number of winners in the final round: 1 or 0
  num_winners_last_round <- sum(results[,ncol(results)] == "winner")
  return(num_winners_last_round)
}

run_simulations <- function(num_players, num_simulations, p_values) {
  avg_num_winners_list <- numeric(length(p_values))
  # (the original post is cut off here; the loop below is reconstructed to
  # match how run_simulations and final_results are used further down)
  for (i in seq_along(p_values)) {
    winners <- replicate(num_simulations, simulate_game(num_players, p_values[i]))
    avg_num_winners_list[i] <- mean(winners)
  }
  data.frame(p = p_values, avg_num_winners = avg_num_winners_list)
}

This code looks at the average number of winners in a set of games. The way I see it, an individual game has either $1$ winner or $0$ winners, so in a set of $n$ games the average number of winners must lie in $[0,1]$.

Now, I repeated this simulation for $100$ players and different values of the success probability of the coin:

num_players <- 100
num_simulations <- 100
p_values <- seq(0.01, 0.99, by = 0.01)
final_results <- run_simulations(num_players, num_simulations, p_values)
print(final_results)

Finally, I plotted the results:

library(ggplot2)
ggplot(final_results, aes(x = p, y = avg_num_winners)) +
    geom_line() +
    labs(title = "Average Number of Winners vs Probability of Heads",
         x = "Probability of Heads (p)",
         y = "Average Number of Winners") +
    theme_minimal()

I also got a similar zig-zag shaped plot:

[Plot: average number of winners vs. probability of heads]

After doing all this work, I realized that my answer does not really provide anything of value to this question ... there are no mathematical derivations in my answer, and previous users have submitted similar simulations that seem to be far more efficient ... but in the spirit of mathematics and curiosity, I wanted to share my ideas!

Great job to the OP, to everyone who posted an answer, and to everyone who read along and thought about these ideas!

$\endgroup$
2
  • 1
    $\begingroup$ "There are no mathematical derivations in my answer..." That's okay. Some people follow logic more easily in code. And it's good to try expressing ideas in different formats and examining from various angles. I see nothing wrong with it. $\endgroup$
    – Mentalist
    Commented May 27 at 8:03
    $\begingroup$ @Mentalist: thank you for your support! $\endgroup$
    – konofoso
    Commented May 27 at 14:33
7
$\begingroup$

Simulations are well and good, but if you want analytic results, you need to treat this problem as a Markov chain. The states are given by the number of remaining people, from $0$ up to the number you start with, say $n$.

Observe that both $0$ and $1$ are absorbing states. We want to determine the probability that state $k$ will end up in the absorbing state $1$ eventually. Since the state transitions are totally ordered (no person can re-enter the game once they're out), this can be solved straightforwardly with backward induction.

Observe furthermore that the transition probabilities between states are given by the binomial distribution. A transition from $n$ to $k$ with $n\geq k$ occurs in each step with probability $B_{n,p;k}=\binom{n}{k}p^k(1-p)^{n-k}$, where $p \in (0,1)$ denotes the probability that one coin lands heads, keeping the player alive.

With this, we can begin the inductive process. Let $w_k$ denote the probability that state $k$ will eventually end up in state $1$, leading to a unique winner of the game. We have:

$w_0 = 0$

$w_1 = 1$

$w_2 = B_{2,p;2}w_2+B_{2,p;1}w_1 = p^2w_2+2p(1-p)\cdot 1 \iff w_2=\frac{2p(1-p)}{1-p^2}=\frac{2p}{1+p}$

...

$w_k = \sum_{i=1}^kB_{k,p;i}w_i \iff w_k=\frac{\sum_{i=1}^{k-1}B_{k,p;i}w_i}{1-B_{k,p;k}}$

For any finite $n$, this yields a closed-form solution, since it involves just finitely many individual calculation steps. So $w_{1000}$ does have a closed-form solution, even though it is probably too large to feasibly write down. This is not to say that $w_n$ has a general closed-form solution in $n$; it may or may not, though my hunch is no. The method above requires a growing number of calculation steps as $n$ increases.
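As an illustrative sketch (floating point only; reproducing the exact rational number below would require a computer algebra system), the backward induction takes a few lines of R:

# numerical backward induction for w_k
unique_winner_prob <- function(n, p) {
  w <- c(0, 1)   # w_0 = 0, w_1 = 1 (R indices offset by one)
  for (k in 2:n) {
    i <- 1:(k - 1)
    w[k + 1] <- sum(dbinom(i, k, p) * w[i + 1]) / (1 - dbinom(k, k, p))
  }
  w[n + 1]
}
unique_winner_prob(100, 0.5)  # ~0.721, the decimal value of the fraction below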

The exact result e.g. for $p=1/2, n=100$ is:

62679617659887749988746213470851836113741964913902528922511448994328489763653864096767507935863398677311629114459544046285444888961576667322773922896895955781629729326922709710409021226156576946745722296797445217949518127366802648645133203052528983810483052457946476329495357314994988638375847044238900492391360621033700774719462413038092652895803131646129048816076558232952576535312196097213171545111287268383567849506192874997518701482940352790757985554110002281105656215802412515704365255563332555125783287017146039591371543167955227203712515273940563538530755798767563527683205743066644698878477451964888492675903548799151412895334864392216825070288126612996964792640010845874081668809189902256199168241965557644040471946691718628884314486897229211150174267701947584366283678796661270270637735818712753244225381519788228409191793751176162361892606423059087638822626342274264795739573188/86892064279492453953188039648187133334016414403874605165793915801213956772656166517539872814657519043503730828124464647673414852936009548075772881250175976560263407433825694658916832128958154801811841312392669310267956415090926520793270433785315356585513778150072989636862246760300455530349881323266836600414981325512923835697446109043370344650170033834847719739217764180203723309543189563413799270524852652673694333430784981501004469385298881607442883928128633834952061989755616387861671386393210001155450626048953943271575507984786499318923920444756186659511566552332139050931523720014886264749259665189174978677777661807919432101541602177880774148088728531361612430544375585266902505009650854008931537438449984971759466224084364168665063565901302922285567268108340098687899223563586659152304055907858269892070717015278007341200438037828696715806927256523797229075194867698563806844410715

and the formula for general $p$ for $n=10$ already blows up to:

10*p*(p^30 - 7*p^29 + 32*p^28 - 75*p^27 + 121*p^26 - 146*p^25 + 200*p^24 - 229*p^23 + 222*p^22 - 197*p^21 + 293*p^20 - 309*p^19 + 334*p^18 - 334*p^17 + 393*p^16 - 346*p^15 + 393*p^14 - 334*p^13 + 334*p^12 - 309*p^11 + 293*p^10 - 197*p^9 + 222*p^8 - 229*p^7 + 200*p^6 - 146*p^5 + 121*p^4 - 75*p^3 + 32*p^2 - 7*p + 1)/(p^31 + 2*p^30 + 5*p^29 + 9*p^28 + 16*p^27 + 25*p^26 + 38*p^25 + 53*p^24 + 72*p^23 + 92*p^22 + 114*p^21 + 135*p^20 + 155*p^19 + 171*p^18 + 183*p^17 + 189*p^16 + 189*p^15 + 183*p^14 + 171*p^13 + 155*p^12 + 135*p^11 + 114*p^10 + 92*p^9 + 72*p^8 + 53*p^7 + 38*p^6 + 25*p^5 + 16*p^4 + 9*p^3 + 5*p^2 + 2*p + 1)

Not much hope of finding succinct simplifications here.

$\endgroup$
5
$\begingroup$

There are already a bunch of great answers: [1] gives a formula for $f(p)$ in terms of an infinite series and, moreover, explains its oscillating behavior. I intend to elaborate on the Markov chain approach used by [2], but in matrix form instead of recurrence relations, showing that it is possible to reproduce the formula for $f(p)$ in terms of an infinite series, and also as a finite sum of rational functions.

Let's start with a coin toss game with $N$ coins (players) and only one absorbing state, the one with zero coins. Let $p$ be the probability of getting heads (i.e., of a player staying in after one round). The transition probabilities are described by the binomial distribution: $$T_{i j} = P(i\,\text{coins} \rightarrow j\,\text{coins}) = \binom{i}{j} p^j (1 - p)^{i - j},\quad i = 0, 1, \ldots, N,\quad j = 0, 1, \ldots, i$$

The transition matrix of this game is multiplicative, in the sense that $$T(p_1) T(p_2) = T(p_1 p_2).$$ The proof is fairly direct.

Notice that $T$ is a block lower-triangular matrix $$T(p) = \begin{pmatrix} A & 0 \\ C & D \end{pmatrix}$$ where $A$ and $D$ are square blocks. The diagonal blocks do not depend on the other blocks when such matrices are multiplied: $$\begin{pmatrix} A & 0 \\ C & D \end{pmatrix} \begin{pmatrix} A^\prime & 0 \\ C^\prime & D^\prime \end{pmatrix} = \begin{pmatrix} A A^\prime & 0 \\ C A^\prime + D C^\prime & D D^\prime \end{pmatrix}$$

Now consider the coin toss game from the question, in which the state with one coin is also absorbing. Its transition matrix is almost the same, except that it has the $2\times 2$ identity matrix as its top left block, reflecting that the states with zero coins and one coin are both absorbing: $$T^\prime(p) = \begin{pmatrix} I_2 & 0 \\ C & D \end{pmatrix}$$

Let $\pi_0 = \begin{pmatrix} 0 & 0 & \ldots & 1 \end{pmatrix}$ be the initial probability distribution (the state with $N$ coins). To obtain its evolution after $l$ steps, we multiply it by a power of the transition matrix ($D^l$ vanishes in the limit because, as a truncated transition matrix, its eigenvalues are all less than $1$ in absolute value): $$\pi_l = \pi_0 T^{\prime \, l}$$

$$ T^{\prime \, l} = \begin{pmatrix} I_2 & 0 \\ \sum_{k=0}^{l-1} D^k C & D^l \end{pmatrix} \xrightarrow{l\rightarrow\infty} T^{\prime \, \infty} = \begin{pmatrix} I_2 & 0 \\ \sum_{k=0}^{\infty} D^k C & 0 \end{pmatrix} $$

So the $f(p)$ in question is given by $(T^{\prime\,\infty})_{N, 1}$. And because the blocks $D$ and $C$ are the same as in the original transition matrix, we can exploit the multiplicativity of $T$: the entries of $D^k C$ differ from the corresponding ones in the block inside $T^k T = T^{k + 1}$ by a single term, namely
$$ T^k T = \begin{pmatrix} A^{k} A & 0 \\ \underbrace{\left(\sum_{m=0}^{k-1} D^m C A^{k - 1 - m}\right) A}_{\text{extra term}} + D^k C & D^{k} D \end{pmatrix} $$ \begin{align} (D^k C)_{N, 1} &= \sum_{r = 2}^{N} (D^k)_{N, r} C_{r, 1} = \sum_{r = 1}^{N} (T^k)_{N, r} T_{r, 1} - \underbrace{(T^k)_{N, 1} T_{1, 1}}_\text{from extra term} =\\ &= (T^{k + 1})_{N, 1} - (T^k)_{N, 1} T_{1, 1} = \binom{N}{1} p^{k + 1} (1 - p^{k + 1})^{N - 1} - \binom{N}{1} p^{k} (1 - p^{k})^{N - 1} p =\\ &= N p^{k + 1} \left((1 - p^{k + 1})^{N - 1} - (1 - p^{k})^{N - 1} \right) \end{align}

Now we obtain $f(p)$ as an infinite series (it converges thanks to the $p^k$ factor) after some rearrangements: \begin{align} f(p) &= \sum_{k=0}^{\infty} N p^{k + 1} \left((1 - p^{k + 1})^{N - 1} - (1 - p^{k})^{N - 1} \right) =\\ &= \sum_{k=0}^{\infty} N p^{k + 1} (1 - p^{k + 1})^{N - 1} - \sum_{k=1}^{\infty} N p^{k + 1} (1 - p^{k})^{N - 1} =\\ &= \sum_{k=1}^{\infty} N p^{k} (1 - p^{k})^{N - 1} - \sum_{k=1}^{\infty} N p^{k + 1} (1 - p^{k})^{N - 1} =\\ &= \sum_{k=1}^{\infty} N p^{k} (1 - p) (1 - p^{k})^{N - 1} \end{align} which matches the formula in [1].

Let's expand the power and sum over $k$ (one may start the geometric series at $k=0$, since the added $k=0$ terms contribute $N(1-p)\sum_{r} \binom{N-1}{r} (-1)^r = 0$): \begin{align} f(p) &= \sum_{k=1}^{\infty} N p^{k} (1 - p) \sum_{r=0}^{N-1} \binom{N-1}{r} (-1)^r p^{r k} =\\ &= \sum_{r=0}^{N-1} N \binom{N-1}{r} (-1)^r \frac{1 - p}{1 - p^{r + 1}} \end{align}
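As a sanity check, the finite sum can be compared in R against a truncation of the infinite series; note that the alternating binomial sum cancels catastrophically as $N$ grows, so this sketch is only advisable for modest $N$:

N <- 20; p <- 0.3
r <- 0:(N - 1)
finite_sum <- sum(N * choose(N - 1, r) * (-1)^r * (1 - p) / (1 - p^(r + 1)))
k <- 1:200
series_sum <- sum(N * p^k * (1 - p) * (1 - p^k)^(N - 1))
c(finite_sum, series_sum)  # the two values agree to many digits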

This finite sum is related to the recurrence relations that produced the large, impractical expressions mentioned in [2]. Indeed, reducing the fractions in the finite sum to a common denominator would yield a huge rational function.

We can obtain another series representation after rearranging terms in the infinite sum: $$ f(p) = \sum_{k=1}^{\infty} N (1 - p) \sum_{r=0}^{N-1} a_{r + 1} p^{k (r + 1)} $$ where $ a_{r + 1} = \binom{N-1}{r} (-1)^r $

After expanding both sums we get \begin{alignat*}{5} a_1 p^1 &{}+{}& a_1 p^2 &{}+{}& a_1 p^3 &{}+{}& a_1 p^4 &{}+{}& \ldots &{}+ \\ &{}+{}& a_2 p^2 & & &{}+{}& a_2 p^4 &{}+{}& \ldots &{}+ \\ & & &{}+{}& a_3 p^3 & & &{}+{}& \ldots & \\ \end{alignat*} and finally $$ f(p) = N (1 - p) \sum_{k=1}^{\infty} p^k \sum_{\substack{d \mid k \\ d \leq N}} a_d $$ This one looks interesting because it features number-theory-like summation over divisors (a quick numerical check is sketched below). Perhaps playing around with this expression can provide further insight, but I've reached the limits of what I know at this point.
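An illustrative R sketch of that check (with an arbitrary truncation kmax of the series):

N <- 20; p <- 0.3; kmax <- 200
a <- choose(N - 1, 0:(N - 1)) * (-1)^(0:(N - 1))  # a_1, ..., a_N
divisor_coef <- sapply(1:kmax, function(k) {
  d <- seq_len(min(k, N))
  sum(a[d[k %% d == 0]])  # sum of a_d over divisors d of k with d <= N
})
c(N * (1 - p) * sum(p^(1:kmax) * divisor_coef),
  sum(N * p^(1:kmax) * (1 - p) * (1 - p^(1:kmax))^(N - 1)))  # agree closely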

$\endgroup$
