
More generally, what is the probability that n n-sided dice will turn up at least one side with the highest number ("n")?

My crude thinking is that each side has 1/6 probability, so six of them (1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6) would add up to 100% of the time, but this obviously cannot be the case.

With 2 coins, at least one T (say) turns up 3/4 of the time, i.e. 75% for a 2d2 throw. What is the general formula for NdN?


6 Answers


The probability that a die does not turn up $n$ is $1-1/n$. The probability that $n$ is not turned up by any of the $n$ dice is $(1-1/n)^n$. Subtracting this from $1$ gives the probability of at least one $n$ turning up when one throws $n$ dice, i.e.

$$p=1-(1-1/n)^n\rightarrow 1-e^{-1}$$ as $n$ goes to $\infty.$
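A quick numerical check of the formula and its limit (a minimal R sketch; note that $n=2$ reproduces the 75% two-coin case from the question):

n <- c(2, 6, 100, 1e6)
1 - (1 - 1/n)^n   # 0.750000 0.665102 0.633968 0.632121
1 - exp(-1)       # the limit: 0.6321206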

• In the case of trying to roll at least one 6 in 6 rolls of a 6-sided die (the title question), this gives approximately a 66.5% chance. (Jul 27, 2020)

• Nathan Chappell has a simpler formula in the comment to the question above. Unfortunately, I don't have the time, nor is it immediately apparent how, to reduce your formula to his. – Marcos (Jul 31, 2020)

• The inside of the parentheses is the same if you look closely, @Marcos: $$\left(1-\frac{1}{n}\right)=\left(\frac{n-1}{n}\right)$$ – gunes (Jul 31, 2020)

• @gunes: Ah, yes, thanks. I came to the same conclusion after writing that comment: an equivalence I hadn't seen before in algebra. Rather, I hadn't guessed to write $1 = n/n$ and then rearrange terms. It's a common algebraic trick, but I'd never had the context to use it myself. – Marcos (Aug 1, 2020)

• If P(not turning up $n$) $= p$ for one die, then for $n$ dice it is $p^n$, which was my second sentence. – gunes (Aug 1, 2020)

The event $A:=$ "at least one die turns up on side $n$" is the complement of the event $B:=$ "all dice turn up on non-$n$ sides".

So $P(A)=1-P(B)$. What's $P(B)$?

All dice are independent, so

$$ P(\text{all $n$ dice turn up on non-$n$ sides}) = P(\text{a single die turns up non-$n$})^n = \bigg(\frac{n-1}{n}\bigg)^n.$$

So

$$ P(A) = 1-\bigg(\frac{n-1}{n}\bigg)^n. $$

Trust but verify. I like to do so in R:

> nn <- 6
> n_sims <- 1e5
> sum(replicate(n_sims,any(sample(1:nn,nn,replace=TRUE)==nn)))/n_sims
[1] 0.66355
> 1-((nn-1)/nn)^nn
[1] 0.665102

Looks good. Try this with other values of nn. Here is a plot:


nn <- 2:100
plot(nn, 1-((nn-1)/nn)^nn, type="o", pch=19, ylim=c(1-1/exp(1),1))  # P(A) for n = 2, ..., 100
abline(h=1-1/exp(1), col="red")                                     # the limit 1 - 1/e

We note that in the limit the probability is

$$ P(A) = 1-\bigg(\frac{n-1}{n}\bigg)^n =1-\bigg(1-\frac{1}{n}\bigg)^n \to 1-\frac{1}{e}\approx 0.6321206 \quad\text{as }n\to\infty, $$

by the standard limit $\left(1+\frac{x}{n}\right)^n\to e^x$ as $n\to\infty$, applied with $x=-1$.


Answers by @StephanKolassa (+1) and @gunes (+1) are both fine. But this problem can be solved with reference to binomial and Poisson distributions as follows:

If $X_n$ is the number of $n$s seen in $n$ rolls of a fair $n$-sided die, then $X_n \sim \mathsf{Binom}(n, 1/n),$ so that $P(X_n \ge 1) = 1 - P(X_n = 0)= 1-(1-1/n)^n.$

As $n\rightarrow\infty,$ one has $X_n \stackrel{d}{\rightarrow} Y \sim\mathsf{Pois}(\lambda=1)$ (convergence in distribution), with $P(Y \ge 1) = 1 - P(Y = 0) = 1 - e^{-1}.$
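A quick check in R (a minimal sketch using the built-in binomial and Poisson densities):

n <- 6
1 - dbinom(0, size = n, prob = 1/n)   # exact binomial: 0.665102
1 - dpois(0, lambda = 1)              # Poisson limit: 0.6321206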


The answer can be arrived at by purely counting the described events as well, although the accepted answer is more elegant. We'll consider the case of the die, and hopefully the generalization is obvious. We'll let the sample space be all sequences of numbers from $\{1,2,...,6\}$ of length $6$. Here are a few examples (chosen at random):

3 2 3 5 6 1
1 1 2 5 2 4
1 2 1 1 6 3
4 4 3 3 4 2
6 1 1 6 3 4
6 3 5 4 5 1

The point is, our space has a total of $6^6$ outcomes, and due to independence we take each of them to be as probable as any other (uniformly distributed). We need to count how many sequences have at least one $6$ in them. We partition the sequences we are counting by how many $6$'s appear, so consider first the case that exactly one $6$ appears. How many ways can this happen? The $6$ may appear in any position ($6$ different positions), and the other $5$ positions can each hold any of $5$ different symbols (from $\{1,2,...,5\}$). So the total number of sequences with exactly one $6$ is $\binom{6}{1}5^5$. Similarly, for the case of exactly two $6$'s there are exactly $\binom{6}{2}5^4$ such sequences. Now it's time for fun with sums:

$$ \sum_{k=1}^6 \binom{6}{k}5^{6-k} = \sum_{k=0}^6 \binom{6}{k}5^{6-k}1^k - 5^6 = (5+1)^6 - 5^6$$

To get a probability from this count, we divide by the total number of events:

$$ \frac{6^6 - 5^6}{6^6} = 1 - (5/6)^6 = 1 - (1-1/6)^6 $$

I think that this generalizes pretty well, since for any $n$ other than $6$ the exact same argument holds: just replace each occurrence of $6$ with $n$, and $5$ with $n-1$.

It's also worth noting that this number $5^6 = \binom{6}{0}5^6$ is the contribution of sequences in which no $6$ occurs, and is much easier to calculate (as used in the accepted answer).
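The count can be verified directly in R (a minimal sketch of the sum above):

k <- 1:6
sum(choose(6, k) * 5^(6 - k))   # 31031 sequences containing at least one 6
6^6 - 5^6                       # 31031, the same count
(6^6 - 5^6) / 6^6               # 0.665102, the probability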


I found BruceET's answer interesting, relating the problem to the number of events. An alternative way to approach this problem is to use the correspondence between waiting time and number of events. The benefit is that the problem can then be generalized in some directions more easily.

Viewing the problem as a waiting time problem

This correspondence, as for instance explained/used here and here, is

For the number of dice rolls $m$ and number of hits/events $k$ you get: $$\begin{array}{ccc} \overbrace{P(K \geq k| m)}^{\text{this is what you are looking for}} &=& \overbrace{P(M \leq m|k)}^{\text{we will express this instead}} \\ {\small\text{$\mathbb{P}$ of $k$ or more events in $m$ dice rolls}} &=& {\small\text{$\mathbb{P}$ of needing at most $m$ rolls for $k$ events}} \end{array} $$

In words: the probability of getting at least $k$ events (e.g. rolling a 6 at least once) within $m$ dice rolls equals the probability of needing $m$ or fewer rolls to get those $k$ events.

This approach relates many distributions.

Distribution of                 Distribution of 
Waiting time between events     number of events

Exponential                     Poisson
Erlang/Gamma                    over/under-dispersed Poisson  
Geometric                       Binomial
Negative Binomial               over/under-dispersed Binomial

So in our situation the waiting time follows a geometric distribution. The probability that the number of rolls $M$ needed to get the first $n$ is less than or equal to $m$ (given that the probability of rolling $n$ is $1/n$) is the following CDF of the geometric distribution:

$$P(M \leq m) = 1-\left(1-\frac{1}{n}\right)^m$$

and we are looking at the situation $m=n$, so we get:

$$P(\text{an $n$ is rolled within $n$ rolls}) = P(M \leq n) = 1-\left(1-\frac{1}{n}\right)^n$$
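This can be checked against R's built-in geometric distribution (a minimal sketch; pgeom counts failures before the first success, hence the n - 1):

n <- 6
pgeom(n - 1, prob = 1/n)   # P(M <= n): 0.665102
1 - (1 - 1/n)^n            # the direct formula, same value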

Generalizations, when $n \to \infty$

The first generalization is that for $n \to \infty$ the distribution of the number of events becomes Poisson with rate $\lambda$, and the waiting time becomes exponential with rate $\lambda$. So the probability of at least one event before time $t$ in this Poisson dice-rolling process becomes $1-e^{-\lambda t}$, and with $\lambda = t = 1$ we get the same $\approx 0.632$ result as the other answers. This generalization is not yet so special, as it only reproduces the other results, but for the next one I do not see directly how the generalization could work without thinking about waiting times.
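In R the two limiting views give the same number (a minimal sketch: number of events vs. waiting time):

1 - ppois(0, lambda = 1)   # P(at least 1 event in unit time): 0.6321206
pexp(1, rate = 1)          # P(first event arrives before t = 1): same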

Generalizations, when dice are not fair

You might consider the situation where the dice are not fair. For instance, one time you roll a die that has probability 0.17 of rolling a 6, and another time you roll a die that has probability 0.16. This means that the 6's cluster around the dice with positive bias, and that the overall probability of rolling a 6 within 6 turns will be less than the $1-1/e$ figure. (It means that, based on the average probability of a single roll, say estimated from a sample of many rolls, you cannot determine the probability over many rolls with the same die, because you need to take into account the correlation of rolls within a die.)

So say a die does not have a constant probability $p = 1/n$; instead, $p$ is drawn from a beta distribution with mean $\bar{p} = 1/n$ and shape parameter $\nu$:

$$p \sim Beta \left( \alpha = \nu \frac{1}{n}, \beta = \nu \frac{n-1}{n} \right)$$

Then the number of events for a particular die rolled $n$ times follows a beta-binomial distribution, and the probability of 1 or more events is:

$$P(k \geq 1) = 1 - \frac{B(\alpha, n + \beta)}{B(\alpha, \beta)} = 1 - \frac{B\left(\nu \frac{1}{n},\, n +\nu \frac{n-1}{n}\right)}{B\left(\nu \frac{1}{n},\, \nu \frac{n-1}{n}\right)} $$

I can verify computationally that this works:

### roll an n-sided die n times, with its success probability drawn from
### a Beta(nu/n, nu*(n-1)/n) distribution; report whether at least one "n" came up
rolldice <- function(n, nu) {
  p <- rbeta(1, nu*1/n, nu*(n-1)/n)
  k <- rbinom(1, n, p)
  k > 0
}

### compute the average over a sample of dice
meandice <- function(n, nu, reps = 10^4) {
  sum(replicate(reps, rolldice(n, nu))) / reps
}
meandice <- Vectorize(meandice)

### simulate and compare with the beta-binomial formula for varying nu
set.seed(1)
n <- 6
nu <- 10^seq(-1, 3, 0.1)
y <- meandice(n, nu)
plot(nu, 1 - beta(nu*1/n, n + nu*(n-1)/n) / beta(nu*1/n, nu*(n-1)/n),
     log = "x", xlab = expression(nu), ylab = "fraction of dice",
     main = "comparing simulation (dots) \n with formula based on beta (line)",
     cex.main = 1, type = "l")
points(nu, y, pch = 21, col = "black", bg = "white")

But I have no good way to solve this expression analytically for $n \to \infty$.

With waiting times, however, I can express the limit of the beta-binomial distribution (which is now more like a beta-Poisson distribution) through a variance in the exponential factor of the waiting times.

So instead of $1-e^{-1}$ we are looking for $$1- \int e^{-\lambda}\, p(\lambda) \,\mathrm{d}\lambda.$$

Now that integral term is related to the moment-generating function (evaluated at $t=-1$). So if $\lambda$ is normally distributed with mean $\mu = 1$ and variance $\sigma^2$, then we should use:

$$ 1-e^{-(1-\sigma^2/2)} \quad \text{instead of} \quad 1-e^{-1}$$
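A quick simulation check of this expression (a minimal sketch: it draws $\lambda$ from a normal distribution, for which the MGF identity is exact; the caveat, as discussed below, is that a normal distribution for a rate is itself only an approximation, since rates cannot be negative):

set.seed(1)
sigma <- 0.3
lambda <- rnorm(1e6, mean = 1, sd = sigma)
1 - mean(exp(-lambda))      # simulated 1 - E[exp(-lambda)]
1 - exp(-(1 - sigma^2/2))   # the formula above: 0.6152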

Application

These dice rolls are a toy model; many real-life problems have variation, i.e. situations with not-completely-fair dice.

For instance, say you wish to study the probability that a person gets sick from a virus given some contact time. One could base calculations on experiments that estimate the probability of a transmission (e.g. some theoretical work, or lab experiments measuring the number/frequency of transmissions in an entire population over a short duration), and then extrapolate this transmission rate to an entire month. Say you find that the rate is 1 transmission per month per person; then you could conclude that $1-1/e \approx 63\%$ of the population will get sick. However, this might be an overestimate, because not everybody gets sick/transmits at the same rate. The percentage will probably be lower.

However, this is only true if the variance is very large. For this, the distribution of $\lambda$ must be very skewed: although we expressed it as a normal distribution above, negative values are not possible, and distributions without negative values will typically not have large ratios $\sigma/\mu$ unless they are highly skewed. A situation with high skew is modeled below.

Now we use the MGF of a (scaled) Bernoulli distribution, because we model the rate as either $\lambda = 0$ with probability $1-p$ or $\lambda = 1/p$ with probability $p$ (which keeps the mean rate at 1); the integral above then becomes $(1-p) + p\,e^{-1/p}$.

Bernoulli example

### simulate one person: the rate lambda is 0 with probability 1-p and
### 1/p with probability p (so the mean rate stays equal to `rate`);
### report whether at least one event occurred within the given time
getsick <- function(rate, CV = 0.1, time = 1) {
  p <- 1/(CV^2 + 1)
  lambda <- rbinom(1, 1, p) / p * rate
  k <- rpois(1, lambda * time)
  k > 0
}

set.seed(1)
CV <- seq(0, 2, 0.1)
plot(-1, -1, xlim = c(0, 2), ylim = c(0, 1),
     xlab = "coefficient of variation", ylab = "fraction", cex.main = 1,
     main = "if rates are Bernoulli distributed \n fraction p with lambda/p and 1-p with 0")
for (cv in CV) {
  points(cv, sum(replicate(10^4, getsick(rate = 1, cv, time = 1)))/10^4)
}

p <- 1/(CV^2 + 1)
lines(CV, 1 - (1-p) - p*exp(-1/p), col = 1)   # computed fraction that gets sick
lines(CV, p, col = 2, lty = 2)                # fraction with a nonzero rate

legend(2, 1, c("simulation", "computed", "fraction of susceptible population"),
       col = c(1, 1, 2), lty = c(NA, 1, 2), pch = c(1, NA, NA), xjust = 1, cex = 0.7)

The consequence: say $n$ is large and you have no way to observe $n$ rolls of each die (e.g. it takes too long), and instead you screen many different dice for rolls of $n$ during only a short time. Then you could count the dice that rolled an $n$ during this short time and, based on that, compute what would happen over $n$ rolls. But you would not know how strongly the events correlate within each die: it could be that you are dealing with a high probability concentrated in a small group of dice, instead of a probability evenly distributed among all dice.

This 'error' (or, you could say, simplification) relates to the situation with COVID-19, where the idea goes around that we need 60% of the people immune in order to reach herd immunity. However, that may not be the case. The current infection rate is estimated from only a small group of people, and it may be an indication of the infectiousness only within that small group.

• I must admit that my last case is actually very trivial and does not need this larger framework and tricks with moment-generating functions; it can be done more easily by just rescaling the population. (Jul 31, 2020)

• ...Except, since the number of rolls is finite, one can enumerate all possible outcomes. This is n^n for an n-sided die. Once one has done this, it's a matter of counting which rolls contain an n. – Marcos (Jul 31, 2020)

• @Marcos my approach is for people that do not like counting (e.g. imagine $n$ were a million or larger: it is still finite, but good luck counting). Also, how are you going to count if the dice are not fair but instead follow some continuous distribution? That said, I agree your specific question can be answered easily; the method here is just an additional view that can be useful for more complex problems. (Jul 31, 2020)

• True, it requires a "meta-numeric" formula for counting the digits of numbers (as symbols rather than quantities) themselves. – Marcos (Jul 31, 2020)

• @Marcos I am not sure what you mean by counting now. I do not consider this a counting problem when $n$ gets large. You can count things and maybe create expressions that summarize the counting, but you may not always get to a solution (easily). Flipping the point of view, from the probability of a given number of events in a given number of rolls to the number of rolls (waiting time) needed for a given number of events, can be a useful tool to greatly simplify a problem (though for this specific problem, which is not so difficult, I agree it is not really necessary). (Jul 31, 2020)

Simplify and then extend. Start with a coin. A coin is a die with 2 sides (S=2).

The exhaustive probability space is

T | H

Two possibilities. One satisfies the condition of all heads. So your odds of all heads with one coin (n=1) are 1/2.

So try two coins (n=2). All outcomes:

TT | TH | HT | HH

Four possibilities. Only one matches the all-heads criterion. It is worth noting that the probability of one coin being heads and the other tails is 2/4, because two of the four possibilities match that criterion. But there is only one way to get all heads.

Add one more coin (n=3)...

TTT | THT | HTT | HHT

TTH | THH | HTH | HHH

8 possibilities. Only one fits the criterion, so a 1/8 chance of all heads.

The pattern is (1/S)^n or (1/2)^3.

For dice S = 6, and we have 6 of them.

Probability of getting a 6 on any given roll is 1/6. Rolls are independent events. So using 2 dice getting all 6's is (1/6)*(1/6) or 1/36.

(1/6)^6 is 1 in 46,656.
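An exhaustive enumeration in R confirms the pattern for the 3-coin case (a minimal sketch):

outcomes <- expand.grid(rep(list(c("H", "T")), 3), stringsAsFactors = FALSE)
nrow(outcomes)                                        # 8 = 2^3 possibilities
mean(apply(outcomes, 1, function(x) all(x == "H")))   # 0.125 = (1/2)^3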

• I was actually just looking for the probability of ANY heads showing up, given as many rolls as there are faces on the die. But, cheers mate, I got my answer. – Marcos (Jul 28, 2020)
