4
$\begingroup$

In a room of $n$ people, what is the probability that at least $k \in \{1, 2, \dots, n\}$ people all share a common birthday?

For example, what is the probability that at least 5 people all share the same birthday in a room of a 1000 people.

I have found a formula on Wikipedia for solving this problem when $k = 2$ but I would like to be able to solve for $n > 365$ and $k > 2$.

I know that $p(k \text{ people share the same birthday}) = 1$ for $k = \lceil\frac{n}{365}\rceil$ but I don't know how to solve beyond this point. I assume there are $365^n$ different ways the birthdays of $n$ people can be distributed within a year hence I am struggling to reach an answer using a combinatorial approach.

I have researched this problem online but most answers stop at the probability of one or more shared birthday in a group of 60 people max.

$\endgroup$

2 Answers 2

6
$\begingroup$

Unfortunately the reason no one wants to do this calculation is that it's messy. This is $1$ minus the probability that every birthday is shared by at most $k-1$ people. To compute this, let $f_{n, k}$ be the number of ways $n$ people can have birthdays such that every birthday is shared by at most $k-1$ people, so that we want to compute $1 - \frac{f_{n, k}}{365^n}$. General results imply that $f$ has an exponential generating function

$$\boxed{ \sum_n f_{n, k} \frac{t^n}{n!} = \left( \sum_{i=0}^{k-1} \frac{t^i}{i!} \right)^{365} }.$$

For example, to find the probability for $k = 5, n = 1000$ as you ask for we need the coefficient of $t^{1000}$ in $\left( 1 + t + \frac{t^2}{2} + \frac{t^3}{6} + \frac{t^4}{24} \right)^{365}$. As far as I know, this is about as good as it gets for the exact count. You should be able to expand this out in a good computer algebra system but it's a little too big for WolframAlpha.

So instead of asking for the exact count let's ask for an approximation. There is a Poisson approximation available here, generalizing the one discussed on Wikipedia: the number of people having a given birthday is approximately Poisson with mean $\frac{n}{365}$, and moreover for different birthdays these Poisson random variables are approximately independent. This means that the probability that every birthday is shared by at most $k-1$ people is approximately

$$\frac{f_{n,k}}{365^n} \approx \left( \sum_{i=0}^{k-1} \frac{ \left( \frac{n}{365} \right)^i}{i!} e^{- \frac{n}{365}} \right)^{365} = \boxed{ e^{-n} \left( \sum_{i=0}^{k-1} \frac{ n^i}{365^i i!} \right)^{365} }.$$

This is actually much better; in particular for $k = 5, n = 1000$ this is now a computation we can easily ask WolframAlpha to do. It gives

$$\frac{f_{1000, 5}}{365^{1000}} \approx e^{-1000} \left( 1 + \frac{1000}{365} + \frac{1000^2}{365^2 2!} + \frac{1000^3}{365^3 3!} + \frac{1000^4}{365^4 4!} \right)^{365} \approx \boxed{ 3 \times 10^{-25} }$$

so the probability that in a group of $1000$ people at least $5$ share the same birthday is very close to $1$.

To get a better sense of how these numbers behave we could ask: for a given $n$, such as $n = 1000$, about what value of $k$ do we need for this probability to no longer be very small like this? In other words, what can we say about the distribution of the largest cluster of people sharing the same birthday? In the Poisson approximation this is equivalent to asking: what is the distribution of the maximum of $365$ iid Poisson random variables with mean $\frac{n}{365}$? It's known that this distribution clusters closely around

$$\frac{\log 365}{W \left( \frac{\log 365}{e \frac{n}{365} } \right)}$$

where $W$ is the Lambert W function, which is complicated-looking but for large $n$ grows roughly like $\frac{en}{365}$ and in any case for $n = 1000$ evaluates to about $\boxed{12}$. This could be tested experimentally by generating some random birthday distributions, as well as by substituting $k = 10, 11, 12, 13, 14$ into the Poisson approximation above.

Edit: An alternative argument which produces a rigorous upper bound starting from the generating function is the following. Write $p(t) = \sum_{i=0}^{k-1} \frac{t^i}{i!}$. Applying the saddle point bound to the generating function gives

$$\frac{f_{n, k}}{n!} \le \frac{p(r)^{365}}{r^n}$$

for any $r > 0$. The optimal value of $r$ is obtained by setting the logarithmic derivative of the RHS to zero; this gives $365 \frac{p'(r)}{p(r)} - \frac{n}{r} = 0$, which is complicated, but if we're willing to approximate $\frac{p'(r)}{p(r)} \approx 1$ then this gives $r \approx \frac{n}{365}$ which is much easier to work with. We get a bound

$$\frac{f_{n, k}}{n!} \le \frac{\left( \sum_{i=0}^{k-1} \frac{ \left( \frac{n}{365} \right)^i}{i!} \right)^{365}}{ \left( \frac{n}{365} \right)^n }$$

and rearranging this, then using the crude Stirling inequality $n! \le en \left( \frac{n}{e} \right)^n$, we get a rigorous upper bound

$$\boxed{ \frac{f_{n, k}}{365^n} \le \frac{n}{e^{n-1}} \left( \sum_{i=0}^{k-1} \frac{n^i}{365^i i!} \right)^{365} }$$

which is very similar to the above bound, and we could improve it a bit by using a more accurate form of Stirling's approximation and a bit more by estimating $r$ more accurately, and a bit more by using a more precise version of the saddle point bound. In any case this bound gives, similar to the above,

$$\frac{f_{1000, 5}}{365^{1000}} \le \frac{1000}{e^{999}} \left( 1 + \frac{1000}{365} + \frac{1000^2}{365^2 2!} + \frac{1000^3}{365^3 3!} + \frac{1000^4}{365^4 4!} \right)^{365} \approx \boxed{ 9 \times 10^{-22} }.$$

$\endgroup$
-1
$\begingroup$

if $n>365$ then by pigeonhole principle at least 2 share the same birthday.

pigeonholes = 365, pigeons = $n$.

in fact if $n=365k+1$ then at least $k+1$ people share the same birthday.

for $n\leq 365$ use the probability that no one was born on the same day.

so we have:

$q=1/365*1/364*...*1/(365-n+1)$

then $p=1-q$ is the probability for at least 2 people sharing same birthday.

$\endgroup$
1
  • 1
    $\begingroup$ This is not an answer to the OP's question. $\endgroup$ Commented Oct 16, 2022 at 6:27

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .