0
$\begingroup$

Disclaimer: I've seen posts with good answers for the case of "at least $2$ people", or "exactly $2$ people". Posts with "at least $k$ people" usually suppose that the mutual birthday is a fixed day (i.e. Jan 2nd) and apply the binomial distribution. This post has to do with the probability of $2$ people sharing the same (random) birthday in a group of $n$ people.


Attempt:

The probability of at least $2$ people having the same birthday in a group of $n$ people is the complement of the probability of everyone having a different birthday. That is: $$ p(n,k\geq 2)=1-\frac{365 \cdot 364 \cdot \dots \cdot (365-n+1)}{365^n} $$ Now, let's suppose that we want to find $p(n \geq 3)$. By making the assumption that $2$ people have already the same birthday, we can treat these two as one person. So the probability of at least $2$ people sharing the same birthday in a group of $n-1$ people is: $$ p(n-1,\geq 2)=1-\frac{365 \cdot 364 \cdot \dots \cdot (365-n+2)}{365^{n-1}} $$ Then, I'm thinking of finding the probability $p(n,k=2)$ of exactly $2$ people sharing the same birthday in a group of $n$ people and somehow calculate: $$ p(n,k\geq 3)=p(n-1,k \geq 2 \bigg| n,k=2) $$ but I'm possibly mistaken. Any thoughts?

$\endgroup$
1
  • $\begingroup$ The reason people tend to not write it up is that it gets very messy for general $k,n$. Even for $3,n$, it's messy because you need to consider the possibility that no two share a birthday, exactly two share a birthday, exactly two pair of people share birthdays (though not the same birthday) and so on. Not a difficult computation on a machine, but hard to do with pencil and paper. $\endgroup$
    – lulu
    Commented Feb 11, 2019 at 16:58

2 Answers 2

1
$\begingroup$

As far as I can tell, your methods do not work and cannot be salvaged.

This is a hard problem that cannot be solved by conventional means. To find the probability that at least three people have a common birthday, let us instead compute the probability that no three people share a birthday. This is equal to $$ \frac{n![x^n](1+x+x^2/2)^{365}}{365^n}.\tag 1 $$ Here, $[x^n]f(x)$ is the coefficient of $x^n$ in the polynomial $f(x)$. You want one minus the above. Another way to write this is $$ \sum_{a_1,a_2,\dots,a_{365}}\frac1{365^n}\frac{n!}{a_1!a_2!\cdots a_{365}!}.\tag 2 $$ where the sum ranges over all vectors $(a_1,a_2,\dots,a_{365})$ of integers between $0$ and $2$ whose sum is $n$. This works because each vector specifies a valid distribution of birthdays where no birthday repeats three times or more, and the multinomial coefficient gives the number of ordered selections of birthdays which have that distribution, and each is then multiplied by the probability $(1/365)^n$ of that ordered selection occurring. You can check that when you expand out $(1)$ and collect the coefficient of $x^{n}$, you get exactly $(2)$.


If you are okay with an approximate result, the expected number of triplets of people who share a common birthday is $\lambda =\binom{n}{3}\cdot \frac1{365}^2$, and the number of triplets with a common birthday is approximately Poisson with parameter $\lambda$. Therefore, the probability some triplet share a birthday is approximately $$ 1-e^{-\binom{n}3/365^2}. $$

$\endgroup$
2
  • $\begingroup$ Could you elaborate on how did you derive the first result, aka the probability that no three people share a birthday? $\endgroup$
    – Paris
    Commented Feb 11, 2019 at 22:42
  • $\begingroup$ @LoneBone See edit. $\endgroup$ Commented Feb 11, 2019 at 23:50
1
$\begingroup$

While there's no simple formula for the exact expression given in Mike Earnest's equation (2), there are efficient ways of evaluating it for any given $\ n\ $, one of which has been implemented the packagepmultinom of the R statistical programming language. Here's an R program for comparing the exact probabilities of at least three people having a common birthday with the Poisson approximation for $ n=90\ $ to $\ 100\ $:

library(pmultinom)
us<-1:365
ps<-1:365
for (i in 1:365){
  us[i]<-2
  ps[i]<-365^(-1)
}
for(n in 90:100){
  a = 1-pmultinom(upper=us,size=n,probs=ps,method="exact")
  b=1-exp(-n*(n-1)*(n-2)/(6*365^2))
  v<-c(n,a,b)
  print(v)
}

Running it online here produces the following results: $$ \begin{matrix} n &\mbox{exact} & \mbox{Poisson approx.}\\ 90 & 0.5341956 & 0.5859698\\ 91 & 0.5456984 & 0.5982312\\ 92 & 0.5571482 & 0.6103927\\ 93 & 0.5685366 & 0.6224440\\ 94 & 0.5798554 & 0.6343752\\ 95 & 0.5910964 & 0.6461763\\ 96 & 0.6022517 & 0.6578381\\ 97 & 0.6133135 & 0.6693514\\ 98 & 0.6242745 & 0.6807075\\ 99 & 0.6351272 & 0.6918979\\ 100 & 0.6458645 & 0.7029148 \end{matrix} $$ So for this range of values the Poisson approximation is about $10\%$ too latge.

$\endgroup$

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .