76
$\begingroup$

I have been looking at the birthday problem (http://en.wikipedia.org/wiki/Birthday_problem) and I am trying to figure out the probability that 3 people (instead of 2) share a birthday in a room of 30 people.

I thought I understood the problem but I guess not since I have no idea how to do it with 3.

$\endgroup$
4
  • 2
    $\begingroup$ Do we completely disregard the fact that people are more likely to be born on certain months than others? Making this slightly more likely? $\endgroup$
    – Justin
    Commented Mar 9, 2011 at 4:27
  • 4
    $\begingroup$ @Fdart17: In Exercise 13.7 of The Cauchy-Schwarz Master Class, J. Michael Steele uses Schur convexity to show that uniform probabilities are least likely to give birthday matches. So you are right, non-uniform birthdays give us a better chance of a match. $\endgroup$
    – user940
    Commented Mar 9, 2011 at 16:32
  • $\begingroup$ Does the problem get simpler if you only want the probability that at least three people have the same birthday? Does anyone have a solution for this problem? $\endgroup$
    – user59238
    Commented Jan 22, 2013 at 15:54
  • $\begingroup$ @user59238 : see math.stackexchange.com/questions/1544460/… $\endgroup$
    – Watson
    Commented Mar 1 at 19:18

9 Answers 9

73
$\begingroup$

The birthday problem for pairs is quite easy because the probability of the complementary event "all birthdays distinct" is straightforward to find. For triples, the complementary event includes "all birthdays distinct", "one pair and the rest distinct", "two pairs and the rest distinct", etc. Finding the exact value is pretty complicated.

The Poisson approximation is pretty good, though. Imagine checking every triple and calling it a "success" if all three have the same birthdays. The total number of successes is approximately Poisson with mean value ${30 \choose 3}/365^2$. Here $30\choose 3$ is the number of triples, and $1/365^2$ is the chance that any particular triple is a success. The probability of getting at least one success is obtained from the Poisson distribution: $$ P(\mbox{ at least one triple birthday with 30 people})\approx 1-\exp(-{30 \choose 3}/365^2)=.0300. $$

You can modify this formula for other values, changing either 30 or 3. For instance, $$ P(\mbox{ at least one triple birthday with 100 people})\approx 1-\exp(-{100 \choose 3}/365^2)=.7029,$$ $$ P(\mbox{ at least one double birthday with 25 people })\approx 1-\exp(-{25 \choose 2}/365)=.5604.$$
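As a quick numerical check of the three approximations above, here is a minimal Python sketch. The function name `poisson_match_probability` and its parameters are illustrative choices, not part of the original answer; it simply evaluates $1-\exp(-\binom{n}{k}/365^{k-1})$.

```python
from math import comb, exp


def poisson_match_probability(people, group_size, days=365):
    """Poisson estimate of P(at least one group of `group_size` people shares a birthday)."""
    # Expected number of matching groups: C(people, group_size) / days^(group_size - 1).
    mean = comb(people, group_size) / days ** (group_size - 1)
    # For a Poisson count with this mean, P(at least one success) = 1 - exp(-mean).
    return 1 - exp(-mean)


print(f"{poisson_match_probability(30, 3):.4f}")   # 0.0300
print(f"{poisson_match_probability(100, 3):.4f}")  # 0.7029
print(f"{poisson_match_probability(25, 2):.4f}")   # 0.5604
```

Changing `people` or `group_size` corresponds to changing the 30 or the 3 in the formula.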

Poisson approximation is very useful in probability, not only for birthday problems!

$\endgroup$
15
  • 1
    $\begingroup$ " and 1/365^2 is the chance that any particular triple is a success." - Hmm, I don't understand that part. Why squared? $\endgroup$
    – irl_irl
    Commented Mar 9, 2011 at 16:22
  • 20
    $\begingroup$ Take the three random people one at a time. The first guy has some birthday, say March 9. The chance that the second guy has the same birthday is 1/365, and the chance that the third guy has the same birthday is also 1/365. Multiplying these gives 1/365^2. $\endgroup$
    – user940
    Commented Mar 9, 2011 at 16:25
  • 1
    $\begingroup$ Poisson approximation is good but your answer has a bug. You need to do 1 - u * exp(-u) instead of 1 - exp(-u). Remember Poisson distribution is P(k) = u * exp(-u) / k! Here we take k=1 as we are interested in finding just one pair. Making this change gives answer 0.6394, pretty close to true value. $\endgroup$ Commented Jul 26, 2015 at 6:43
  • 2
    $\begingroup$ You are over-counting because triples of birthdays are positively correlated and attract each other. How much do they attract? Given a birthday "B" triple, each of the other $n-3$ people would form $3$ more triplets if their birthday was $B$. Divide by $2$ to account for each pair of influences being counted twice to get $(100-3)p/2$ expected affected pairs. Using $\frac {p^2} {1+\frac{(100-3)p}2}$ instead of $p^2$ in Poisson formula, we get $P\approx 0.5801$. But this approximation undercounts since clumps of people sharing a birthday repel. Average the two to get 0.6415 for a $<0.7\%$ error. $\endgroup$
    – A.S.
    Commented Dec 13, 2015 at 8:37
  • 1
    $\begingroup$ (late comment) probabilities are not independent when selecting triplets (N Choose K for K = 3) if you've randomly gone through half the triplets and had no matches, any triplet composed of a pair that didn't match and a third element will also not triple match @A.S. $\endgroup$
    – vish
    Commented Jul 13, 2021 at 3:26
37
$\begingroup$

An exact formula can be found in Anirban DasGupta, The matching, birthday and the strong birthday problem: a contemporary review, Journal of Statistical Planning and Inference 130 (2005), 377-389. This paper claims that if $W$ is the number of triplets of people having the same birthday, $m$ is the number of days in the year, and $n$ is the number of people, then

$$ P(W \ge 1) = 1 - \sum_{i=0}^{\lfloor n/2 \rfloor} {m! n! \over i! (n-2i)! (m-n+i)! 2^i m^n} $$

No derivation or source is given; I think the idea is that the term corresponding to $i$ is the probability that there are $i$ birthdays shared by 2 people each and $n-2i$ birthdays with one person each.

In particular, if $m = 365, n = 30$ this formula gives $0.0285$, not far from Byron's approximation.
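The sum can be evaluated exactly with rational arithmetic. The following is a rough sketch, not taken from the paper: the function name `triple_birthday_probability` is made up for illustration, and each term is written in the equivalent binomial form $\binom{m}{i}\binom{m-i}{n-2i}\,n!/(2^i m^n)$ so that impossible configurations contribute zero automatically.

```python
from fractions import Fraction
from math import comb, factorial


def triple_birthday_probability(n, m=365):
    """Exact P(W >= 1): at least three of n people share one of m equally likely birthdays."""
    no_triple = Fraction(0)
    # i = number of days shared by exactly two people; the other n - 2i people
    # have birthdays that nobody else shares. Capping i at m keeps comb() valid.
    for i in range(min(n // 2, m) + 1):
        ways = comb(m, i) * comb(m - i, n - 2 * i) * factorial(n)
        no_triple += Fraction(ways, 2 ** i * m ** n)
    return 1 - no_triple


print(float(triple_birthday_probability(30)))  # approximately 0.0285
```

For a tiny case that can be checked by hand, `triple_birthday_probability(3, 7)` returns `Fraction(1, 49)`, the chance that three people were all born on the same day of the week.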

$\endgroup$
4
  • 4
    $\begingroup$ Here is a good approximation based on two Poissons - one that overcounts and one that undercounts: $$P(W=0)=\frac 1 2\left(\exp\left(-T\right)+\exp\left(-\frac T{1+\frac{3(n-3)}{2m}} \right)\right)$$ where $T=(\frac 1 m)^2\binom n 3$ is expected number of triplets sharing a birthday and the second exponent is expected number of birth-days that are shared by at least $3$ people. The principle of averaging probabilities of $P(T)$ and $P(T/(1+\alpha))$ can be extended to other events, not just $W=0$. Yields $\approx 0.028537$ in this case. Change $3$ to $M$ for $M-plets$ of birthdays. $\endgroup$
    – A.S.
    Commented Dec 13, 2015 at 16:54
  • $\begingroup$ Derivation of the formula: $$P(W \ge 1) = 1 - \sum_{i=0}^{\lfloor n/2 \rfloor} { {m \choose i} {m - i \choose n - 2i} n! \over 2^i m^n}$$ $\endgroup$
    – Nimyz
    Commented Jan 24, 2021 at 15:35
  • $\begingroup$ There are ${m \choose i}$ ways to choose the "shared by 2" birthdays and ${m - i \choose n - 2i}$ ways to choose the "single" birthdays. And then we can still choose the orderings in ${ n! \over 2^i }$ ways (the factor of $2^i$ is to avoid double counting the permutations of the shared birthday tuples). This has to be divided by the total amount of possibilities $m^n$. $\endgroup$
    – Nimyz
    Commented Jan 24, 2021 at 15:43
  • 1
    $\begingroup$ For anyone wanting to calculate the probability for a given n, like n = 50, you can change n here and run: wolframalpha.com/input/… $\endgroup$ Commented Nov 25, 2021 at 20:43
8
$\begingroup$

As pointed out by Michael Lugo, the formula given by Anirban DasGupta is an exact answer to this problem; however, a formal proof is needed. I have found and verified a solution by Doctor Rick from Math Forum; the link is below.

Link

His approach is to partition the sample space as follows:

   1.   none share a birthday
   2.   one pair shares a birthday
   3.   two pairs share different birthdays
   4.   three pairs share different birthdays
   :
 1+N/2. N/2 pairs share different birthdays
 2+N/2. three or more share a birthday

He then points out a clever way to count each part of the partition by picking a different birthday for each pair of people. I tried this and arrived at the same formula as Anirban DasGupta's. For more detail, please take a look at the link above!

$\endgroup$
0
7
$\begingroup$

I'm a bit skeptical of this answer. Here is a formula that works.

It probably helps to explain Dasgupta's formula from Michael Lugo's answer first.

Say that a map $f : [m]\to [n]$ is $k$-almost injective if $|f^{-1}(j)|\le k$ for all $j\in [n]$. Counting injective maps is easy, there are

$$ I(1,m,n) := m!\binom{n}{m} $$

of them. You just pick the range and then a bijection to it. This gives right away the standard birthday collision probability for $m$ people and years of length $n$

$$ 1 - n^{-m}I(1,m,n) $$

One gets the generalized birthday probability from $I(k,m,n)$ in the same way, so we can just think about $I(k,m,n)$.

How would we go about counting $2$-injective maps? The same idea as before works. This time, we pick $c$ pairs that will have colliding images, injectively map these into $[n]$, then injectively map the rest to a set of size $n-c$. So we get

$$ I(2,m,n) = \sum_{c=0}^{\lfloor m/2\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - 2j}{2}\right) I(1,c,n)I(1,m-2c,n-c) $$

This is equivalent to Dasgupta's formula, but it is easier to see the induction.

If we want to get $I(k,m,n)$ in general, we have

$$ I(k,m,n) = \sum_{c=0}^{\lfloor m/k\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - kj}{k}\right) I(1,c,n)I(k-1,m-kc,n-c) $$

$\endgroup$
3
$\begingroup$

My own research led to the following result...

Knowing that there are $A$ days in the year (typically $A=365$), the probability $P(A, M, n)$ that at least $n$ children have their birthday the same day within a class of $M$ children is:

$$\boxed{ P(A, M,n) = 1 - \dfrac{ K_n(A, M) }{ A^M } }$$

where $K_n(A, M)$ represents the number of configurations in which one cannot find $n$ children (or more) having their birthday the same day, and can be computed by recurrence as follows:

$$\forall n\ge 2, \quad \boxed{ K_{n+1}(A, M) = \sum_{0\le k\le \left\lfloor{\frac{M}{n}}\right\rfloor} \dfrac{ \binom{A}{k} \; (M)_{nk} \; K_n(A-k, M-nk)}{ (n!)^k} }$$

with the following initialization: $\boxed{ K_2(A,M)=(A)_M }$

and where $(n)_k$ stands for the decreasing factorial : $(n)_k = n(n-1)...(n-k+1)$
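The recurrence translates fairly directly into a memoized function. This is only a sketch under my reading of the formulas above: the names `count_below` and `share_probability` are invented for illustration, and the recurrence for $K_{n+1}$ is applied with the group size $n-1$ when computing $K_n$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial, perm


@lru_cache(maxsize=None)
def count_below(n, days, children):
    """K_n(A, M): configurations in which no day is the birthday of n or more children."""
    if n == 2:
        # K_2(A, M) = (A)_M, the decreasing factorial; perm() gives 0 when M > A.
        return perm(days, children)
    size = n - 1  # groups of exactly n - 1 children on the k selected days
    total = 0
    for k in range(min(children // size, days) + 1):
        # (M)_{size*k} / (size!)^k is always an integer, so // is exact here.
        groupings = perm(children, size * k) // factorial(size) ** k
        total += comb(days, k) * groupings * count_below(n - 1, days - k, children - size * k)
    return total


def share_probability(days, children, n):
    """P(A, M, n) = 1 - K_n(A, M) / A^M, returned as an exact fraction."""
    return 1 - Fraction(count_below(n, days, children), days ** children)


for n in (2, 3, 4):
    print(n, float(share_probability(365, 30, n)))
```

Working with `Fraction` avoids rounding problems, since $A^M$ is far too large for exact floating-point arithmetic.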

Numerically:

Within a class of $M=30$ children, knowing that the year counts $A=365$ days...

The probability that at least $n=2$ children have their birthday the same day is:

$$P(365,30,2)\simeq 70.6\%$$

The probability that at least $n=3$ children have their birthday the same day is:

$$P(365,30,3)\simeq 2.85\%$$

The probability that at least $n=4$ children have their birthday the same day is:

$$P(365,30,4)\simeq 0.0532\%$$

Nicolas

$\endgroup$
2
$\begingroup$

Just like to point out that Trazom's answer is incorrect for the general case - the sets being counted in the outer sum overlap. I don't have enough reputation to comment. I wrote a blog post about the general case here : https://swarbrickjones.wordpress.com/2016/05/08/the-birthday-problem-ii-three-people-or-more/

$\endgroup$
0
$\begingroup$

There is another approximation based on a Poisson distribution. I was using this method in the 1990s (I have found my implementation in Java), but unfortunately do not remember where I first read about it. It is also described in https://stats.stackexchange.com/questions/1308/extending-the-birthday-paradox-to-more-than-2-people.

Let $b$ be the number of days in the year, $n$ the number of people in the room, and suppose you want to know the probability that at least $k + 1$ people share a birthday. (In the original question above, $b = 365$ and $k = 2.$)

Let $X_i$ be the number of people who have birthdays on day number $i$ for $1 \leq i \leq b.$ We approximate the joint distribution of $(X_1, \ldots, X_b)$ by assuming that each $X_i$ is a Poisson variable with expected value $n/b,$ and assuming that these variables are all independent.

The probability that no $k + 1$ people all have their birthdays on day $i$ is $P(X_i \leq k) = F(k),$ where $F$ is the CDF of $\mathrm{Poisson}(n/b).$ The probability that this happens on all $b$ days of the year, that is, there is no day on which more than $k$ people have a birthday, is $(F(k))^b.$

The estimated probability that there is a day when $k+1$ or more people have a birthday is therefore $1 - (F(k))^b.$

Setting $b=365$ and trying this out on a few known cases from https://oeis.org/A014088:

\begin{array}{ccccc} k & n & n/b & F(k) & (F(k))^b \\ 1 & 22 & 0.0602740 & 0.99825489 & 0.5286 \\ 1 & 23 & 0.0630137 & 0.99809610 & 0.4988 \\ 2 & 87 & 0.2383562 & 0.99811045 & 0.5014 \\ 2 & 88 & 0.2410959 & 0.99804850 & 0.4902 \\ 3 & 186 & 0.5095890 & 0.99812428 & 0.5039 \\ 3 & 187 & 0.5123288 & 0.99808774 & 0.4973 \\ 4 & 312 & 0.8547945 & 0.99812038 & 0.5032 \\ 4 & 313 & 0.8575342 & 0.99809433 & 0.4985 \\ \end{array}

This predicts (correctly) that you need $23$ people to have at least a $50\%$ chance of at least one set of two people with the same birthday, $88$ people to have at least a $50\%$ chance of at least one set of three people with the same birthday, $187$ people to have at least a $50\%$ chance of at least one set of four people with the same birthday, $313$ people to have at least a $50\%$ chance of at least one set of five people with the same birthday.

As another example, suppose you have $100$ people in the room. Then this approximation gives $(F(2))^{365} \approx 0.3600$, and therefore the probability of three or more people all with the same birthday is approximately $0.6400.$ Wolfram Alpha gives the probability as $0.6459$. Contrast this with the accepted answer, which estimates the probability at $0.7029.$

On the other hand, for small numbers of people in the room, this method overestimates the chance of $k+1$ people with the same birthday. For example, for the probability of three or more people with the same birthday out of $30$ people, this method gives $0.0313$, the accepted answer gives $0.0300,$ and Wolfram Alpha gives $0.0285.$

The individual $X_i$ are not actually Poisson and of course they are not actually independent. That is what makes this an approximation rather than an exact method.
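For readers who want to reproduce these numbers, here is a small Python sketch of the estimate $1 - (F(k))^b$. It is not the Java implementation mentioned above; the function names `poisson_cdf` and `shared_birthday_estimate` are illustrative.

```python
from math import exp, factorial


def poisson_cdf(k, mean):
    """F(k) = P(X <= k) for X ~ Poisson(mean)."""
    return exp(-mean) * sum(mean ** j / factorial(j) for j in range(k + 1))


def shared_birthday_estimate(n, k, b=365):
    """Estimate P(some day is the birthday of at least k + 1 of the n people)."""
    # Treat the b daily counts as independent Poisson(n / b) variables.
    return 1 - poisson_cdf(k, n / b) ** b


print(f"{shared_birthday_estimate(30, 2):.4f}")   # about 0.0313
print(f"{shared_birthday_estimate(100, 2):.4f}")  # about 0.6400
```

Note that `k` here follows the convention of this answer: `k = 2` asks about three or more people sharing a birthday.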

$\endgroup$
-2
$\begingroup$

For anyone looking into the generalized birthday problem, i.e. how many people are required so that M of them share the same birthday with a certain probability:
this link explains various methods for calculating probabilities in the generalized birthday problem.

http://mathworld.wolfram.com/BirthdayProblem.html Also, this paper discusses the various kinds of coincidences we face in life and is an interesting read: https://www.stat.berkeley.edu/~aldous/157/Papers/diaconis_mosteller.pdf

$\endgroup$
1
  • $\begingroup$ "Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline." $\endgroup$ Commented Sep 10, 2017 at 5:35
-3
$\begingroup$

I am looking at this question and the complicated answers, and they are confusing me. Suppose I want to find, in a group of 100 people, the probability that at least 3 people share a birthday. So I start from the very basics: if there are 3 people, the probability of them sharing a birthday is $$\frac{1}{365} *\frac{1}{365}*\frac{1}{365}*(365)=\frac{1}{(365)^2}$$ that is, 1/365 (the probability of one person having a birthday on a particular day) multiplied together for each person, then multiplied by the total number of days.
Similarly, if there are 4 people, the probability of at least 3 of them sharing a birthday would be $\frac{1}{(365)^2}*^4C_3 $, and similarly for $X$ people, $\frac{1}{(365)^2}*^XC_3 $.

I am trying to find the fault in this logic.

$\endgroup$
3
  • 2
    $\begingroup$ Welcome to Math.SE! You have posted this as an Answer, but it really seems to be more of a new Question. If you wanted to understand why your computation is wrong, one approach would be trying to apply it to a simplified problem, such as "What is the probability that three people share the day of the week on which they were born?" But don't post as an Answer that the other answers are confusing to you. $\endgroup$
    – hardmath
    Commented Apr 8, 2016 at 18:50
  • $\begingroup$ No, what I meant is that I feel this is the answer, but I am not a 100% sure of it. It is just my attempt of answering the question but i would welcome critique on it. $\endgroup$
    – yavvee
    Commented Apr 8, 2016 at 18:56
  • 3
    $\begingroup$ Let me briefly elaborate on how you can verify your approach is wrong. Note that a probability should always fall between $0$ and $1$, but as $X$ grows, your expression exceeds $1$. This is easily seen when we consider the days of the week simplification. The chance that three people are born on the same day of the week is indeed $\frac{1}{7^2}$, but the chance that three out of eight people share a day of week for birth cannot be $\frac{\binom{8}{3}}{7^2}$ because that exceeds $1$. $\endgroup$
    – hardmath
    Commented Apr 10, 2016 at 0:22
