$\begingroup$

Here's another formulation of the birthday problem: given $n$ people and $m$ days, how can one calculate the expected number of people having a birthday on a single collision day, i.e., a day on which two or more people have birthdays?

UPD: to clarify: given a date, which is a birthday of at least two people (= collision date), what is the expected number of people who share this birthdate?

$\endgroup$
  • $\begingroup$ It is not clear what number you want the expected value of. Do you mean the maximum number of people having the same birthday, the maximum being taken over all possible birthdays? The expected number for a randomly chosen day is easy to compute (and fairly small in typical examples): it is the number of people divided by the number of days. If you only want to consider collision days (so an expected value would be expected to be at least $2$), then it is a problem that such days are not guaranteed to exist. $\endgroup$ Commented May 19, 2017 at 16:36
  • $\begingroup$ @mck Are you asking for the average number of people who share a birthday (could be any day)? As in, if we tallied up the birthday frequencies, what is the average frequency among days that have $>1$ people? $\endgroup$ Commented May 19, 2017 at 16:39
  • $\begingroup$ no, I'm looking for expected number of people involved in a birthday "collision". I.e., given a date, which is a birthday of at least two people (= collision date), what is the expected number of people who share this birthdate? $\endgroup$
    – kck
    Commented May 19, 2017 at 16:40
  • $\begingroup$ @MarcusStuhr yep! $\endgroup$
    – kck
    Commented May 19, 2017 at 16:41
  • $\begingroup$ @mck As I said, the fact that such days do not always exist makes it problematic to assign a contribution to the expected value from such configurations. You could contribute the value $0$, or maybe $1$ (since some days do have $1$ person), but it is rather arbitrary to do that. Simply ignoring a part of your probability space is weird for expected values. $\endgroup$ Commented May 19, 2017 at 16:42

1 Answer

$\begingroup$

Using results from an earlier related question and answer:

  • The expected number of people who share a birthday with somebody else is $n-n\left(1-\frac1m\right)^{n-1}$

  • The expected number of days where two or more people have birthdays is $m - m \left(1-\frac1m\right)^n - n \left(1-\frac1m\right)^{n-1}$

and so one candidate answer is the first divided by the second:

$$\frac{n-n\left(1-\frac1m\right)^{n-1} }{ m - m \left(1-\frac1m\right)^n - n \left(1-\frac1m\right)^{n-1} }$$

though note that this is a ratio of expectations rather than the expectation of a ratio, so it gives greater weight to outcomes in which people are spread over more shared birthdays.
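As a quick check, the ratio above can be evaluated exactly with rational arithmetic. This is only a minimal sketch: the function name `ratio_of_expectations` is made up for illustration, and it assumes $n \ge 2$ (for $n = 1$ the denominator is zero, since no collision day can exist).

```python
from fractions import Fraction


def ratio_of_expectations(n, m):
    """Expected number of people sharing a birthday, divided by the
    expected number of days with two or more birthdays.

    Hypothetical helper; assumes n >= 2 and m >= 1.
    """
    q = Fraction(m - 1, m)  # chance that a given person misses a given day
    expected_people = n - n * q ** (n - 1)
    expected_days = m - m * q ** n - n * q ** (n - 1)
    return expected_people / expected_days


print(ratio_of_expectations(4, 2))
```

Using `Fraction` keeps the result exact, so it can be compared directly with a hand count for small $n$ and $m$.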

As an example, with $m=2$ and $n=4$ this gives $\frac{28}{11}$. If you consider the sixteen equally probable distributions of birthdays for four people among two days

Day1  Day2
ABCD  - 
ABC   D 
ABD   C
AB    CD
ACD   B
AC    BD
AD    BC
A     BCD
BCD   A
BC    AD
BD    AC
B     ACD
CD    AB
C     ABD
D     ABC
-     ABCD

there are $2$ cases of four people sharing a birthday, $8$ cases of three people sharing a birthday, $12$ cases of two people sharing (as well as $8$ of one and $2$ of zero) making the average number of people per shared birthday $\frac{2\times 4+8 \times3 +12\times 2}{2+8+12}=\frac{28}{11}$ as predicted.

A different approach would first average within each of the sixteen equally probable outcomes and then average across outcomes: there are $2$ outcomes where the average is $4$, $8$ outcomes where the average is $3$ and $6$ outcomes where the average is $2$, giving a result of $\frac{2\times 4+8 \times3 +6\times 2}{2+8+6}=\frac{11}{4}$.
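Both ways of averaging can be checked by brute-force enumeration for small $n$ and $m$. The sketch below is illustrative only: the function name `collision_averages` is invented here, it assumes $n \ge 2$, and it skips outcomes with no collision day when averaging per outcome, which is one possible convention rather than a settled one.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def collision_averages(n, m):
    """Return (pooled, per_outcome) averages of people per collision day.

    Hypothetical helper; enumerates all m**n equally likely outcomes.
    """
    total_people = 0   # people on collision days, summed over all outcomes
    total_days = 0     # collision days, summed over all outcomes
    per_outcome = []   # average collision-day size within each outcome
    for birthdays in product(range(m), repeat=n):
        sizes = [c for c in Counter(birthdays).values() if c >= 2]
        total_people += sum(sizes)
        total_days += len(sizes)
        if sizes:
            # Assumption: outcomes without any collision day are ignored.
            per_outcome.append(Fraction(sum(sizes), len(sizes)))
    pooled = Fraction(total_people, total_days)
    averaged = sum(per_outcome) / len(per_outcome)
    return pooled, averaged


pooled, averaged = collision_averages(4, 2)
print(pooled, averaged)
```

The first value pools all collision days across outcomes; the second averages within each outcome first, so the two generally differ.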

$\endgroup$
