Coming from Blitzstein's book:
In the birthday problem, we assumed that all 365 days of the year are equally likely (and excluded February 29). In reality, some days are slightly more likely as birthdays than others. For example, scientists have long struggled to understand why more babies are born 9 months after a holiday. Let $\textbf{p}= (p_1, p_2, ..., p_{365})$ be the vector of birthday probabilities, with $p_j$ the probability of being born on the $j$th day of the year (February 29 is still excluded, with no offense intended to Leap Dayers). The $k$th elementary symmetric polynomial in the variables $x_1,..., x_n$ is defined by
$e_k(x_1, ..., x_n) = \sum_{1 \leq j_1 < j_n < ... < j_k \leq n}{x_{j_i} ... x_{j_k}}$.
This just says to add up all of the $n \choose k$ terms we can get by choosing and multiplying $k$ of the variables. For example, $e_1(x_1, x_2, x_3) = x_1 + x_2 + x_3$ and $e_2(x_1, x_2, x_3) = x_1x_2 + x_1x_3 + x_2x_3$.
Now let $k \geq 2$ be the number of people.
a) Find a simple expression for the probability there is at least one birthday match, in terms of $\textbf{p}$ and an elementary symmetric polynomial.
I am confused how to do this problem; I've tried a couple approaches. Let $x_j$ be the the day of the year ($x_1$ = January 1, $x_{364}$ = December 30th, etc..).
- PIE
$P$(at least one match) = $P$(match on $x_1 \cup$ match on $x_2 \cup ... \cup $ match on $x_{365}$.
Now if we look at this with two people, is is a simple: $\textbf{p} \cdot \textbf{p}$ because person 1 can have each birthday $x_i$ with $P(x_i) = p_i$ and for there to be a match person 2 must also have the birthday on $x_i$ . Each event is disjoint, so we have $\textbf{p} \cdot \textbf{p}$.
With 3 people, it becomes a little more confusing with $P$(match on $x_i$) = ${3 \choose 2} p_i^2(1-p_i)^1 + {3\choose3} (p_i)^3$, which I arrived to by saying a birthday on $x_i$ is a success with $P(x_i) = p_i$ and continued as a binomial distribution for 2 or 3 successes.
Note: This can easily be extended that given n $n$ people, the probability 2 or more share a birthday on $x_i$ is the sum from $j=2$ to $n$ successes in a binomial distribution: $\sum_{j=2}^{n} {n \choose j}p_i^j(1-p_i)^{n-j}$
So for 3 people, we still have disjoint events and we have
$\textbf{p}_{k=3} = {{3 \choose 2} p_i^2(1-p_i)^1 + {3\choose3} (p_i)^3}_{1 \leq i \leq 365}$ and the probability of intersecting birthdays is $\textbf{p}_{k=3} \cdot \textbf{p}_{k=3}$.
Once we get to four people, the work becomes really messy as PIE has to start getting used, and honestly my math is probably either A) wrong or B) messy. So, I decided to not take PIE approach.
- Counting compliment
Again, with 2 people it is the trivial $1 - (\textbf{p} \cdot (\textbf{1 - p}))$
Immediately, with 3 people and more I realized there is a brutal tree structure where we effectively turn everything into cases, which definitely seemed like the wrong approach.
Could someone help me with what the approach might be for this question? The other post related to this did not help me very much. Thank you in advance!