First off, I'm not a mathematician, so if I'm going about this in the wrong way, please let me know.
The problem
I have ~4.1 million timestamps (with one-second resolution) recording when people logged into a website. They were collected over a period of ~50.5 million seconds. Some of these records are erroneous (one says that 7044 people logged in during the same second). Assuming logins are uniformly distributed over the period, find the smallest n for which the probability that at least n people logged in during the same second is < 50%.
Current approach
It seems the birthday problem could be adapted here. Given a "year" of 50.5M days and 4.1M people, the chance that no two people share a birthday is negligible.
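As a rough check of the "negligible" claim, here is a minimal Python sketch of the classic birthday calculation, done in log space so the result does not underflow. The function name `log_prob_all_distinct` is just an illustrative choice, and the inputs are the rounded figures from above.

```python
import math

def log_prob_all_distinct(people, days):
    """Natural log of the probability that `people` uniformly random
    birthdays in a year of `days` days are all different.

    P = days! / ((days - people)! * days**people), evaluated with lgamma
    so that huge factorials never have to be formed explicitly.
    """
    return (math.lgamma(days + 1)
            - math.lgamma(days - people + 1)
            - people * math.log(days))

# Rounded figures from the question: 4.1M logins, 50.5M seconds.
log_p = log_prob_all_distinct(4_100_000, 50_500_000)
print(f"ln P(no two logins share a second) = {log_p:.0f}")
```

The logarithm comes out on the order of -1.7e5, so the probability itself is far too small to represent as an ordinary floating-point number.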
I found this post with a formula for triplets (three simultaneous birthdays), but is there a way to generalize it for n simultaneous birthdays? I'm guessing it's not as simple as changing the 3 in the equation to an n.
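One possible way to handle general n, sketched below, is to swap the exact birthday formula for a Poisson approximation. This is not the formula from the linked post. It assumes the number of logins in each second is roughly Poisson with mean 4.1M / 50.5M and that seconds are independent of each other, which is only approximately true. All function names are hypothetical.

```python
import math

def poisson_tail(lam, n, terms=200):
    """P(X >= n) for X ~ Poisson(lam), summed directly from the pmf."""
    # First term exp(-lam) * lam**n / n!, computed in log space.
    term = math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
    total = 0.0
    for k in range(n, n + terms):
        total += term
        term *= lam / (k + 1)  # pmf(k + 1) = pmf(k) * lam / (k + 1)
    return total

def prob_some_second_has_at_least(n, logins, seconds):
    """Approximate probability that at least one second sees >= n logins."""
    lam = logins / seconds                     # mean logins per second
    expected = seconds * poisson_tail(lam, n)  # expected number of such seconds
    # Assumption: the count of such seconds is itself roughly Poisson.
    return -math.expm1(-expected)

def smallest_n_below(threshold, logins, seconds):
    """Smallest n whose approximate probability is below `threshold`."""
    n = 1
    while prob_some_second_has_at_least(n, logins, seconds) >= threshold:
        n += 1
    return n

print(smallest_n_below(0.5, 4_100_000, 50_500_000))
```

With the rounded figures from the question this approximation gives n = 6: five simultaneous logins are still more likely than not to occur somewhere in the period (about 75%), while six are unlikely (about 2%).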