$\begingroup$

I haven't done probabilities like this in a while, and I can't seem to find a convincing answer.

Basically, I'd like to know the probability of having randomly picked (at least once each) 800 different objects out of a 1000-object list after N attempts.

  • The objects are all different.
  • Once picked, the objects go back in the list (all picks are independent)

I have found this formula, but it doesn't seem to work in Excel: First formula on this page

I would appreciate any help and an Excel implementation!

Thanks!

PS: Sorry, I do not know the exact English terms for this problem.

$\endgroup$
  • $\begingroup$ This is related to the Coupon Collector's Problem. Each ball that was picked at least once corresponds to one type of coupon that the coupon collector has collected. There is an explicit formula for your probability in the answer to math.stackexchange.com/questions/379525/… $\endgroup$
    – David K
    Commented Mar 31, 2017 at 20:31
  • $\begingroup$ Are you picking with replacement or without replacement? Observe that $K\ge 0.8 M$, otherwise the probability is zero. $\endgroup$
    – Masacroso
    Commented Mar 31, 2017 at 20:31
  • $\begingroup$ Looks like selection with replacement to me. Getting this into Excel looks like it might be difficult, however. $\endgroup$
    – David K
    Commented Mar 31, 2017 at 20:33
  • $\begingroup$ If you must do this in Excel, I recommend devoting an entire sheet to just calculate this one probability. That way you can sum up a large number of terms by computing them in separate cells and then using =SUM(range). $\endgroup$
    – David K
    Commented Mar 31, 2017 at 20:38
  • $\begingroup$ I actually tried this, but I must be doing something wrong. Take a simpler example: trying to pick at least 15 items out of 50 in 200 tries (the probability should be pretty high!). The $(i/m)^{200}$ number is always tiny, and summing 15 of those terms doesn't make it any bigger. So it doesn't add up. $\endgroup$
    – Malcoolm
    Commented Apr 1, 2017 at 1:47

1 Answer

$\begingroup$

In Excel I would make the columns represent the number of different items seen and the rows represent the number of draws. Each cell will have the probability of that number of items seen given the number of draws. You start with $1$ in the cell for one draw and one item and $0$ for all other numbers of items.

In each cell below the first row you have (up)$\cdot\frac {seen}{1000}$ + (up-left)$\cdot\frac {1001-seen}{1000}$. You can get seen with a mixed reference to the top of the column (column relative, row fixed, such as `B$1`). The idea is that with one more draw you get an old item with probability $\frac {seen}{1000}$, which keeps you in the same column. The up-left cell has only $seen-1$ items, so a new item arrives there with probability $\frac {1000-(seen-1)}{1000}=\frac {1001-seen}{1000}$. The seen=1 column does not have the up-left term. Copy right/copy down and your spreadsheet is full. Then, in the row for $N$ draws, sum the cells from $800$ seen to $1000$ seen to get the probability of having seen at least $800$.

Modeling it as a Poisson distribution, you should start to have a good chance of seeing $800$ different items around $-1000\log(0.2)\approx 1600$ draws (natural log). We want to get the chance that a particular object has not been seen down to $0.2$, and the Poisson distribution says that means $e^{-\lambda}=0.2$, where $\lambda$ is the expected number of times the item is seen, or draws/1000.
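For anyone who wants to check the spreadsheet against something else, here is a minimal Python sketch of the same row-by-row recurrence, plus the Poisson estimate of the number of draws. The function names (`distinct_count_distribution`, `prob_at_least`, `poisson_draw_estimate`) are made up for illustration and are not from any library. It assumes draws with replacement, all items equally likely, and at least one draw.

```python
import math


def distinct_count_distribution(num_items: int, num_draws: int) -> list[float]:
    """probs[k] = P(exactly k distinct items seen after num_draws draws).

    Assumes uniform draws with replacement and num_draws >= 1.
    """
    # Row for one draw: exactly one distinct item has been seen.
    probs = [0.0] * (num_items + 1)
    probs[1] = 1.0
    for _ in range(num_draws - 1):
        nxt = [0.0] * (num_items + 1)
        for k in range(1, num_items + 1):
            # "up" term: k items already seen, the new draw repeats one of them.
            stay = probs[k] * k / num_items
            # "up-left" term: k-1 items seen, the new draw is one of the
            # num_items - (k - 1) items not seen yet.
            move = probs[k - 1] * (num_items - (k - 1)) / num_items
            nxt[k] = stay + move
        probs = nxt
    return probs


def prob_at_least(num_items: int, target: int, num_draws: int) -> float:
    """P(at least `target` distinct items seen after num_draws draws)."""
    return sum(distinct_count_distribution(num_items, num_draws)[target:])


def poisson_draw_estimate(num_items: int, fraction_seen: float) -> float:
    """Draws at which a given item is unseen with probability 1 - fraction_seen,
    using exp(-draws / num_items) for the chance an item is never drawn."""
    return -num_items * math.log(1.0 - fraction_seen)


if __name__ == "__main__":
    # The smaller example from the comments: at least 15 of 50 in 200 tries.
    print(prob_at_least(50, 15, 200))
    # Rough number of draws for 800 of 1000, per the Poisson argument.
    print(poisson_draw_estimate(1000, 0.8))
```

For the original question you would call `prob_at_least(1000, 800, N)` for your value of N. Each call recomputes every row from the first draw, so if you want a whole range of N it is cheaper to keep the intermediate rows, which is exactly what the spreadsheet layout does.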

$\endgroup$
