1
$\begingroup$

According to Wikipedia: https://en.wikipedia.org/wiki/Negative_binomial_distribution

In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of failures (denoted r) occurs.

I see that the Negative Binomial distribution is often used to model count data, especially in the insurance industry. However, I don't see why it should be used, given that it models the number of successes before a given number of failures occurs. For example, it is used to model the number of catastrophic events happening in one year, and I don't see what that has to do with the "number of successes before some failures".

Could you please explain why we use the Negative Binomial distribution to model count data, even when the concept of "number of successes before $r$ failures" doesn't apply?

Thank you very much for your help!

$\endgroup$
1
  • 1
    $\begingroup$ There is more than one formulation of the negative binomial, but in the case you quote and in one other, it takes values in the non-negative integers and so can easily represent a count. So too can the geometric and Poisson distributions, but they have a fixed relationship between the mean and the variance, while the negative binomial has more flexibility, which may be useful for some modelling. $\endgroup$
    – Henry
    Commented Jan 23, 2022 at 17:40

1 Answer

1
$\begingroup$

A better motivation for using the negative binomial for count data is that it is a gamma mixture of Poisson random variables. To see this, suppose that $y|\lambda \sim \text{Poisson}(\lambda)$, i.e. given a fixed value of $\lambda$, $y$ is Poisson distributed. Further, assume that $\lambda \sim \text{Gamma}(a,b)$. Then you can show that the marginal distribution of $y$ is negative binomial by evaluating the integral:

$$ p(y) = \int p(y|\lambda) p(\lambda) d\lambda $$
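
As a sketch of how that integral works out, assume $\text{Gamma}(a,b)$ means shape $a$ and rate $b$ (the answer does not say which parametrization is intended, so this is an assumption). Then:

$$
\begin{aligned}
p(y) &= \int_0^\infty \frac{\lambda^{y} e^{-\lambda}}{y!} \cdot \frac{b^{a}}{\Gamma(a)} \lambda^{a-1} e^{-b\lambda} \, d\lambda \\
&= \frac{b^{a}}{y!\,\Gamma(a)} \int_0^\infty \lambda^{y+a-1} e^{-(b+1)\lambda} \, d\lambda \\
&= \frac{\Gamma(y+a)}{y!\,\Gamma(a)} \left(\frac{b}{b+1}\right)^{a} \left(\frac{1}{b+1}\right)^{y}
\end{aligned}
$$

When $a$ is a positive integer, this matches the "successes before $r$ failures" form quoted in the question, with $r = a$ and success probability $p = 1/(b+1)$, since $\frac{\Gamma(y+a)}{y!\,\Gamma(a)} = \binom{y+r-1}{y}$. For non-integer $a$ the same formula still defines a valid distribution on the non-negative integers, which is why the counting story is not needed for modelling.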

The most basic approach to count data is to assume that $y$ is Poisson with a fixed $\lambda$. This can, however, fail to capture certain aspects of count data, since it assumes that the mean and the variance of $y$ are equal. In the negative binomial case this restriction is lifted: letting $\lambda$ vary adds $\text{Var}(\lambda)$ to the variance of $y$, so the variance exceeds the mean and overdispersed counts can be modelled.
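
A small numerical check may make this concrete. The sketch below is not part of the original argument: it assumes $\text{Gamma}(a,b)$ means shape $a$ and rate $b$, uses hypothetical values $a = 2$, $b = 0.5$, and the function names `nb_pmf` and `mixture_pmf` are made up for illustration. It compares the closed-form negative binomial pmf that the integral yields under that parametrization with a direct numerical integration of $p(y|\lambda)p(\lambda)$, and then computes the mean and variance.

```python
import math


def nb_pmf(y, a, b):
    """Closed-form pmf of the Gamma(shape=a, rate=b) mixture of Poissons."""
    log_p = (math.lgamma(y + a) - math.lgamma(a) - math.lgamma(y + 1)
             + a * math.log(b / (b + 1.0)) - y * math.log(b + 1.0))
    return math.exp(log_p)


def mixture_pmf(y, a, b, upper=60.0, steps=20_000):
    """Midpoint-rule approximation of the integral of p(y|lam) * p(lam)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = y * math.log(lam) - lam - math.lgamma(y + 1)
        log_gamma = (a * math.log(b) - math.lgamma(a)
                     + (a - 1) * math.log(lam) - b * lam)
        total += math.exp(log_poisson + log_gamma)
    return total * h


# Hypothetical shape and rate, chosen only for illustration.
a, b = 2.0, 0.5

# Largest gap between the closed form and the numerical integral.
max_abs_diff = max(abs(nb_pmf(y, a, b) - mixture_pmf(y, a, b))
                   for y in range(10))

# Mean and variance from the pmf (the tail beyond 400 is negligible here).
ys = range(400)
mean = sum(y * nb_pmf(y, a, b) for y in ys)
var = sum((y - mean) ** 2 * nb_pmf(y, a, b) for y in ys)

print("max |closed form - integral|:", max_abs_diff)
print("mean:", mean, "variance:", var)
```

Under these assumptions the theoretical mean is $a/b = 4$ and the variance is $a(b+1)/b^2 = 12$, so the variance is three times the mean, whereas a Poisson with the same mean would have variance 4.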

$\endgroup$
2
  • $\begingroup$ Hi, thank you very much for your answer. I would like to ask whether there is an equivalence between the "success/failure" formulation and the Poisson-Gamma mixture formulation of the Negative Binomial. There is an explanation on Wikipedia, but I found it hard to follow, so if you could suggest a formal paper I would greatly appreciate it. Thank you for your help! $\endgroup$ Commented Jan 23, 2022 at 17:51
  • $\begingroup$ The Wikipedia discussion is not something I've come across before and, to be honest, it does not seem very useful. I would rather view this approach as a new definition of the negative binomial. It makes sense in the context of count regression, since it generalizes Poisson regression by allowing $\lambda$ to vary. $\endgroup$ Commented Jan 23, 2022 at 18:22
