$\begingroup$

I would like to sample 6 quantities that are guaranteed to add up to 600, each with a mean of 100. I want to be able to control the amount of variance around 100 (the same variance for all 6 quantities, but I need the ability to crank it up or down), so the multinomial distribution won't do (it only lets me specify probabilities). I can more or less do this with the Dirichlet distribution (R code below):

library(MCMCprecision)  # provides rdirichlet(n, a)
# 1e5 draws of 6 proportions, scaled so that each row sums to 600
sims <- rdirichlet(1e5, a = rep(100, 6)) * 600
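To make the "crank up/down" part concrete: for a symmetric Dirichlet with common parameter `alpha` over `k` components, scaled by `total`, each component has variance `total^2 * (1/k) * (1 - 1/k) / (k * alpha + 1)`, so `alpha` can be solved for a target variance. Below is a minimal sketch of that idea. The names `rdirichlet_base` and `alpha_for_variance` are made up for illustration, and the sampler is a base-R stand-in (normalised Gamma draws) rather than the `MCMCprecision` function, so that the snippet runs without extra packages.

```r
# Hypothetical base-R Dirichlet sampler: normalise independent Gamma draws.
rdirichlet_base <- function(n, a) {
  k <- length(a)
  # Column j is filled with Gamma(shape = a[j]) draws (column-major fill).
  g <- matrix(rgamma(n * k, shape = rep(a, each = n)), nrow = n, ncol = k)
  g / rowSums(g)
}

# Hypothetical helper: common alpha that gives each of k scaled components
# the requested variance. Requires target_var < total^2 * (1/k) * (1 - 1/k).
alpha_for_variance <- function(target_var, total = 600, k = 6) {
  p <- 1 / k
  (total^2 * p * (1 - p) / target_var - 1) / k
}

set.seed(1)
alpha <- alpha_for_variance(target_var = 400)  # i.e. a target SD of 20
sims  <- rdirichlet_base(1e5, rep(alpha, 6)) * 600

colMeans(sims)       # each should be close to 100
apply(sims, 2, var)  # each should be close to 400
```

Larger `alpha` means less spread around 100, smaller `alpha` means more. Under this formula, `a = rep(100, 6)` corresponds to a variance of roughly 83 per component.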

However, there is an issue: I want those quantities to be discrete (rounded to the nearest integer), not continuous. If I round them, the sum sometimes comes out to something other than 600 (e.g., 599 or 601).

Is there a way to achieve something similar, but sampling only discrete numbers rather than the whole real line? I was thinking of something like how the geometric distribution is a discrete counterpart to the exponential, but I'm not sure whether there is an equivalent here.

One thing I thought of was to just filter and only keep the samples that sum to 600:

# Round to integers, then keep only the rows that still sum to 600
sims <- round(rdirichlet(1e5, a = rep(100, 6)) * 600)
hist(sims[rowSums(sims) == 600, ])

But I'm not sure whether that somehow biases things. Visually, it looks similar to the unfiltered distribution.
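One rough way to go beyond the visual check is to compare the first two moments of a component before rounding, after rounding, and after filtering, and to look at how many draws the filter discards. This is only a sketch of such a check, not a proof of unbiasedness. `rdirichlet_base` is a hypothetical base-R stand-in for the package sampler, used so that the snippet is self-contained.

```r
# Hypothetical base-R Dirichlet sampler: normalise independent Gamma draws.
rdirichlet_base <- function(n, a) {
  k <- length(a)
  g <- matrix(rgamma(n * k, shape = rep(a, each = n)), nrow = n, ncol = k)
  g / rowSums(g)
}

set.seed(1)
cont    <- rdirichlet_base(1e5, rep(100, 6)) * 600  # continuous draws
rounded <- round(cont)                              # discretised draws
keep    <- rowSums(rounded) == 600                  # rows passing the filter

mean(keep)               # fraction of draws that survive the filter
table(rowSums(rounded))  # how far the rounded sums stray from 600

# Mean and SD of the first component at each stage
rbind(
  continuous = c(mean = mean(cont[, 1]),        sd = sd(cont[, 1])),
  rounded    = c(mean = mean(rounded[, 1]),     sd = sd(rounded[, 1])),
  filtered   = c(mean = mean(rounded[keep, 1]), sd = sd(rounded[keep, 1]))
)
```

If the three rows agree closely, the filter is at least not distorting the mean or spread in any obvious way. Differences in finer features of the joint distribution would need a more careful comparison.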

$\endgroup$
  • $\begingroup$ Interesting question. This is related to the stars and bars technique in combinatorics: you want to randomly place 5 bars among 600 stars, with prespecified expectation and variance on the bin contents. In the end, it comes down to specifying an appropriate PMF on a discrete distribution with ${605 \choose 5}$ possible outcomes. $\endgroup$ Commented Feb 1 at 9:26
  • $\begingroup$ Within rather broad limits, employing a Dirichlet distribution and discretizing its results ought to be an excellent approximation to what you seek. This will break down only when the variances are less than about $10$ (corresponding to single-digit variation in the values). $\endgroup$
    – whuber
    Commented Feb 1 at 15:36
  • $\begingroup$ Ok, both make sense, thanks! I will stick with the discretized Dirichlet approach, filtering out the cases where the discretized sum is not 600. $\endgroup$ Commented Feb 2 at 7:14
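
For a sense of scale, the stars-and-bars count from the first comment can be computed directly in R. The snippet below is just a sketch: `count_compositions` is a made-up helper that brute-forces a tiny case (4 stars, 3 bins) to confirm the formula before applying it to 600 stars and 6 bins.

```r
# Hypothetical helper: count non-negative integer vectors of length k that
# sum to n by brute-force enumeration (only feasible for small n and k).
count_compositions <- function(n, k) {
  grid <- expand.grid(rep(list(0:n), k))
  sum(rowSums(grid) == n)
}

# Small check: 4 stars, 2 bars (3 bins) should match choose(4 + 3 - 1, 3 - 1)
count_compositions(4, 3) == choose(6, 2)

# Size of the outcome space for 600 stars and 5 bars (6 bins)
n_outcomes <- choose(605, 5)
format(n_outcomes, big.mark = ",", scientific = FALSE)
```

Any PMF over these outcomes that is symmetric in the 6 bins gives each bin a mean of 100. The remaining freedom is what controls the variance.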
