2
$\begingroup$

I've been thinking about ways to compress the output from a (supposedly) random number generator. Let's assume for a moment that my computer can produce high-quality random numbers. I'm certainly not an expert in this field, so please correct me wherever.

Let's say I need random numbers between zero and 199 inclusive, however I can only read a minimum of a byte at a time from the RNG, so I use some compression function to reduce the 256 possible values of the byte to 200 different values. I'm considering using the modulo operation as a compression function, in that the modulo operation will cause a wraparound of the value should it exceed 199.

I think I've immediately spotted a flaw with using modulo in that it's twice as likely for the values 0-55 to occur. Would you say this assessment of the modulo operation is correct, or is there something about the properties of random numbers (entropy or whatever) that means that this doesn't matter? Also, if not modulo, could you suggest a good method of reducing the number of possible values of a RNG which effectively preserves their 'randomness'?

$\endgroup$
5
  • $\begingroup$ Well, if you have a random variable $\xi$ with a distribution $\mu_\xi$ and you apply some function on $\xi$, e.g. $$ f(\xi) = \xi\mod{200} $$ then the distribution of $f(\xi)$ is in general different from $\xi$, and is called an image distribution (under the map $f$). As you correctly noticed, in case $\xi$ is uniformly distributed over $[0;255]$, any number from $[0;55]$ is 2 times more probable than any one from $[56;199]$. $\endgroup$
    – SBF
    Commented Apr 4, 2013 at 12:10
  • 1
    $\begingroup$ One method is to throw away 56 of the possible byte values. The remaining 200 will be random (well, as random as the RNG allows). On average, you'll have to produce $1.28$ bytes to get one you can use. $\endgroup$ Commented Apr 4, 2013 at 12:25
  • $\begingroup$ @GerryMyerson Would dividing the output of the RNG by 1.28 and ensuring that the result was rounded to the nearest integer have this effect? $\endgroup$
    – Doddy
    Commented Apr 4, 2013 at 12:35
  • $\begingroup$ No! That would make some of the 200 values twice as likely as others. $\endgroup$ Commented Apr 5, 2013 at 12:03
  • $\begingroup$ See this answer math.stackexchange.com/a/185606/312 $\endgroup$
    – leonbloy
    Commented May 25, 2016 at 22:44

1 Answer 1

2
$\begingroup$

So if I interpret your question correctly, you've got an uniformly distributed random natural number in the range $[0,256)$ and want to obtain an uniformly random number in the range $[0,200)$.

As you've observed, just taking the number modulo $200$ doesn't work, since the numbers below $56$ will be obtained twice as often that way.

The simplest way to do it is to simply throw away anything $\ge 200$ and try again. If you consider that too wasteful, you can also store the excess value and use it for the next try. So a less wasteful algorithm would be (where drawn numbers are always single-byte numbers):

  • First, you draw a random number $r_1$.
  • If $r_1 < 200$, you just return it.
  • Otherwise you draw a second number $r_2$ and calculate $s = 256*(r_1 - 200) + r_2$. Now $s$ is uniformly distributed in the integers of the range $[0,14336)$.
  • If $s < 14200$, you return $s \bmod 200$. Otherwise, you start over.

As you see, the probability of not succeeding after the second draw is extremely low. You could, of course, repeat this process instead of starting over (the remainder here is in the range $[0,136)$ so the chance in the third try is even larger).

Also, when returning a number from the second step, you're wasting the quotient part (in the range $[0,72)$; you might want to save that for later trials.

$\endgroup$
1
  • $\begingroup$ There are some optimizations for this. For example, calculate the odd number $s$ where $r = s * 2^n$, and use the first $n$ bits as $u$, then use your algorithm to calculate $R(s)$; finally, return with $u * s + R(s)$. Calculating $R(s)$ instead of $R(r)$ can result in a significant performance boost. $\endgroup$ Commented Feb 13, 2020 at 22:20

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .