51
\$\begingroup\$

While there are many code golf questions on here involving randomness, I haven't seen one yet that actually asks for building an algorithmic pseudorandom number generator. There's this one that asks you to generate a bit stream, but the randomness tests provided on that one were not very rigorous, and it's not code-golf.

The program you write will have a single callable function that will return a random integer from 0 to 4294967295. This function must not call upon any libraries or other functions that were not also written as part of the program, especially calls to /dev/random or a language's built-in rand() library. More specifically, you are limited to the basic operators of the language you are working in, such as arithmetic, array access, and conditional flow control statements.

The score of your program is calculated as follows:

Score = C / R

Where C is the length of the code in characters, and R is the number of Diehard tests that your generator passes (If your random number generator does not pass at least one Diehard test, its score is infinity and it is disqualified). Your generator passes a Diehard test if the file it generates provides a range of P-values that appear to be distributed uniformly along the interval [0, 1).

To calculate R, use your random number generator with its default seed to generate a 16 MB binary data file. Each call of the function returns four bytes; if your function is too slow to return bytes, this will factor a trade-off into attaining a low score by how difficult it is to test. Then, run it through the Diehard tests and check the P-values provided. (Do not try and implement these yourself; use the ones provided here)

Lowest score wins, of course.

\$\endgroup\$
6
  • \$\begingroup\$ Is code that requires internet connectivity allowed? (not gonna access any random function online, but maybe ping or the values of an api call) \$\endgroup\$
    – elssar
    Commented Jan 25, 2013 at 16:47
  • 1
    \$\begingroup\$ "This function must not call upon any libraries or other functions that were not also written as part of the program." That includes Internet connectivity functions. Your generation should be purely algorithmic. \$\endgroup\$
    – Joe Z.
    Commented Jan 25, 2013 at 17:27
  • \$\begingroup\$ The diehard suite expects input files of 10-11 MB. \$\endgroup\$
    – primo
    Commented Jan 25, 2013 at 18:44
  • \$\begingroup\$ The link to the tests seems to be broken, here's a possible alternative. \$\endgroup\$ Commented Jan 30, 2016 at 6:45
  • \$\begingroup\$ How should do this for my brain-flak answer (removed below)? I think the code is too slow to be practical \$\endgroup\$
    – user63187
    Commented Dec 20, 2017 at 0:48

11 Answers 11

23
\$\begingroup\$

Perl 28 / 13 ≈ 2.15

sub r{$s^=~($s^=$s/7215)<<8}

log file here

Perl 29 / 13 ≈ 2.23

sub r{$s^=~($s^=$s<<8)/60757}

log file here

These are something of a variation on a Xorshift, using floating point division instead of a right shift. They both pass 13 of 15 tests, failing only tests 6 and 7.

I'm not exactly sure how long the cycle is, but because the following code doesn't terminate in any short period of time, it's likely the full 232:

$start = r();
$i++ while $start != r();
print $i;

Perl 39 / 10 = 3.9

$s=$^T;sub r{~($s=$s*$s%4294969373)||r}

Note: if you're looking for a Blum-Blum-Shub-esque PRNG, Keith Randall's solution is far better than either of these.

As with my original solution below, this is also an implementation of the Blum Blum Shub, with one major difference. I uses a modulus slightly larger than 232 (M = 50971 • 84263), and whenever a value is encountered that it is not a valid 32-bit integer (that is, larger than 232), it returns the next value in the rotation instead. In essense, these values are pruned out, leaving the rest of the rotation undisturbed, resulting in a nearly uniform distribution.

It seems to have helped. In addition to passing the same 9 tests as before, it now also convincingly passes the Minimum Distance test. A sample log file can be found here.


Perl 33 / 9 ≈ 3.67 (Invalid?)

 $s=$^T;sub r{$s=$s*$s%4294951589}

Note: this solution might be considered invalid, as the top-most 0.00037% of the range will never be observed.

A quick and dirty implementation of the Blum Blum Shub. I claim the following results:

 1. passed - Birthday Spacings
 2. FAILED - Overlapping Permutations
 3. passed - Ranks of 31x31 and 32x32 Matrices
 4. passed - Ranks of 6x8 Matrices
 5. FAILED - Monkey Tests on 20-bit Words
 6. FAILED - Monkey Tests OPSO, OQSO, DNA
 7. FAILED - Count the 1s in a Stream of Bytes
 8. passed - Count the 1s for Specific Bytes
 9. passed - Parking Lot Test
10. FAILED - Minimum Distance Test
11. passed - Random Spheres Test
12. FAILED - The Squeeze Test
13. passed - Overlapping Sums Test
14. passed - Runs Test
15. passed - The Craps Test

A sample log file can be found here, feel free to dispute any of the results. The file for diehard can be generated in the following manner:

print pack('N', r()) for 1..4194304

and then piping the output into a file. Minimum Distance looks like it might have passed, but if you run it multiple times it's always very close to 1.0, which indicates failure.


Details

In general, the Blum Blum Shub is a terrible PRNG, but it's performance can be improved by choosing a good modulus. The M I've chosen is 7027 • 611207. Both of these prime factors, p and q, have modular residue 3 (mod 4), and gcd(φ(p-1), φ(q-1)) = 2, which is as low as it can be.

Although these are the only criteria listed on the wiki page, it doesn't seem to be enough. Almost all of the modulo I tried failed every test. But there's a handful that will pass some of the tests, and the one I've chosen seems to be exceptionally good, for whatever reason.

As a final note, Test 5 on its own seems to be a fairly good indicator of how good the PRNG is. If it doesn't almost pass Test 5, it will fail the rest of them spectacularly.


BONUS: Perl 62 / 14 ≈ 4.43

$t=$^T;sub r{$t|=(($s=$s/2|$t%2<<31)^($t/=2))<<31for 1..37;$t}

Just for geekery, this is a 32-bit version of the PRNG used in the original Tetris for NES. Amazingly, it passes 14 of the 15 tests!

 1. passed - Birthday Spacings
 2. passed - Overlapping Permutations
 3. passed - Ranks of 31x31 and 32x32 Matrices
 4. passed - Ranks for 6x8 Matrices
 5. passed - Monkey Tests on 20-bit Words
 6. passed - Monkey Tests OPSO, OQSO, DNA
 7. FAILED - Count the 1s in a Stream of Bytes
 8. passed - Count the 1s for Specific Bytes
 9. passed - Parking Lot Test
10. passed - Minimum Distance Test
11. passed - Random Spheres Test
12. passed - The Squeeze Test
13. passed - Overlapping Sums Test
14. passed - Runs Test
15. passed - The Craps Test

Sample log file can before here.

Admittedly, the 1..37 bit isn't an exact transcription. In the orginal version, the entropy routine is updated 60 times per second, and then queried at random intervals, largely dependent on user input. For anyone who cares to disassemble the ROM, the entropy routine begins at 0xAB47.

Python-style pseudo-code:

carry = entropy_1 & 1
entropy_1 >>= 1
entropy_2 = (entropy_2 >> 1) | (carry << 31)
carry = (entropy_1 & 1) ^ (entropy_2 & 1)
entropy_1 |= carry << 31
\$\endgroup\$
3
  • \$\begingroup\$ Yeah, I noticed that your algorithm "failed" the bitstream test, but actually had a few values below 0.999999. Still, your tests seem accurate. \$\endgroup\$
    – Joe Z.
    Commented Jan 25, 2013 at 22:56
  • \$\begingroup\$ There is one problem, though, and that is that numbers from 4294951589 to 4294967295 have no chance of occurring (although I suppose that is part of the reason it failed some of the Diehard tests). \$\endgroup\$
    – Joe Z.
    Commented Jan 25, 2013 at 23:20
  • 1
    \$\begingroup\$ @JoeZeng yes, that is a problem. It's most evident in Test 5: the first run has 151k missing words, and the rest of them only have 143k missing. One solution would be to choose a modulus slightly larger than 2^32, and allow the values that are too large to wrap around to zero, but I wasn't able to find one that works well. If I do, I'll update the post. \$\endgroup\$
    – primo
    Commented Jan 26, 2013 at 7:15
8
\$\begingroup\$

Python, 46 / 15 = 3.0666

v=3
def R():global v;v=v**3%(2**32-5);return v

Uses modular exponentiation to generate randomness. 2**32-5 is the largest prime less than 2^32. (Same deal with not being able to run test #2.)

\$\endgroup\$
5
  • \$\begingroup\$ Could you paste a log file? \$\endgroup\$
    – primo
    Commented Jan 28, 2013 at 18:42
  • \$\begingroup\$ Log here: codepad.org/ZWhoGe0t \$\endgroup\$ Commented Jan 29, 2013 at 18:43
  • 1
    \$\begingroup\$ Stupid Windows. It was converting all occurances of \r and \n to \r\n, which obviously skews the results. A fix is to write the file directly using f = open('file.bin', 'wb') and f.write. \$\endgroup\$
    – primo
    Commented Jan 30, 2013 at 1:10
  • \$\begingroup\$ This new score undercuts the previous score, so is now the accepted answer. \$\endgroup\$
    – Joe Z.
    Commented Feb 1, 2013 at 3:24
  • \$\begingroup\$ This new score was undercut yet again, so I have changed the accepted answer. \$\endgroup\$
    – Joe Z.
    Commented Feb 9, 2013 at 17:01
8
\$\begingroup\$

Mathematica, 32 / 15 = 2.133

x=3;Mod[x=Mod[x^2,28!-67],2^32]&

A straightforward implementation of BBS.

Binary file generated with:

f = %; (* assigns anonymous function declared in the previous expression to f *)
Export["random.bin", Array[f, 2^22], "UnsignedInteger32"];

Summary of results:

 1. BIRTHDAY SPACINGS TEST           .684805
 2. OVERLAPPING 5-PERMUTATION TEST   .757608/.455899
 3. BINARY RANK TEST                 .369264/.634256
 4. BINARY RANK TEST                 .838396
 5. THE BITSTREAM TEST                (no summary p-value)    
 6. OPSO, OQSO and DNA                (no summary p-value)
 7. COUNT-THE-1's TEST               .649382/.831761
 8. COUNT-THE-1's TEST                (no summary p-value)
 9. PARKING LOT TEST                 .266079
10. MINIMUM DISTANCE TEST            .493300
11. 3DSPHERES TEST                   .492809
12. SQEEZE                           .701241
13. OVERLAPPING SUMS test            .274531
14. RUNS test                        .074944/.396186/.825835/.742302
15. CRAPS TEST                       .403090/.403088/.277389

Full random.bin here.

Full log file here.

\$\endgroup\$
4
  • \$\begingroup\$ 28!-67 is somewhat prohibitive. Is there a smaller value that would fit in a 64-bit integer? \$\endgroup\$
    – primo
    Commented Jan 30, 2016 at 6:36
  • \$\begingroup\$ @primo Like Python, Mathematica's integers are arbitrary-precision by default, so it doesn't cause a problem. \$\endgroup\$ Commented Jan 30, 2016 at 6:43
  • \$\begingroup\$ I was thinking specifically for portability into C. \$\endgroup\$
    – primo
    Commented Jan 30, 2016 at 6:44
  • 1
    \$\begingroup\$ @primo gmplib.org \$\endgroup\$ Commented Jan 30, 2016 at 6:46
5
\$\begingroup\$

Ruby, 32/15 = 2.1333

This is Keith Randall's solution, implemented in Ruby.

$v=3;def R;$v=$v**3%(2**32-5)end
\$\endgroup\$
1
  • 1
    \$\begingroup\$ @JoeZ This seems to be the new lowest answer, tied with the brand-new Mathematica answer. \$\endgroup\$
    – Riking
    Commented Jan 19, 2016 at 20:17
4
\$\begingroup\$

ECMAScript 6, 31 30 25 / 15 = 1.66

This is Keith Randall's solution implemented in ES6:

e=3;R=_=>e=e**3%(2**32-5)
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Welcome to code golf! Nice first answer \$\endgroup\$
    – mousetail
    Commented May 1, 2023 at 17:05
  • 1
    \$\begingroup\$ Nice answer! I think you might be able to leave off the last semicolon. (I've seen a lot of answers on this site do that as well) \$\endgroup\$
    – The Thonnu
    Commented May 1, 2023 at 17:08
3
\$\begingroup\$

Python, 41 / 15 = 2.73333

v=0
def R():global v;v=hash(`v`);return v

Kinda cheating using the built-in hash function, but it is built-in, so no more cheating than using other builtins, like len. On the flip side, it pains me to have to pay for the global v; statement...

Passes all the Diehard tests (I had a problem with test #2, it SEGVs on my OSX machine. For my score, I'm assuming it will pass).

Here's a driver to generate the 16MB file:

import sys
for i in xrange(1<<22):
  r=R()
  sys.stdout.write('%c%c%c%c'%(r&255, r>>8&255, r>>16&255, r>>24&255))
\$\endgroup\$
9
  • \$\begingroup\$ "This function must not call upon any libraries or other functions that were not also written as part of the program, especially calls to /dev/random or a language's built-in rand() library." I'm sorry, but that disqualifies your entry. \$\endgroup\$
    – Joe Z.
    Commented Jan 27, 2013 at 0:13
  • \$\begingroup\$ To be clear, "len" would also disqualify your entry. \$\endgroup\$
    – Joe Z.
    Commented Jan 27, 2013 at 0:15
  • \$\begingroup\$ Where do you draw the line? Is + a built-in function, and hence disqualified? \$\endgroup\$ Commented Jan 27, 2013 at 1:35
  • 7
    \$\begingroup\$ But in lots of languages, operators and functions are identical. See + and __add__ in python, or operator overloading in c++. I know I'm kind of splitting hairs, so consider this example. In python can I create a map like this: {'a':5}? You'll probably say yes, but then consider that under the covers, hash('a') gets called when you do that. \$\endgroup\$ Commented Jan 27, 2013 at 3:57
  • 2
    \$\begingroup\$ I suppose I'd draw the line when you need to syntactically refer to the function in that way. If you can find a hack in Python that will allow you to directly access the map address without syntactically referring to the "hash" function, I might accept it. \$\endgroup\$
    – Joe Z.
    Commented Jan 27, 2013 at 5:03
3
\$\begingroup\$

C# 144 / 15 = 9.6

uint a=15,b=26,y;uint q(int n){y=(a*1414549U+876619U)^(b*889453U+344753U);b=a;a=y>>12;return(a%256)<<n;}uint r(){return q(24)|q(16)|q(8)|q(0);}

This passed all of the tests.

With not too many more characters it passes TestU01.

Result: http://codepad.org/iny6usjV

    uint a = 15;
    uint b = 26;

    byte prng8()
    {
        uint y = ((a * 1414549U + 876619U) ^ (b * 889453U + 344753U)) >> 12;
        b = a;
        a = y;
        return (byte)y;
    }

    uint prng32()
    {
        return ((uint)prng8() << 24) | ((uint)prng8() << 16) | ((uint)prng8() << 8) | (uint)prng8();
    }
\$\endgroup\$
0
3
\$\begingroup\$

C, 38/15 = 2.533

long long x;f(){return(x+=x*x+9)>>32;}

I couldn't get the Diehard tests working on my machine, but it passes the PractRand suite for up to 8GB of output so I assume it would pass them all.

\$\endgroup\$
2
\$\begingroup\$

C# - 103 / 14 = 7.36

double j=999;uint N(){uint i=0,n=0;for(;i++<4;n=n*256+(uint)j%256)for(j/=277;j<100000;j*=j);return n;}

Results

Passes all except test #6
See results at http://codepad.org/k1NSoyQW

Explanation

C# just can't compete with Ruby and Python for terseness, as usual, but I enjoyed trying. There are certainly other values that will work just as well (i.e., the initial value for j=999, and the divisor=277). I picked these after brief experimentation.

With file-creation wrapper

class R
{
    public static void Main(string[] args)
    {
        var r = new R();
        using (var f = new System.IO.FileStream(".\\out.bin", System.IO.FileMode.Create, System.IO.FileAccess.Write, System.IO.FileShare.Read))
        using (var b = new System.IO.BinaryWriter(f))
        {
            for (long i = 0; i < 12 * 1024 * 1024; i += 4)
            {

                b.Write(r.N());
            }
        }
    }

    double j = 999;

    uint N()
    {
        uint i = 0, n = 0;
        for (; i++ < 4; n = n * 256 + (uint)j % 256)
            for (j /= 277; j < 100000; j *= j) ;
        return n;
    }

}
\$\endgroup\$
0
\$\begingroup\$

Brain-Flak, 344/(pending)

<>((()()){})<> push the amount of iterations to do for the PRNG
(((((((((((((((((((((((((((((((((((()()()){}()){})){}{}){()()()()({}[()])}{})){}{})){}{})()){}{})()){}{})){}{})){}{}){}())){}{})){}{})()){}{})()){}{})){}{})){}{})()){}{})()){}{}) push M (one of the values for the Blum Blum Shub PRNG
((((((((((((()()()){}){}){})){}{}){()({}[()])}{}){}())){}{})()){}{}) push s see above
<>{({}[()])<>starts the loop
(({({})({}[()])}{}) squares the current number
(<>))<>{(({})){({}[()])<>}{}}{}<>([{}()]({}))mods by M
<>}{}<>loop ends

Try it online!

This works fine but the diehard tests links are all broken :( so until we get new ones i do not have a final score

This uses the Blum Blum Shub PRNG so it should pass most the cases. The numbers used are large enough no patterns will appear within the 16 MB of test cases

\$\endgroup\$
2
  • \$\begingroup\$ if this is invalid just tell me \$\endgroup\$
    – user63187
    Commented Dec 20, 2017 at 0:50
  • 1
    \$\begingroup\$ I count 344. Theorem: No fully golfed Brain-flak program have an odd number of bytes. \$\endgroup\$
    – DELETE_ME
    Commented Dec 20, 2017 at 6:50
0
\$\begingroup\$

Objective-C, 40/1 = 40

Pretty clever approach, exploiting .hash to somewhat cheat here, but I like it

for(int v=9;v=@(v).hash;printf("%i",v));
\$\endgroup\$

Not the answer you're looking for? Browse other questions tagged or ask your own question.