This is a spin-off of one of my earlier questions.

Problem statement: Given a number N and an arbitrary (but non-empty) set/string/list of characters E, return a random string of length N made up of the characters in E.

What is the most Pythonic way of doing this? I can go with ''.join(random.choice(E) for i in xrange(N)), but I'm looking for a better way. Is there a built-in function in random, or perhaps itertools, that can do this?

Bonus points if:

  1. Fewer function calls
  2. Fitting into one line
  3. Better generalizability to any N and E
  4. Better run-time performance

PS: This question is really just me being a Python connoisseur (if I may call myself that) trying to find elegant and artistic ways of writing code. I mention this because it looks a little like homework, and I want to assure the SO community that it is not.

2 Answers

9
''.join(random.sample(E*N, N))

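although that won't work with sets, come to think of it, and since random.sample draws without replacement the result is not distributed exactly like N independent choices. But frankly,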

''.join(random.choice(E) for i in xrange(N))

is already pretty Pythonic -- it's simple, clear, and expressive.

The pythonicness that needs hours of thought is not the true pythonicness.
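
One caveat about the random.sample(E*N, N) form: it picks N distinct positions from the repeated pool, so repeated characters come out slightly less often than with N independent random.choice calls. The following is a small sketch that enumerates the exact probabilities for a toy case; the values E = "ab" and N = 2 are illustrative assumptions, not taken from the question.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product

E, N = "ab", 2          # toy inputs chosen for illustration
pool = E * N            # "abab"

# random.sample chooses N distinct positions; every ordered selection of
# positions is equally likely, so we can enumerate them all.
ordered = list(permutations(range(len(pool)), N))
counts = Counter("".join(pool[i] for i in idx) for idx in ordered)
sample_probs = {s: Fraction(c, len(ordered)) for s, c in counts.items()}

# N independent uniform draws from E: every string is equally likely.
independent_probs = {
    "".join(p): Fraction(1, len(E) ** N) for p in product(E, repeat=N)
}

for s in sorted(sample_probs):
    print(s, sample_probs[s], independent_probs[s])
# sample:      aa 1/6, ab 1/3, ba 1/3, bb 1/6
# independent: 1/4 for each of the four strings
```

The gap shrinks as E grows relative to N, but if a uniform distribution over all strings matters, the random.choice form is the safer one.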

4

Your solution already looks pretty good, but here are some alternatives for the sake of completeness:

''.join(map(random.choice, [E]*N))

Or with itertools:

from itertools import repeat
''.join(map(random.choice, repeat(E, N)))

If you are on Python 2.x, itertools.imap() would be more memory-efficient than map(), since it does not build the full list in memory.
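
For readers on Python 3, here is a rough sketch of how the same ideas might look there: xrange becomes range, map is already lazy, and random.choices (available since Python 3.6) draws k items with replacement in a single call. The helper name random_string is hypothetical, introduced only for this example. It copies its input into a tuple because random.choice and random.choices both need an indexable sequence and will not accept a set directly.

```python
import random
from itertools import repeat


def random_string(chars, n):
    """Return n characters drawn independently and uniformly from chars.

    Hypothetical helper for illustration; not part of the random module.
    """
    # Sets are not indexable, so copy any iterable into a tuple first.
    seq = tuple(chars)
    return "".join(random.choices(seq, k=n))


E, N = "abcdefghijkl", 8

# Python 3 spelling of the generator-expression version.
a = "".join(random.choice(E) for _ in range(N))

# map() is lazy in Python 3, so no imap is needed.
b = "".join(map(random.choice, repeat(E, N)))

# Works for a set as well, thanks to the tuple copy inside the helper.
c = random_string(set(E), N)

print(a, b, c)
```

Whether random.choices is actually faster for a given E and N is something to measure rather than assume.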

Here is some interesting timing data (tested on Python 2.6):

>>> import timeit
>>> t1 = timeit.Timer("''.join(random.choice('abcdefghijkl') for i in xrange(3))", "import random")
>>> t2 = timeit.Timer("''.join(map(random.choice, ['abcdefghijkl']*3))", "import random")
>>> t3 = timeit.Timer("''.join(map(random.choice, repeat('abcdefghijkl', 3)))", "import random; from itertools import repeat")
>>> t4 = timeit.Timer("''.join(random.sample('abcdefghijkl'*3, 3))", "import random")
>>> t1.timeit(1000000)   # (random.choice(E) for i in xrange(N))  - OP
7.0744400024414062
>>> t2.timeit(1000000)   # map(random.choice, [E]*N)              - F.J
4.3570120334625244
>>> t3.timeit(1000000)   # map(random.choice, repeat(E, N))       - F.J
5.9411048889160156
>>> t4.timeit(1000000)   # random.sample(E*N, N)                  - DSM
6.9877378940582275

In this run, map(random.choice, [E]*N) is the winner (about 4.4 s versus 7.1 s per million calls for the original), at least for small E and N.
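
These numbers are from one machine on Python 2.6, so it is worth re-measuring on your own setup. Below is a minimal sketch of a Python 3 harness for doing that; the labels, the repeat and number settings, and the extra random.choices candidate are my own assumptions, not part of the measurements above.

```python
import timeit

E, N = "abcdefghijkl", 3
setup = "import random; from itertools import repeat; E = %r; N = %d" % (E, N)

# Statement strings for each approach discussed in this thread,
# plus random.choices, which exists on Python 3.6 and later.
candidates = {
    "generator + choice": "''.join(random.choice(E) for _ in range(N))",
    "map over [E]*N": "''.join(map(random.choice, [E] * N))",
    "map over repeat": "''.join(map(random.choice, repeat(E, N)))",
    "sample(E*N, N)": "''.join(random.sample(E * N, N))",
    "choices(E, k=N)": "''.join(random.choices(E, k=N))",
}

results = {}
for label, stmt in candidates.items():
    # number is kept small so the sketch finishes quickly;
    # raise it for more stable figures.
    runs = timeit.repeat(stmt, setup=setup, repeat=3, number=10000)
    results[label] = min(runs)

# Print fastest first.
for label, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print("%-20s %.4f s" % (label, seconds))
```

Taking the minimum over several repeats reduces noise from other processes; the ranking may well differ from the Python 2.6 figures above.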

  • Interesting. I think this data suggests that the OP's solution is the best. It's the most obvious, and it's within a factor of two of the speed of the best-performing one.
    – DSM
    Commented Jan 19, 2012 at 0:39
