For a project, I need a method of creating thousands of random strings while keeping collisions low. I'm looking for them to be only 12 characters long and uppercase only. Any suggestions?
-
You mean you don't want any lowercase digits? – martineau Commented Aug 19, 2013 at 17:02
-
Hmm, yeah, that should be clarified :) – Maarten Bodewes Commented Aug 19, 2013 at 17:02
-
Don't forget to read this page about the default random number generator in Python. The chance of collisions seems to be fully dependent on the size of the "random strings", but that does not mean that an attacker cannot re-create the random numbers; the random numbers generated are not cryptographically secure. – Maarten Bodewes Commented Aug 19, 2013 at 17:10
-
Hah, right. I meant alphanumeric. – Brandon Commented Aug 20, 2013 at 15:08
7 Answers
CODE:
from random import choice
from string import ascii_uppercase
print(''.join(choice(ascii_uppercase) for i in range(12)))
OUTPUT:
5 examples:
QPUPZVVHUNSN
EFJACZEBYQEB
QBQJJEEOYTZY
EOJUSUEAJEEK
QWRWLIWDTDBD
EDIT:
If you need only digits, use the digits constant instead of the ascii_uppercase one from the string module.
3 examples:
229945986931
867348810313
618228923380
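If the strings must also be hard to guess (for example, IDs that an attacker should not be able to predict), a minimal sketch using the standard-library secrets module (Python 3.6+) could look like this. The function name secure_random_string and the ALPHABET constant are illustrative choices, not part of the answer above.

```python
import secrets
from string import ascii_uppercase, digits

ALPHABET = ascii_uppercase + digits  # 36 symbols: A-Z and 0-9


def secure_random_string(length=12, alphabet=ALPHABET):
    """Return `length` characters drawn independently from `alphabet`,
    using the operating system's cryptographically secure generator."""
    return ''.join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    for _ in range(3):
        print(secure_random_string())
```

Passing alphabet=ascii_uppercase restricts the output to letters only, matching the original snippet.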
-
Yeah, well, this is misleading: "12 digits long and uppercase" -- since digits can't be uppercased. Commented Aug 19, 2013 at 17:01
-
And if you need alphanumeric, i.e. ASCII uppercase plus digits, then:
from string import digits
print(''.join(choice(ascii_uppercase + digits) for i in range(12)))
Commented Jan 5, 2017 at 12:45
-
Does this give a unique ID each time? What if I call this function from multiple threads (e.g. 2 of them) 10,000 times? What is the probability of a collision, i.e. getting the same ID, at a given point in time? – AnilJ Commented Sep 6, 2017 at 22:43
-
@AnilJ for further info on how the random module works, please read the official documentation on it: docs.python.org/3/library/random.html Commented Sep 7, 2017 at 7:44
-
Well, import digits does not work on Python 3 (digits lives in the string module). You can use string.hexdigits to get a mix of '0123456789abcdefABCDEF', or just string.digits + string.ascii_letters for digits plus all letters. – goetz Commented Oct 31, 2017 at 1:20
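On the collision question raised in the comments: assuming every string is drawn uniformly and independently (an assumption, and one that also makes the number of threads irrelevant), the birthday approximation gives a rough estimate. The helper name collision_probability is made up for this sketch.

```python
import math


def collision_probability(n_strings, alphabet_size=26, length=12):
    """Approximate chance that at least two of `n_strings` uniformly
    random strings are identical."""
    space = alphabet_size ** length  # number of distinct possible strings
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n_strings * (n_strings - 1) / (2 * space))


for n in (10_000, 1_000_000, 100_000_000):
    print(n, collision_probability(n))
```

For 10,000 strings of 12 uppercase letters this comes out on the order of 5e-10, so collisions only become a practical concern at much larger volumes.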
With Django, you can use the get_random_string function from the django.utils.crypto module.
get_random_string(length=12, allowed_chars=u'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
    Returns a securely generated random string.
    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
Example:
    >>> get_random_string()
    u'ngccjtxvvmr9'
    >>> get_random_string(4, allowed_chars='bqDE56')
    u'DDD6'
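The 71-bit figure quoted in that docstring can be checked with a couple of lines. This is only a sketch; entropy_bits is a hypothetical helper name, and it assumes each character is chosen uniformly and independently.

```python
import math


def entropy_bits(alphabet_size, length):
    """Bits of entropy in `length` symbols, each chosen uniformly
    from `alphabet_size` possibilities."""
    return length * math.log2(alphabet_size)


print(round(entropy_bits(62, 12), 2))  # a-z, A-Z, 0-9 (the Django default set)
print(round(entropy_bits(36, 12), 2))  # A-Z and 0-9, as asked for in the question
```

Dropping the lowercase letters costs a bit under one bit per character, so a 12-character uppercase alphanumeric string still has roughly 62 bits.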
But if you don't want to depend on Django, here is a standalone version of that code:
Code:
import random
import hashlib
import time
SECRET_KEY = 'PUT A RANDOM KEY WITH 50 CHARACTERS LENGTH HERE !!'
try:
    random = random.SystemRandom()
    using_sysrandom = True
except NotImplementedError:
    import warnings
    warnings.warn('A secure pseudo-random number generator is not available '
                  'on your system. Falling back to Mersenne Twister.')
    using_sysrandom = False
def get_random_string(length=12,
                      allowed_chars='abcdefghijklmnopqrstuvwxyz'
                                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'):
    """
    Returns a securely generated random string.
    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
    """
    if not using_sysrandom:
        # This is ugly, and a hack, but it makes things better than
        # the alternative of predictability. This re-seeds the PRNG
        # using a value that is hard for an attacker to predict, every
        # time a random string is required. This may change the
        # properties of the chosen random sequence slightly, but this
        # is better than absolute predictability.
        random.seed(
            hashlib.sha256(
                ("%s%s%s" % (
                    random.getstate(),
                    time.time(),
                    SECRET_KEY)).encode('utf-8')
            ).digest())
    return ''.join(random.choice(allowed_chars) for i in range(length))
You could make a generator:
from string import ascii_uppercase
import random
from itertools import islice
def random_chars(size, chars=ascii_uppercase):
    # Endless stream of single random characters; the object() sentinel never matches.
    selection = iter(lambda: random.choice(chars), object())
    while True:
        # Take the next `size` characters off the stream and join them.
        yield ''.join(islice(selection, size))
random_gen = random_chars(12)
print(next(random_gen))
# LEQIITOSJZOQ
print(next(random_gen))
# PXUYJTOTHWPJ
Then just pull from the generator when strings are needed, either with next(random_gen) one at a time, or with random_200 = list(islice(random_gen, 200)) to take a batch, for instance.
-
And the advantage of using a generator for this would be? Commented Aug 19, 2013 at 17:10
-
@martineau can take one at a time, set up ones with different variables, can slice off to take n many at a time etc... The main difference is that it's in effect an iterable itself, instead of repeatedly calling a function... Commented Aug 19, 2013 at 17:12
-
functools.partial can fix parameters, and list(itertools.islice(gen, n)) isn't any better than [func() for _ in xrange(n)]. Commented Aug 19, 2013 at 17:58
-
@user2357112 by building a generator, there's an advantage in resuming its state rather than setting up and calling a function repeatedly... Also, list and islice work at the implementation level, whereas a list-comp could leak its _ variable (in Py 2.x) and has to build an unnecessary range constraint that's otherwise handled... It's also harder to build on top of functions than on top of streams... Commented Aug 19, 2013 at 18:05
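To make the two styles discussed in these comments concrete, here is a small Python 3 sketch that produces the same kind of batch both ways. The names random_string, random_strings and make_id are invented for the comparison and do not come from the answer.

```python
import random
from functools import partial
from itertools import islice
from string import ascii_uppercase


def random_string(size, chars=ascii_uppercase):
    """Plain function: returns one random string per call."""
    return ''.join(random.choice(chars) for _ in range(size))


def random_strings(size, chars=ascii_uppercase):
    """Generator: an endless stream of random strings."""
    while True:
        yield random_string(size, chars)


# Function style: fix the size with partial, then call it repeatedly.
make_id = partial(random_string, 12)
batch_from_function = [make_id() for _ in range(5)]

# Generator style: slice as many items as needed off the stream.
batch_from_generator = list(islice(random_strings(12), 5))

print(batch_from_function)
print(batch_from_generator)
```

Both produce five 12-character strings; which one reads better mostly depends on whether the rest of the code is built around iterables or around function calls.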
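#!/bin/python3
import random
import string
def f(n: int) -> str:
    # Pick n random byte values from b'A'..b'Z', then decode them all in one step.
    return bytes(random.choices(string.ascii_uppercase.encode('ascii'), k=n)).decode('ascii')
This should run faster for very large n, since it avoids string concatenation.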
For cryptographically strong pseudo-random bytes you might use the pyOpenSSL wrapper around OpenSSL.
It provides the bytes function to gather a sequence of pseudo-random bytes.
from OpenSSL import rand
b = rand.bytes(7)
BTW, 12 uppercase letters carry a little more than 56 bits of entropy (12 × log2(26) ≈ 56.4), so 7 bytes (56 bits) are almost enough; reading 8 bytes covers the full range.
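The answer stops at obtaining the bytes. As a sketch of the remaining step, and swapping in the standard-library secrets module rather than pyOpenSSL, one uniform integer below 26**12 can be drawn and written out in base 26. The function name uppercase_from_csprng is hypothetical.

```python
import math
import secrets
from string import ascii_uppercase


def uppercase_from_csprng(length=12):
    """Draw one uniform integer in [0, 26**length) and write it as
    `length` base-26 digits, using A-Z as the digit symbols."""
    base = len(ascii_uppercase)
    value = secrets.randbelow(base ** length)
    letters = []
    for _ in range(length):
        value, index = divmod(value, base)
        letters.append(ascii_uppercase[index])
    return ''.join(letters)


bits = 12 * math.log2(26)
print(f"{bits:.1f} bits, {math.ceil(bits / 8)} whole bytes")
print(uppercase_from_csprng())
```

Using randbelow avoids the slight bias that taking raw bytes modulo 26 would introduce, since 256 is not a multiple of 26.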
This function generates a random string of uppercase letters of the specified length.
E.g. length = 6 will generate a random sequence such as
YLNYVQ
import random as r
def generate_random_string(length):
    random_string = ''
    random_str_seq = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    for i in range(0, length):
        # Note: for 0 <= i < length this condition is never true, so no '-' is ever inserted.
        if i % length == 0 and i != 0:
            random_string += '-'
        # Pick one character at a random index of the sequence.
        random_string += str(random_str_seq[r.randint(0, len(random_str_seq) - 1)])
    return random_string
-
With the above code, random_str_seq = "ABC@#$%^!&_+|*()OPQRSTUVWXYZ" can give you even more complex results. – Iqra. Commented Jan 18, 2019 at 11:39
A random generator function without duplicates, using a set to store the values that have already been generated. Note that this costs some memory for very long strings or large amounts, and it will probably slow things down a bit. The generator stops at a given amount or when the maximum number of possible combinations is reached.
Code:
#!/usr/bin/env python
from typing import Generator
from random import SystemRandom as RND
from string import ascii_uppercase, digits
def string_generator(size: int = 1, amount: int = 1) -> Generator[str, None, None]:
    """
    Return x random strings of a fixed length.
    :param size: string length, defaults to 1
    :type size: int, optional
    :param amount: amount of random strings to generate, defaults to 1
    :type amount: int, optional
    :yield: Yield composed random string if unique
    :rtype: Generator[str, None, None]
    """
    CHARS = list(ascii_uppercase + digits)
    LIMIT = len(CHARS) ** size
    count, check, string = 0, set(), ''
    while LIMIT > count < amount:
        string = ''.join(RND().choices(CHARS, k=size))
        if string not in check:
            check.add(string)
            yield string
            count += 1
for my_count, my_string in enumerate(string_generator(12, 20)):
    print(my_count, my_string)
Output:
0 IESUASWBRHPD
1 JGGO1THKLC9K
2 BW04A5GWBA7K
3 KDQTY72BV1S9
4 FAOL5L28VVMN
5 NLDNNBGHTRTI
6 2RV6TE6BCQ8K
7 B79B8FBPUD07
8 89VXXRHPUN41
9 DFC8QJUY6HRB
10 FXYYDKVQHC5Z
11 57KTZE67RSCU
12 389H1UT7N6CI
13 AKZMN9XITAVB
14 6T9ACH3GDAYG
15 CH8RJUQMTMBE
16 SPQ7E02ZLFD3
17 YD6JFXGIF3YF
18 ZUSA2X6OVNCN
19 JQRH6LR229Y4