12

I'm learning Python. I can't figure out why hashlib.sha512(salt + password).hexdigest() doesn't give the expected results.

I'm looking for a pure Python implementation of the equivalent of Ulrich Drepper's sha512crypt.c algorithm. (It took me a while to figure out what I was looking for.)

According to the man page for crypt on my Ubuntu 12.04 system, crypt is using SHA-512 (because the strings start with $6$).

The code below verifies that the behavior is as expected when I call Python's wrapper of the system crypt (i.e., crypt.crypt()). I want to use hashlib.sha512 or some other Python lib to produce the same result as crypt.crypt(). How?

This code shows the problem I'm encountering:

import hashlib, crypt

ctype = "6" #for sha512 (see man crypt)
salt = "qwerty"
insalt = '${}${}$'.format(ctype, salt)
password = "AMOROSO8282"

value1 = hashlib.sha512(salt + password).hexdigest() #what's wrong with this one?
value2 = crypt.crypt(password, insalt) #this one is correct on Ubuntu 12.04
if not value1 == value2:
    print("{}\n{}\n\n".format(value1, value2))

According to the crypt man page, a SHA-512 hash is 86 characters long. The crypt() call in the code above conforms to that. However, the output of hashlib.sha512 is longer than 86 characters, so something is way off between these two implementations...

Here's the output for those who don't want to run the code:

051f606027bd42c1aae0d71d049fdaedbcfd28bad056597b3f908d22f91cbe7b29fd0cdda4b26956397b044ed75d50c11d0c3331d3cb157eecd9481c4480e455
$6$qwerty$wZZxE91RvJb4ETR0svmCb69rVCevicDV1Fw.Y9Qyg9idcZUioEoYmOzAv23wyEiNoyMLuBLGXPSQbd5ETanmq/

Another attempt based on initial feedback here. No success yet:

import hashlib, crypt, base64

ctype = "6" #for sha512 (see man crypt)
salt = "qwerty"
insalt = '${}${}$'.format(ctype, salt)
password = "AMOROSO8282"

value1 = base64.b64encode(hashlib.sha512(salt + password).digest())
value2 = crypt.crypt(password, insalt) #this one is correct
if not value1 == value2:
    print("{}\n{}\n\n".format(value1, value2))
3
  • Is your password (excluding the numbers) a word in Portuguese intentionally or was it an awkward coincidence? Just curious :)
    – JMCF125
    Commented Jan 26, 2014 at 19:02
  • 1
    This password is an actual password pulled from one of the large databases of real stolen passwords, fwiw.
    – MountainX
    Commented Jan 27, 2014 at 1:03
  • I see. Without salt, it would indeed be easy to crack, say, with a rainbow table made from a small Portuguese dictionary. BTW, +1, this is an interesting question, even though I don't use Python.
    – JMCF125
    Commented Jan 27, 2014 at 11:42

3 Answers

12

Here's the solution. There is more detail in this other question, "Python implementation of sha512_crypt.c", which shows that the backend of passlib contains a pure Python implementation of sha512_crypt (the Python implementation is called if crypt.crypt() isn't available on the OS).

$ sudo pip install passlib

import crypt
from passlib.hash import sha512_crypt

ctype = "6"  # for sha512 (see man crypt)
salt = "qwerty"
insalt = '${}${}$'.format(ctype, salt)
password = "AMOROSO8282"

# rounds=5000 is the default crypt uses when the salt string gives no rounds
value1 = sha512_crypt.encrypt(password, salt=salt, rounds=5000)
value2 = crypt.crypt(password, insalt)
if value1 != value2:
    print("algorithms do not match")
print("{}\n{}\n\n".format(value1, value2))

Here is the output:

$6$qwerty$wZZxE91RvJb4ETR0svmCb69rVCevicDV1Fw.Y9Qyg9idcZUioEoYmOzAv23wyEiNoyMLuBLGXPSQbd5ETanmq/
$6$qwerty$wZZxE91RvJb4ETR0svmCb69rVCevicDV1Fw.Y9Qyg9idcZUioEoYmOzAv23wyEiNoyMLuBLGXPSQbd5ETanmq/
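
Both lines follow the `$id$salt$checksum` layout that crypt uses, so the pieces can be pulled apart with plain string handling. The sketch below is only an illustration: `parse_crypt_string` is a hypothetical helper name (it is not part of passlib or the standard library), and it assumes the optional `rounds=N` field described in Drepper's SHA-crypt specification.

```python
def parse_crypt_string(hashed):
    """Split a crypt-style string into (scheme_id, rounds, salt, checksum).

    rounds is None when the string carries no explicit "rounds=N" field,
    in which case SHA-crypt falls back to its default of 5000 rounds.
    """
    parts = hashed.split("$")
    # The string starts with "$", so the first element must be empty.
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not a crypt-style hash string")

    scheme_id = parts[1]
    rest = parts[2:]

    rounds = None
    if rest[0].startswith("rounds="):
        rounds = int(rest[0][len("rounds="):])
        rest = rest[1:]

    if len(rest) != 2:
        raise ValueError("expected a salt and a checksum field")

    salt, checksum = rest
    return scheme_id, rounds, salt, checksum


example = ("$6$qwerty$wZZxE91RvJb4ETR0svmCb69rVCevicDV1Fw.Y9Qyg9idcZUioEoY"
           "mOzAv23wyEiNoyMLuBLGXPSQbd5ETanmq/")
scheme_id, rounds, salt, checksum = parse_crypt_string(example)
print(scheme_id, rounds, salt, len(checksum))
```

Here the scheme id `6` selects SHA-512 crypt, the salt is the `qwerty` passed in, and the remaining field is the 86-character checksum.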

The key point is that Passlib includes a pure Python implementation of sha512_crypt, which it uses when the system's crypt does not provide the SHA-512 scheme that current Linux systems have (the scheme is specified at http://www.akkadia.org/drepper/SHA-crypt.txt).

See the Passlib documentation here:

passlib - password hashing library for python - Google Project Hosting
https://code.google.com/p/passlib/

Passlib 1.6.2 documentation — Passlib v1.6.2 Documentation
http://pythonhosted.org/passlib/

passlib-users - Google Groups
https://groups.google.com/forum/#!forum/passlib-users

New Application Quickstart Guide — Passlib v1.6.2 Documentation
http://pythonhosted.org/passlib/new_app_quickstart.html#sha512-crypt

passlib.hash.sha512_crypt - SHA-512 Crypt — Passlib v1.6.2 Documentation
http://pythonhosted.org/passlib/lib/passlib.hash.sha512_crypt.html#passlib.hash.sha512_crypt

8

The manual of crypt is imprecise (even misleading). The algorithms used by crypt with the “MD5”, “SHA-256” or “SHA-512” monikers are in fact algorithms built on these primitives. They are password-based key derivation functions, using the hash to perform key strengthening.

A good password hashing algorithm has two properties: it must combine the password with a unique salt (to defeat attempts to crack many passwords at once), and it must be slow (because that hurts the attacker more than the defender, since the attacker needs to try a huge number of combinations). Everyday hash algorithms like MD5 and the SHA families are designed to be fast, as fast as possible while still having the desired security properties. One way to build a password hash algorithm is to take a cryptographic hash algorithm and iterate it many times. While this isn't ideal (because there are better techniques that make it more difficult to build dedicated hardware for password cracking), it is adequate.
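
The two properties can be illustrated with the standard library alone. The sketch below uses PBKDF2 via `hashlib.pbkdf2_hmac`, which is a different construction from the one crypt uses, so its output will not match crypt's `$6$` strings; it only shows the salt-plus-iteration idea. The function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100000):
    """Derive a key from a password with a per-password salt and many iterations."""
    if salt is None:
        # A fresh random salt per password defeats cracking many hashes at once.
        salt = os.urandom(16)
    # The iteration count makes each guess expensive for an attacker.
    derived = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password, salt, expected, iterations=100000):
    """Recompute the derived key and compare it in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)


salt, derived = hash_password("AMOROSO8282")
print(verify_password("AMOROSO8282", salt, derived))
print(verify_password("wrong guess", salt, derived))
```

The salt and the iteration count have to be stored next to the derived key so that verification can repeat the same computation, which is the role the `$6$salt$` prefix plays in crypt's output.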

The Wikipedia article for crypt(3) provides a brief explanation and has pointers to primary sources. Linux and FreeBSD's man pages are poor, but Solaris's has enough information not to be misleading (follow the links to crypt.conf(4) and then crypt_sha512 and the others). You can also read "Is user password in ubuntu 13.04 in plain text?" and "Is there repetition in the Solaris 11 hash routine? Can I add some?"

The right way to compute the output of crypt in Python is to call crypt.crypt.

1
  • Thank you for your answer. It helped get me on the right track.
    – MountainX
    Commented Jan 12, 2014 at 17:51
5

Your two values are not the same length because the crypt() output is base64 encoded, whereas value1 uses hexdigest.

Instead of hexdigest, you should try to do something like

value1 = crypt_base64(hashlib.sha512(salt + password))

where crypt_base64 works like the final part of the doHash() function in the bash implementation.
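
To make the length difference concrete, here is a minimal sketch that compares the encodings of one SHA-512 digest. The variable names are illustrative, and the inputs are encoded to bytes as Python 3 requires. It only explains the lengths: a single SHA-512 call re-encoded this way still does not reproduce crypt's output.

```python
import base64
import hashlib

digest = hashlib.sha512(b"qwerty" + b"AMOROSO8282").digest()

raw_length = len(digest)                      # raw bytes in a SHA-512 digest
hex_length = len(digest.hex())                # two hex characters per byte
b64_length = len(base64.b64encode(digest))    # standard base64, with "=" padding

# crypt-style hashes use 6 bits per character without padding, and a
# different alphabet ordering than standard base64.
CRYPT_ALPHABET = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
crypt_length = (raw_length * 8 + 5) // 6      # ceiling of 512 bits / 6 bits

print(raw_length, hex_length, b64_length, crypt_length)
```

The 86 characters quoted in the crypt man page correspond to `crypt_length`, while hexdigest() produces 128 characters and base64.b64encode produces 88.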

4
  • Good suggestion, but your answer isn't sufficient. Can you provide working Python code based on my simple example? I updated my answer with base64 encoding but the two results still don't match.
    – MountainX
    Commented Jan 11, 2014 at 18:34
  • 1
    @MountainX you are wrong to think that the bash script implements "standard" b64 encoding. crypt_base64 cannot be lazily implemented by calling b64encode from base64.
    – Timo
    Commented Jan 12, 2014 at 8:07
  • Thank you for your answer. It helped get me on the right track. See my answer for a "ready to run" solution.
    – MountainX
    Commented Jan 12, 2014 at 17:52
  • @MountainX Nice that I got you on the right track. I +1-ed your answer.
    – Timo
    Commented Jan 12, 2014 at 19:19
