16

I'm currently writing my own little password manager that stores the key in a SHA256 hash, with salt. I create the hash by doing the following:

def sha256_rounds(raw, rounds=100001):
    obj = hashlib.sha256()
    for _ in xrange(rounds):
        obj.update(raw)
        raw = obj.digest()
    return obj.digest()

After it is created it is stored with the following:

    key = base64.urlsafe_b64encode(provided_key)
    length = len(key)
    with open(key_file, "a+") as key_:
        front_salt, back_salt = os.urandom(16), os.urandom(16)
        key_.write("{}{}{}:{}".format(front_salt, key, back_salt, length))

The questions/concerns I have are:

  • Is this an acceptable way to store a hashed key?
  • Should I use more iterations?
  • Is my salting technique sufficient? Or should I use a different salting technique?

If this is not an acceptable take on storing a hashed password/key, what other steps can I take to make this more secure?


UPDATE:

I took a lot of your guy's advice, and if you would like to see the outcome, so far, of my little password manager you can find it here. Everyone's advice is much appreciated and I will continue to strive to make this as excellent as possible. (if that link doesn't work use this one https://github.com/Ekultek/letmein)

1
  • Comments are not for extended discussion; this conversation has been moved to chat.
    – Rory Alsop
    Commented Jun 30, 2018 at 9:33

5 Answers 5

82

I'm currently writing my own little password manager

That's your first mistake. Something this complex has many subtle pitfalls that even experts sometimes fall into, without plenty of experience in this area you don't have a chance making something even close to secure.

stores the key in an SHA256 hash

Uh oh...

This doesn't necessarily indicate you're doing something wrong, but I have strong doubts that you're going to do it right. I assume you're talking about a master password being hashed here? The master password should be turned into a key using a KDF like PBKDF2, bcrypt, or Argon2, then this key is used to encrypt the stored passwords.

If you want to have a way to verify that the password is correct, storing a hash of the key should be fine, but you MUST NOT store the key itself...if you store the key anyone who gets access to your storage has everything they need to decrypt all the passwords!

If you aren't talking about hashing a master password and you do mean an actual randomly generated key then I have no idea what you're trying to accomplish here, but you shouldn't be using a slow KDF with a large number of iterations.

Alternatively you could be hashing the master password twice, once to store as a hash to later verify that the password the user enters is correct, and again to use as a key for encryption. Depending on how this is done it could range from a design flaw to completely giving away the key.

Edit: After seeing the full code it seems to be a fourth option: you store a hash of the password to later check if the entered password is correct, then you hash this hash to use as the key, which is nearly as bad as just storing the key itself.

I create the hash by doing the following:

def sha256_rounds(raw, rounds=100001):
    obj = hashlib.sha256()
    for _ in xrange(rounds):
        obj.update(raw)
        raw = obj.digest()
    return obj.digest()

It's not really clear what raw is here, but I'm assuming it's the password. What you're doing is an unsalted hash using SHA256. Don't try to create your own KDF!

After it is created it is stored with the following:

key = base64.urlsafe_b64encode(provided_key)
length = len(key)
with open(key_file, "a+") as key_:
    front_salt, back_salt = os.urandom(16), os.urandom(16)
    key_.write("{}{}{}:{}".format(front_salt, key, back_salt, length))

So, you're creating the key by hashing the password, then adding a random salt to the front and back? Not only is concatenating 2 different salts to the front and back non-standard, it's not accomplishing anything here because it's done after the KDF has already finished! You're just adding in some random values for the sake of having them there.


To show just how bad this is (as of commit 609fdb5ce976c7e5aa1832670505da60012b73bc), all it takes to dump all stored passwords without requiring any master password is this:

from encryption.aes_encryption import AESCipher
from lib.settings import store_key, MAIN_DIR, DATABASE_FILE, display_formatted_list_output
from sql.sql import create_connection, select_all_data

conn, cursor = create_connection(DATABASE_FILE)
display_formatted_list_output(select_all_data(cursor, "encrypted_data"), store_key(MAIN_DIR))

While it may be a good learning experience to try creating a password manager, please please don't ever use it for anything remotely important. As @Xenos suggests, it doesn't seem like you have enough experience that creating your own password manager would really be beneficial anyway, it would likely be a better learning opportunity to take a look at an existing open source password manager.

16
  • 24
    I'm not even sure "creating your own password manager" is a good experience. I think that diving into an existing one and looking at how they work and how they are coded is a far more interesting and efficient experience, since you won't do mistakes (by trying to do your own PM, you will do mistakes and unless someone is rereading your code, you'll keep doing and believing in these mistakes)
    – Xenos
    Commented Jun 28, 2018 at 14:54
  • 12
    It's not going to be used for anything important, I'm just creating it in order to learn, I am only storing the master passwords hash value, not the password itself Commented Jun 28, 2018 at 15:29
  • 11
    Compared to closed-source 3rd party password managers, creating your own password manager, as long as you use other peoples' proper password encryption functions for the stored password and hashing function for the master password, is, imo, a much more secure and paranoid option than blindly trusting that someone else's will be secure, not have government backends, and not sell the passwords on the black market. Obviously, this logic does not hold for open-source 3rd party ones though.
    – kloddant
    Commented Jun 28, 2018 at 18:06
  • 5
    @CertifcateJunky: For what it's worth, getting the crypto wrong is only a problem if you're defining your own file format. It's kinda hard to mess up the crypto (KDF etc.) if you're working on existing password database formats (like KeePass's KDBX)... it doesn't take a rocket scientist (cryptographer?) to compare the original XML and your new XML and check that your code doesn't add/remove extraneous information. So if you're just trying to improve the UI and learn how it works, it might be worth sticking with an existing file format. As a bonus you get interoperability too.
    – user541686
    Commented Jun 28, 2018 at 18:50
  • 2
    @Mehrdad Not 100% true, as you could still use a bad PRNG instead of a good CSPRNG for salt or iv generation, allow secrets in memory to be swapped, etc which would preserve compatibility while still being insecure. A good point though nonetheless. Commented Jun 28, 2018 at 19:07
22

First let me disclose that I work for 1Password, a password manager, and although it may seem self-serving, I have to add my voice to those telling you that writing a secure password manager is harder than it may first appear. On the other hand, it is good that you are trying and asking about it in public. This is a good way to learn that it is harder than it may first appear.

Don't authenticate. Encrypt instead

I've taken a quick glance at the source you link to in your update, and am delighted to see that you have taken on some advice that has been offered so far. But unless I'm misreading your code, you still have a fundamental design error making it massively insecure.

You appear to be treating the master password entry as an authentication question instead of as encryption key derivation. That is, you are checking the entered password (after hashing) with something that is stored, and if that check passes you are using that stored key to decrypt the data.

What you should be doing is storing an encrypted encryption key. Let's call it the master key. This should be generated randomly when you first set up a new instance. This master key is what you use to encrypt and decrypt the actual data.

The master key is never stored unencrypted. It should be encrypted with what is sometimes called a Key Encryption Key (KEK). This KEK is derived via your Key Derivation Function (KDF) from the user's master password and a salt.

So the output of your use of PBKDF2 (thank you for using that instead of just repeated hashing) will be your KEK, and you use the KEK to decrypt the master key, and then you use decrypted master key to decrypt your data.

Now perhaps that is what you are doing and I've misread your code. But if you are just comparing what you derive through your KDF to something stored and then making a decision whether to decrypt, then your system is massively and horrendously insecure.

Please make the boldest biggest first line of the project README scream the fact that it is massively insecure. And add that to every source file as well.

Some other points

Here are just a few other things that I noticed when glancing at the source

  • Use an authenticated encryption mode such as GCM instead of CBC.

  • Your approach of just encrypting the whole thing will work for small amounts to data, but it won't scale once you have larger data sets.

    When it comes time to encrypting (parts of) records separately, keep in mind that you will need a unique nonce (for GCM) or unique initializate vectors (if you unwisely stick with CBC) for each record.

  • Your data destruction after 3 failures is dangerous and adds no security.

    A sophisticated attacker will just copy your data file and write their own script for trying to crack the password. They will also make their own copy of the captured data. Your data destruction only makes it easy for someone to accidentally or maliciously destroy your data.

How many rounds of PBKDF2

So to your original question. The answer is that it depends. On one side, it depends on how strong the master password is and what sorts of resources an attacker will throw at the problem. And it also depends on what resources the defender can throw at it. For example, will you be running your KDF on a mobile device with limited battery capacity?

But as it happens we at 1Password are trying to gauge the cost of cracking by offering prizes to people or groups who crack a 42.5 bit password hashed with 100,000 rounds of PBKDF2-HMAC-SHA256.

Diminishing gains from slow hashing

But what's more important than figuring out how to weigh all of those is to understand that slow-hashing, while absolutely essential for a password manager KDF, yields diminishing marginal gains once it is tuned high enough.

Say you are using 100K rounds and you have some master password, P. If you picked a random digit, 0–9 and appended it to P you would get a ten-fold increase in cracking resistence. To get the same increase via increasing the rounds of PBKDF2, you would need to go up to 1 million rounds. The point is that once you have a decent amount of slow hashing, you get far more strength for defender effort by increasing the strength of the password than you do my increasing the number of rounds.

I wrote about this some years back in Bcrypt is great, but is password cracking unfeasible?

6
  • 1
    It's defiantly not easy. In the README it tells you to not use it, I can add something saying that it is insecure and a work in progress at the top though. Also, you have not misread the code, what I'm doing is hashing the key, and comparing against that stored hash. I see now that, that probably isn't the best solution. Thank you for your answer. Commented Jun 29, 2018 at 21:10
  • @CertifcateJunky: Always consider what someone could do by changing your source code and recompiling. This makes it obvious why you can't just check the entered password and then allow use of a key that was already stored on disk: Change the code to bypass that check and you still have a working decryption key, if they key bits don't actually depend on the unlock password. This is the difference between just authenticating the user vs. requiring them to actually supply a secret that turns into the key. If your code works like that, it goes way beyond "probably isn't the best solution". Commented Jul 1, 2018 at 3:29
  • KEK... that made me laugh.
    – Dev
    Commented Jul 1, 2018 at 5:32
  • @PeterCordes I've been working on it more, and have came up with an idea (mostly because of everyones advice) of keeping the master password, but only using it as verification (for now) and generating a "key" from that password with random characters and what not. The key is only decrypted and used if the master password hash matches the stored hash, also the key is never stored decrypted (unless you count memory which is a whole other ball game). That "key" is used to encrypt the passwords instead of the master password. Commented Jul 2, 2018 at 16:13
  • 1
    @CertifcateJunky: Yes, turning the master password into a decryption key for the main key is the right kind of idea. This answer suggests some good ways to turn passwords into keys. Commented Jul 2, 2018 at 18:46
6

Do not use raw SHA256 - it is able to be accelerated using GPUs and therefore prone to brute force. It isn't broken, by any stretch, it just isn't best practice any longer. PBKDF2 works to counteract that to some extent, but you should use bcrypt or scrypt preferentially.

Reinventing the wheel (which is pretty much what you're doing if you aren't using PBKDF2-SHA256, bcrypt or scrypt) is never a good idea in crypto.

2
  • 2
    @Azxdreuwa The ASICs for scrypt (used for litecoin) are possible because weak parameters were chosen. The real problem with scrypt is that it scales memory and cpu usage together, and it was designed for a disk encryption context where parameter can be higher. In a web context where you need a lower cost, it's easy to end up less secure that bcrypt, even if still hopefully resistant to ASICs. Argon2 fortunately has separate parameters for cpu and memory cost. Commented Jun 28, 2018 at 18:10
  • 3
    In case it wasn't obvious from my previous comment, those ASICs couldn't be used for scrypt paramaters other than those chosen for litecoin (which are particularly bad). Commented Jun 28, 2018 at 18:31
3

Looking at the code in the link you posted (reproduced below, in parts), I'm a bit confused.

If I read it correctly, main() calls store_key() to load a key from a file on the disk, then uses compare() to compare a user-given key against that one. Both run through sha256_rounds() which runs PBKDF2 on them.

Then you use the same stored_key, the one loaded from the file, to do encryption?

That's completely and utterly backwards. Anyone who gets access to the key file could just load the stored key, and use it to decrypt the stored data, or store new data encrypted under the same key, without having to go through the script to "verify" the password.

At best, I think you're confusing using the user-given password/passphrase to produce a key for encryption, and authenticating the user against a stored password hash. You can't do both with the same hash, the encryption is completely irrelevant if you've stored the encryption key along with the encrypted data.

Sure, passwords are processed with some sort of a hash/KDF both when used for authentication and when used to produce an encryption key. But that's more like a step of pre-processing necessary just because humans can't memorize 256-bit keys, and we have to rely on low-entropy passwords instead. What you do with the hash/key you got does differ between the two use-cases.


Then there's the thing that you have a hardcoded salt in sha256_rounds(), which pretty much defeats the purpose of a salt in general. I could comment on the code more generally, too, but this isn't codereview.SE.

Please tell me I read the code wrong.


The parts of code I looked at:

def main():
    stored_key = store_key(MAIN_DIR)    
    if not compare(stored_key):
        ...
    else:
            ... # later
            password = prompt("enter the password to store: ", hide=True)
            encrypted_password = AESCipher(stored_key).encrypt(password)



def store_key(path):
    key_file = "{}/.key".format(path)
    if not os.path.exists(key_file):
        ...
    else:
        retval = open(key_file).read()
        amount = retval.split(":")[-1]
        edited = retval[16:]
        edited = edited[:int(amount)]
        return edited
3
  • nope that about sums it up. I didn't say it was secure, just that I was making one :P, I see your point though. Commented Jun 29, 2018 at 14:44
  • 1
    @CertifcateJunky, well, if you're happy with not saying it was secure, you could just drop the AES/SHA calls for good and concentrate on storing and loading the data. If you do consider the notion of playing around with the crypto primitives, at least start with a high-level description of the algorithms and data the system uses, what parts are secret and what affects what. And go read some descriptions of how existing encryption utilities (etc) work, and consider why they do what they do.
    – ilkkachu
    Commented Jun 29, 2018 at 15:17
  • I think you're misunderstanding what I said and not taking it the way I meant it. I completely understand what you're saying and am actually trying to figure out a way to edit it. It was a joke, not a serious statement. Commented Jun 29, 2018 at 15:45
2

Use a maintained tool that will decide for you and will increase the default count or even algorithm over time. I'd suggest libsodium, but there are others. Libsodium has defaults for "interactive" password hashing and for symmetric encryption. Ensure your software can some day "upgrade" the settings as needed. A tool like libsodium will change their defaults from time to time so you can follow them.

I do development for an open source password manager and use libsodium. Also I found Mitro's security document worth reading.

3
  • 3
    The point of the tool is to learn how hashing works and how password managers work. I'm not suggesting anyone use this tool, because well, I WROTE IT (lol). But I think it's a good practice in order to determine how things work. If you're interested, you can see it here Commented Jun 29, 2018 at 14:07
  • 2
    For sure - I totally get that. Go write it, make mistakes, and learn! I would still suggest looking at other software like libsodium to learn what they use. Typically a password manager is dealing with the concern of storing passwords and presenting a UI. Almost all of them will use other software libraries to do the actual encryption. Separation of concerns. Thanks for the link! I see you are using python - you might be interested in this crypto101.io which uses Python in examples.
    – Bufke
    Commented Jun 29, 2018 at 14:11
  • I'll look at it. I am curious how they keep their encryption up to standard. Thanks for the link! Commented Jun 29, 2018 at 14:24

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .