
I have created a random password generator function (which can be found here if anyone wants a look), which churns out passwords with a random mix of letters, numbers, and other characters. This question and its answer suggest that I should always use the first password generated, because otherwise my passwords become more predictable based on what I like. But theoretically, what if one came out with only letters, only letters and numbers, or only one special character? Should I then generate another password, or am I losing entropy? I feel as if the first password would be vulnerable to wordlists, but not using it technically makes my passwords more predictable. Is there a good place to draw the line?

P.S. Unlike that other question I do not care how hard these passwords are to remember.


2 Answers


In general

This depends on what information you are assuming that the attacker has.

First, let's assume that the attacker is blind, perhaps trying to crack a large dump of breached accounts without knowing that you used that specific algorithm. Then you would be better protected if you discarded 123456 if it comes up, or more realistically, passwords that are only lowercase, dictionary words, etc. This is for the simple reason that any attacker is bound to try those first, since in general they are more common.

On the other hand, if the attacker knows that your algorithm was used, discarding passwords will only make her job easier. If you dump 10% of the passwords, that is 10% she does not have to try, saving her 10% of the time on average.
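To put the 10% figure in perspective, here is a minimal sketch of how much entropy is lost when a fraction of equally likely passwords is discarded. The function names and the 94-symbol alphabet are assumptions chosen for illustration, not part of the asker's generator.

```python
import math

def bits_lost(rejected_fraction: float) -> float:
    """Entropy (in bits) lost by discarding a fraction of equally likely passwords."""
    return -math.log2(1.0 - rejected_fraction)

def total_bits(alphabet_size: int, length: int) -> float:
    """Entropy (in bits) of a uniformly random password with no filtering."""
    return length * math.log2(alphabet_size)

if __name__ == "__main__":
    # Hypothetical setup: 94 printable ASCII symbols, 10 characters.
    print(f"unfiltered entropy: {total_bits(94, 10):.2f} bits")
    print(f"lost by rejecting 10%: {bits_lost(0.10):.3f} bits")
```

Under these assumptions, dropping 10% of the candidates costs well under one bit, which is small next to what the password has in total.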

So which one of these two scenarios is more likely? I would say the first one in most cases, but only you can determine what your threat model is.

A mathematical example

Let's do some math to see how many passwords you would actually drop. Let's say that characters are picked from three groups (upper case, lower case, numbers + special characters) and let's say that there are an equal number of characters in each group. Furthermore, let's say that you require passwords to have at least one character from each group. What fraction of passwords would you drop?

If I get my math right, it is approximately 3*(2/3)^L, where L is the length of the password: each of the three groups is missing with probability (2/3)^L, and the small overlap of passwords missing two groups at once is ignored. For L=10 you get 5%. For L=20 you get 0.1%.
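The estimate can be checked numerically. The sketch below compares the 3*(2/3)^L approximation with the exact inclusion-exclusion value, under the same assumption of three equally sized groups; the function names are made up for this example.

```python
def fraction_dropped_approx(length: int) -> float:
    """Union-bound estimate: each of 3 groups is absent with probability (2/3)^L."""
    return 3 * (2 / 3) ** length

def fraction_dropped_exact(length: int) -> float:
    """Inclusion-exclusion: subtract passwords counted twice (missing two groups)."""
    return 3 * (2 / 3) ** length - 3 * (1 / 3) ** length

if __name__ == "__main__":
    for length in (8, 10, 12, 16, 20):
        approx = fraction_dropped_approx(length)
        exact = fraction_dropped_exact(length)
        print(f"L={length:2d}  approx={approx:.4%}  exact={exact:.4%}")
```

For realistic lengths the two values are nearly identical, so the simpler formula is good enough for this argument.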

Unless you are completely sure that the only threat is an attacker who knows how the password was generated (I find it hard to imagine how you could be sure of that), I would say that reducing the search space by 0.1% is worth it.
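For illustration, a generator that enforces the "at least one character from each group" rule could look like the sketch below. This is not the asker's function; the group definitions, the names, and the default length are assumptions. The groups here are also not equally sized, unlike in the simplified math above.

```python
import secrets
import string

# Hypothetical character groups, chosen only for this example.
GROUPS = (
    string.ascii_uppercase,
    string.ascii_lowercase,
    string.digits + string.punctuation,
)
ALPHABET = "".join(GROUPS)

def generate_password(length: int = 20) -> str:
    """Draw uniformly random passwords and reject any that miss a group."""
    if length < len(GROUPS):
        raise ValueError("length is too short to contain every group")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Keep the candidate only if every group is represented.
        if all(any(ch in group for ch in candidate) for group in GROUPS):
            return candidate

if __name__ == "__main__":
    print(generate_password())
```

Rejecting whole candidates and redrawing keeps the accepted passwords uniformly distributed over the allowed set, which is what the search-space argument assumes.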


While @Anders' answer is accurate, I want to extend his case for dropping "low entropy" passwords and I couldn't fit it in a comment.

Firstly, I want to introduce a parallel. Many ciphers (e.g. DES) have weak keys, which make encryption behave suboptimally. This implies that there is no "flat keyspace" (one where all keys have the same "strength"). If there are enough known weak keys, these are often added to a reject list and discarded. The only time this is not done is if the number of weak keys is negligible. I agree that there is a big difference between encryption keys and passwords, but the principle of discarding weak keys/passwords should still apply.

Secondly, let's take the math further. Based on Anders' (perfectly valid IMHO) math, roughly 2% of 12-character passwords can be characterized as "weak". This is not a trivial number. Say a user with a weak password has a 5% chance of being brute-forced, whereas a user with a strong password has a 0.1% chance of being brute-forced.

Originally, 0.2% of the accounts that you generate passwords for will be broken:

0.02 * 0.05 + 0.98 * 0.001 = 0.00198 ==> 0.2%

But half of these broken accounts have weak passwords. If you discard those 2% from your generator's output, the rate of successful attacks will be only 0.1%:

1.00 * 0.001 = 0.001 ==> 0.1%

Thus, by removing 2% of passwords, the number of successful attacks will go down by 50%.
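The same arithmetic as a short script, so the assumed probabilities can be varied. The 5% and 0.1% figures are the illustrative guesses from this answer, not measured values, and the function name is made up for the example.

```python
def breach_rate(weak_fraction: float, p_weak: float, p_strong: float) -> float:
    """Expected fraction of accounts broken, given the share of weak passwords."""
    return weak_fraction * p_weak + (1 - weak_fraction) * p_strong

if __name__ == "__main__":
    before = breach_rate(0.02, 0.05, 0.001)  # weak passwords kept
    after = breach_rate(0.00, 0.05, 0.001)   # weak passwords discarded
    reduction = 1 - after / before
    print(f"before: {before:.5f}  after: {after:.5f}  reduction: {reduction:.1%}")
```

With these inputs the reduction comes out just under 50%, matching the rounded figure above.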
