3

The entropy, length, and complexity of a password are pretty straightforward and can't really vary much. For Dictionary Similarity, I would assume that the software just checks how many characters in a password would need to change to match any dictionary password, or, e.g., whether moving all letters forward, removing dots, or changing numbers to letters creates a dictionary password.
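A minimal sketch of the kind of check assumed above: normalize the password (lowercase, undo a few common number-for-letter swaps) and measure its edit distance to each dictionary entry. This only illustrates the guess in the previous paragraph; it is not Password Depot's actual algorithm, and the substitution table and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


# Hypothetical substitution table (an assumption, not taken from any real tool).
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})


def normalize(password: str) -> str:
    """Lowercase the password and undo the substitutions listed in LEET."""
    return password.lower().translate(LEET)


def closest_dictionary_entry(password: str, wordlist: list[str]) -> tuple[int, str]:
    """Return (distance, word) for the dictionary entry nearest to the password."""
    candidate = normalize(password)
    return min((edit_distance(candidate, word), word) for word in wordlist)
```

For example, `closest_dictionary_entry("pa55w0rd!", ["password", "letmein"])` finds `"password"` at distance 1, because only the trailing `!` has to be removed.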

My confusion stems from the fact that the Password Depot 16 "Quality Analyzer" tells me that a certain password has 100% Dictionary Similarity. Now, I know that a password dictionary doesn't consist of actual words like a real one.

The password (not a security concern anymore) is: AT78EHpsMe9

I put this into one of the many online password check tools and it gave me this result:

'AT78' + 'EHp' + 'sMe9' is not a safe word combination. The word is composed of three components: 1) The string 'AT78' follows the pattern [dictionary word][one or two digits].2) 'EHp' is a dictionary word.3) The string 'sMe9' follows the pattern [dictionary word][one or two digits].

That seems weird to me. If "AT", "EHP" and "SME", three totally random letter combinations, are part of a dictionary, then I assume this is true for many, many other 3-letter combinations. Surely that doesn't make a password unsafe? You could argue it doesn't have special characters, but I don't get the reasoning above. To make sure, I tested it on a more reputable site, but I got a similar result:

Your password is easily crackable. Frequently used words

This site even claimed it could be cracked "faster than the time it takes to get back from a short walk"? I personally don't count "AT", "EHP" and "SME" among my "frequently" used words, so what's that about?

So my initial confusion was just "what is Password Depot 16 actually checking Dictionary Similarity against?" But assuming that it uses the same sources as those two sites, I want to know: is this just a false positive from the algorithm, or is that password actually unsafe, just because it has gibberish 3-letter "words" that are matched in a dictionary?

2
  • The first link reports your string as "fairly good".
    – schroeder
    Commented Oct 10, 2022 at 8:39
  • I would refocus the question to ask what the difference in entropy is between common 2-3 letter words strung together with numbers and a random string.
    – schroeder
    Commented Oct 10, 2022 at 8:43

3 Answers

3

'AT78' + 'EHp' + 'sMe9' is not a safe word combination. The word is composed of three components: 1) The string 'AT78' follows the pattern [dictionary word][one or two digits].2) 'EHp' is a dictionary word.3) The string 'sMe9' follows the pattern [dictionary word][one or two digits].

This seems pretty dubious, but let's do some rough maths to get an idea of it. AT is obviously common, but EHp and sMe are somewhat less so. From a quick search of the common rockyou lists (~14 million entries), none of those three terms are in it exactly.

But let's be generous and assume that they're all in a wordlist, which has 20,000 entries.

The first fragment is one entry from our wordlist plus two digits (so 20000*10*10), the second is in our wordlist (20000) and the third is in our wordlist plus one digit (20000*10). Multiply them together and we get 8,000,000,000,000,000, which is roughly 2^53.

This means that if an attacker knows that we formed our password this way, and they have the exact 20,000 wordlist that happens to contain these three fragments, the password has ~53 bits of entropy. For comparison, eight mixed-alphanumeric characters has ~48 bits of entropy.
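The arithmetic above is easy to re-run. This is a small sketch under the same assumptions as the answer (a 20,000-entry wordlist that happens to contain all three fragments); the variable names are illustrative only.

```python
import math

WORDLIST_SIZE = 20_000  # assumed wordlist size, as in the answer

# Number of candidates the attacker must consider for each fragment.
fragment_spaces = [
    WORDLIST_SIZE * 10 * 10,  # 'AT78': word followed by two digits
    WORDLIST_SIZE,            # 'EHp': a bare word
    WORDLIST_SIZE * 10,       # 'sMe9': word followed by one digit
]

pattern_space = math.prod(fragment_spaces)  # total candidates for the whole pattern
pattern_bits = math.log2(pattern_space)     # the same figure expressed in bits

# Eight random characters drawn from a-z, A-Z, 0-9 (62 symbols), for comparison.
random8_bits = 8 * math.log2(62)

print(f"{pattern_space:,} candidates = {pattern_bits:.1f} bits")
print(f"8 random alphanumerics = {random8_bits:.1f} bits")
```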

So by making some very generous assumptions about how knowledgeable our attacker is, we have a password that's effectively impossible to brute-force if it's stored properly (bcrypt, PBKDF2, argon2id, etc.), or that could be brute-forced on a GPU if it's not (MD5, SHA-1, NTLM, etc.).

But in reality it would be much harder to crack than 8 random mixed-alphanumeric characters - because an attacker is unlikely to have that knowledge and that perfect wordlist.

1

That Dictionary attack check is suspect. It was probably intended to be a numbered list (which might say something about the site…), so I'll just fix that:

'AT78' + 'EHp' + 'sMe9' is not a safe word combination. The word is composed of three components:

  1. The string 'AT78' follows the pattern [dictionary word][one or two digits].
  2. 'EHp' is a dictionary word.
  3. The string 'sMe9' follows the pattern [dictionary word][one or two digits].

Password entropy applies to password schemes

Given a sample password, deriving its complexity requires guessing its scheme. This can be obvious, as it is for password (a word), pa55w0rd (a l33t word), password123 (word + number), or it can be nontrivial. It therefore seems reasonable for a calculator unaware of password schemes to seek words.

Let's look for words

Finding words in a passcode and therefore concluding it's weak is not a great approach. Even if we were to assume "EHp" and "sMe" are common enough to be alongside "AT" in a standard 100k-word dictionary (and they're not even close), this unfairly inflates their entropy: the entropy of a common dictionary word is log₂(100000) = 16 but the entropy of three random letters is only log₂(26³) = 14. To account for random case within a word, multiply by two to the power of the word's length (a three-letter word like EHp has 2³ = 8 case variations: ehp, Ehp, eHp, ehP, EHp, eHP, EhP, EHP), so for "EHp" and "sMe", that'd be log₂(100000×2³) = 19 vs log₂(52³) = 17.
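The comparison can be sketched in a few lines. The 100k dictionary size comes from the paragraph above; the function names are hypothetical, and the results are rounded down to whole bits to match the figures quoted.

```python
import math

DICT_SIZE = 100_000  # the "standard 100k-word dictionary" assumed above


def word_bits(length: int, random_case: bool = False) -> float:
    """Entropy of one dictionary word, optionally with a random case per letter."""
    case_variants = 2 ** length if random_case else 1
    return math.log2(DICT_SIZE * case_variants)


def random_letter_bits(length: int, mixed_case: bool = False) -> float:
    """Entropy of `length` independent random letters (26 or 52 symbols each)."""
    alphabet = 52 if mixed_case else 26
    return length * math.log2(alphabet)


for label, bits in [
    ("dictionary word", word_bits(3)),
    ("3 random lowercase letters", random_letter_bits(3)),
    ("dictionary word, random case", word_bits(3, random_case=True)),
    ("3 random mixed-case letters", random_letter_bits(3, mixed_case=True)),
]:
    print(f"{label}: {math.floor(bits)} bits")
```

Treating a three-letter fragment as a dictionary word therefore credits it with more entropy than treating it as random letters, which is the opposite of what a "dictionary word found" warning suggests.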

Entropy calculations must be worst-case, so small words in an unknown password scheme should be considered random letters.

This answers your direct question—the password in question cannot be analyzed for its "words", so you can't conclude anything like being "unsafe" on that ground.

If those were words...

If there were longer words, we could talk about how large a dictionary would be needed to crack them. A word too rare to be found in a spelling dictionary should be considered worth roughly three random characters while a word in your spelling dictionary should be valued around two characters. Add one if you use typos, l33t, etc. Never value a word over four random characters.

That password checker's "dictionary attack check" is too aggressive in finding words and too naive to know that words can be good. If we actually used words in this manner, say with kAyaK78CInEMaquiCHe9, the overall entropy is much higher: three words with random case and three digits is log₂(100000³×2¹⁷×10³) = 76. The check still says it's "Not safe!" even though I'd call that equivalent to a password with 11 random characters.
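A rough sketch of that estimate, also converting the result into an equivalent number of random printable characters. The 94-symbol alphabet used for the conversion is an assumption (printable ASCII without the space), and the variable names are illustrative.

```python
import math

DICT_SIZE = 100_000                    # assumed dictionary size, as above
words = ["kayak", "cinema", "quiche"]  # the words hidden in kAyaK78CInEMaquiCHe9
digit_count = 3                        # the digits 7, 8 and 9

letters = sum(len(word) for word in words)  # each letter may be upper or lower case
combinations = DICT_SIZE ** len(words) * 2 ** letters * 10 ** digit_count
bits = math.log2(combinations)

# Assumption: compare against random characters drawn from 94 printable symbols.
equivalent_random_chars = bits / math.log2(94)

print(f"{math.floor(bits)} bits, about {equivalent_random_chars:.1f} random characters")
```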

Assume a scheme of 11 chars including a capital, a lowercase, and a number

This is the way I'd prefer to calculate this particular password. Sadly, we have to assume there are no special characters in play, so the entropy is simply the required upper, lower, and digit plus eight random characters that can be any of those groups: log₂(26×26×10×62⁸) = 60. That's not bad, but it's also not great.

If the requirement were a capital, a special, and a length of 11+, the entropy calculation would be log₂(26×32×94⁹) = 68. Good, but still not great.

(See how password requirements can actually lower entropy? If you're creating these requirements, take a minimum satisfactory password length, say 10, and add one for each character type you require.)
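The two scheme calculations above follow the same shape: one slot per required character class, then free positions drawn from the full alphabet. Here is a minimal sketch of that formula exactly as the answer writes it (it does not account for where in the password the required characters sit); `scheme_bits` is a hypothetical helper name.

```python
import math


def scheme_bits(required_class_sizes, free_alphabet_size, free_positions):
    """Bits for: one character per required class, plus unconstrained positions."""
    combinations = math.prod(required_class_sizes) * free_alphabet_size ** free_positions
    return math.log2(combinations)


# 11 characters: one upper, one lower, one digit, plus 8 from the 62 alphanumerics.
alnum_bits = scheme_bits([26, 26, 10], 62, 8)

# 11 characters: one upper, one special (32 symbols), plus 9 from all 94 printables.
special_bits = scheme_bits([26, 32], 94, 9)

# For reference: 11 fully unconstrained characters from the same 94 symbols.
unconstrained_bits = 11 * math.log2(94)

print(math.floor(alnum_bits), math.floor(special_bits), math.floor(unconstrained_bits))
```

Each required class replaces a 94-way choice with a 26-, 32-, or 10-way one, which is why the constrained figures come out below the unconstrained one.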

Use a password manager and a generated code!

This is the only way to have a secure password nowadays. With a password manager, you don't have to worry about memorability, so you might as well crank the generator and make a nice long 20-character code: log₂(26×32×94¹⁸) = 127.
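For illustration, here is a minimal sketch of what such a generator does, using Python's standard `secrets` module for cryptographically secure randomness. It draws every character uniformly from the 94 printable ASCII symbols and does not enforce any "must contain" rule; a real password manager's generator may work differently, and `generate_password` is a hypothetical name.

```python
import secrets
import string

# 52 letters + 10 digits + 32 punctuation marks = 94 printable symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Pick each character independently and uniformly using a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```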

0

Every tool will evaluate differently and it's up to you to decide.

The first tool you looked at seems to be trying to sell you a generated password and does not seem very reliable. For the same password you tested (AT78EHpsMe9), this website says it will take 41 years to crack. Is that not secure? Up to you.

The Kaspersky tool seems to be trying to sell you something too, which is why, right after you enter the password, it displays a banner redirecting you to another page about password complexity.

Regarding the dictionary attack: as far as I know, dictionary attacks can only work on the entire string, as they compare the hash. Therefore, I would ignore anyone who tries to tell you that a sequence in the string matters when it is only two characters out of many.

The chance of the password AT78EHpsMe9 being in an already-made dictionary is low, but the calculation is based on how long it would take to reach that specific string if you generated a dictionary using A-z and 0-9.

I would suggest using a neutral website such as this one provided by the University of North Carolina.
