35

I often read the advice that, to build a strong password, you should think of a sentence and then take its initial letters. For example, a nonsense sentence like "I watch Grey's Anatomy at 9.40" gives me the password "IwG'sA@9.40".

How secure is it if I instead use the whole sentence (including white space)? To be more concrete:

How secure is it to use an ordinary English sentence as a password, in particular with respect to

  • a sophisticated dictionary attack
  • a brute-force attack

If it is a good idea to do so, are there any rules I should follow to build the sentence? (Number of words, whether a quote from a famous person is OK or whether it has to be a nonsense sentence, ...)

How do passwords of this type compare to a randomly chosen password of length n consisting of lower- and upper-case letters, numbers and symbols?

I have four places in mind where this scheme should be applied:

  1. Your home computer
  2. Internet accounts (email, online shops, social networks,...)
  3. Internet Banking
  4. Storing highly sensible data

The password should be secure enough to follow the technical progress in password cracking and computer hardware for at least two years.

How appropriate is the described password-building scheme in those cases, and how would one change the recommendations on sentence length etc. depending on the area in which the password is used?

Would be great if the answer contains some calculations which estimate the password security and some references about this topic.

6
  • 24
    correct horse battery staple
    – wim
    Commented Oct 17, 2012 at 11:25
  • 2
    I'm picturing sensible data wearing sensible shoes.
    – aslum
    Commented Oct 17, 2012 at 16:15
  • 2
    Defining the security of passwords using mathematics is a very flawed process. It only yields results based on mathematics for the methods of breaking passwords you know. E.g. a 10-character password of alphanumeric and non-alphanumeric characters was brute-forced at a place I used to work, but a simplistic 8-character word+"123;" was not broken. So it's all about the methods used, and any question that you ask about "how secure * is" should be accompanied by "within the remit of the * method".
    – chkdsk
    Commented Oct 17, 2012 at 16:19
  • 3
    Regarding: "The password should be secure enough to follow the technical progress in password cracking and computer hardware for at least two years." Against what type of attacks? And against what adversaries? That could be a near impossible task against "everything". But a significantly possible task if the scope is limited. Remember this: lightbluetouchpaper.org/2012/09/03/… ?
    – chkdsk
    Commented Oct 17, 2012 at 16:24
  • Excellent points made by PrashantGupta. Another aspect of this is that, if a password hash is assumed to require e.g., 10,000 hours to crack by brute force, that duration is the best case scenario. It assumes that the successful guess will occur at the end of the 10,000th hour of the cracking effort. Commented Oct 18, 2012 at 15:01

12 Answers

30

Would be great if the answer contains some calculations which estimate the password security and some references about this topic.

I very recently answered almost the exact same question here: Confused about (password) entropy

The time it takes to crack your password is exactly equal to the amount of time it takes to test a single password multiplied by the number of passwords that will be tried before yours. Since you're attempting to predict someone else's behavior, that's really all you can say.
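That relationship is simple enough to write down. The following is only a sketch: the function name and the example numbers (a guess rate of one billion per second and a list position of 10^12) are assumptions for illustration, not measurements.

```python
def seconds_to_crack(guesses_before_yours: int, guesses_per_second: float) -> float:
    """Time until the attacker reaches your password.

    guesses_before_yours: how many candidates the attacker tries first
                          (unknown in practice; it depends on the attacker's list).
    guesses_per_second:   how fast a single candidate can be tested.
    """
    return guesses_before_yours / guesses_per_second


# Hypothetical numbers: your password is candidate number 10^12 in the
# attacker's ordering, and the attacker tests 10^9 candidates per second.
print(seconds_to_crack(10**12, 1e9), "seconds")
```

The only input the defender controls is the position in the attacker's list, and that position is exactly the thing nobody can know in advance.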

There are other estimates that try to figure out what number this will be in some general case, but those are necessarily always wrong, since the attacker doesn't have to follow whatever mathematical model you construct. Clearly the password will be guessed quickly if the attacker is going off a list of keystrokes from your computer, and clearly the password will never be guessed if the dictionary the attacker is using doesn't contain your password at all.

Occasionally people will use the concept of entropy to make this kind of estimation, but again, in the real world how well does that hold up? A password consisting of the letter 'a' repeated 27 times has very low entropy, but would withstand just about every dictionary attack in common use... until extremely long single-letter passwords become all the rage and attackers start looking for them.

The more common a password pattern becomes, the more likely attackers will look for it.

In general, password attacks try passwords in this order:

  1. commonly used passwords
  2. simple dictionary-based passwords (lowercase letters only),
  3. more complex dictionary-based passwords (mixed case, sprinkle in numbers and punctuation according to some common patterns)
  4. exhaustive search of the entire keyspace starting with short passwords and progressing to long ones

If you can withstand the first 3 types and you have a reasonably long password, you're pretty much home free, because an exhaustive search of a keyspace of this size is infeasible. Most attacks stop after type 1, only concerted attacks even attempt type 3, and type 4 is desperation.

15

I think the best reference is NIST SP 800-63 Appendix A, which lays out the theory and the calculations. NIST assumes that the dominant defensive strategy is entropy, and that passwords with maximum entropy are the strongest. Steve Gibson's password haystacks challenges that assumption and asserts that length is more important than complexity or entropy (in part due to the magic of hashing). For your purposes, I think that it is sufficient to assume that the strength of an authentication credential (password/passphrase/etc.) is derived from both length and entropy.

A sentence is stronger because it is longer. Granted, English text is highly redundant (approximately 1 bit of entropy per character), but most attackers will fail to take advantage of that redundancy. I have been out of pentesting for about five years now, but when I last did any pentesting, attack tools assumed that the password was more similar to a word than to a sentence. Password length to entropy is not a linear function, for reasons that Henning Klevjer has explained fairly well, and the attack tools take advantage of those limitations. (IIRC, the issues that Klevjer raises can result in a 100-fold increase in password cracking speed.)

Based on those assumptions, the sentence as a passphrase is particularly strong. As others have pointed out, researchers have attacked passphrases, but I'm not aware of any published information that real-world attackers have done so.

However, there is a significant limitation to the passphrase. The relying party (the site to which you're authenticating) must accept a passphrase. In my personal experience a significant fraction, possibly a majority, of authentication sites will not accept a sentence as a passphrase. (I would appreciate anyone who can point me towards hard numbers on this.) Many password implementations either explicitly reject passwords of more than 16 characters, or else truncate longer passphrases to the desired length.

Examining your use cases in turn:

  1. Home computer - Depends on your OS, but there are non-password authentication mechanisms which are far stronger and far simpler (biometrics and hardware tokens, e.g. Yubikey).
  2. Internet accounts (email, online shops, social networks,...) - I doubt that you'll be able to use a sentence passphrase. Many if not most of these will not accept a passphrase. Your only hope is to maximize the entropy of the password you supply and to avoid re-using the password.
  3. Internet Banking - As above, unlikely to accept a passphrase. However, an increasing number of websites accept two-factor authentication from RSA or Yubikey, or soft keys like Google Authenticator.
  4. Storing highly sensible data - I'm not sure what you mean here. If you are storing highly sensitive data, your best bet is offline, or encrypted. If you're talking about real encryption, then a sentence should be stronger, if the encryption product will accept it. Personally, I'd go for two-factor authentication here and skip the password.

Also note that the strength of the password is meaningless if the relying party has a brain dead implementation - for examples see Sarah Palin Hack or Mat Honan. Your ultimate entropy cannot protect you against a negligent relying party. If you make the authentication credential strong enough, a targeted attacker will resort to an alternate method (of course you can cost him time and deter the opportunistic attacker). In such cases you must devote attention to both preventing the compromise and detecting/recovering from the compromise. But that's outside the scope of your question.

Please don't be distracted from the goal - if your real goal is strong authentication, then your best bet is to use a federated identity credential with a high level of assurance, and use a two factor authentication for that identity.

1
6

Full sentences used as passwords, with or without spaces, are known as passphrases. They are more secure simply because their length defeats any contemporary brute-force attack. That requires, however, that the attacker is unaware of the structure of your passphrase. A foolishly written password policy may require "a passphrase consisting of at least two words", which makes the whole thing easier. Say it takes X seconds to guess every one-word English passphrase; doing the same for two words takes X^2. Dictionary attacks on passphrases, when the attacker is aware of the structure, may be very efficient.

A good idea is to use a passphrase such as "Hope no one notices my feet smell" and add some haphazard misspellings or special characters. So "Hope naune nytices mafit smell" is a better password if the attacker may guess or know the structure. This is more important for shorter phrases. "Yikes!Duperlarge%Dactionery" will not be vulnerable to a dictionary attack and is long enough to evade brute force attacks.

Bad practices (nonexhaustive list):

  • Short passwords
  • Few straight dictionary words
  • Substituting characters with similar signs (e.g. pa$$w0rd). "Pissword" is almost as good.
  • Numbers in sequence

Historically, a strong password is a long, random one. Since not all of us are able to remember "%W¤GHAF034jio43Q¤#%q3æPÅJ(%" as well as "LookattemGo,thefatteys!", I suggest passphrases. And write them down if you can do so securely.

To fully answer: you could use a quote, just not "unsanitized". In the LinkedIn case (LinkedIn lost a lot of password hashes), quotes from the Bible and from movies used directly as passwords were successfully cracked.

Conclusion: Think about the quote "There is no such thing as a good tax." and make it something similar. "No_sych_thang:taxorama", just don't spell it out directly.

"Your mother smells of almond toilet spray", by length is sufficient, but again, if the password policy says "Passphrase should be seven words and offensive to someone's relative"...

For your home computer, if you use Windows, you can have Ctrl+backspace as a password, and probably no one will ever try it.

3
  • 2
    In case of quotes: Do crackers actually use dictionaries of whole quotes? For example if I cite Einstein like: "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe." This seems to be unbreakable if one uses brute force trying every combination of characters. It also seems to be unbreakable if one tries to combine words from an English dictionary containing say 7000 words, because then he would have to try 7000^16 combinations plus word-mangling rules. However, using a dictionary of famous quotes, it should be cracked instantaneously.
    – student
    Commented Oct 17, 2012 at 7:47
  • 3
    It has been done (at least in research (see link)), so I wouldn't take the chance. securitynirvana.blogspot.no/2012/06/… Commented Oct 17, 2012 at 7:54
  • 2
    I'll sure be trying Ctrl+backspace on every Windows PC I see now!
    – SpellingD
    Commented Oct 17, 2012 at 18:54
3

What matters is not the length of the password, but its entropy. The entropy of a password is the expected number of attempts that an attacker will have to try before finding your password in a brute force attempt. (“Expected” number of attempts because a realistic model of the attacker is probabilistic — if you knew exactly in what order the attacker was going to enumerate the password candidates, you'd just pick one that's far enough on the list so that he'd never find it in your lifetime.)

If you apply a predictable transformation to a password pattern, the entropy of the password doesn't change by much. If the transformation is many-to-one, the entropy decreases accordingly. The amount by which the entropy might increase represents the propensity for the attacker to hazard the guess that you might apply this transformation. Memorable transformations like taking the first letter or 1337speak are likely to be tried pretty fast (guess what, the people who write password cracking tools are neither stupid nor ignorant). So a transformation like you describe might add a bit or two of entropy at most — or it might reduce the entropy considerably: if the attacker decides that it's more likely that people apply such transformations, he'll try the short form first; and if the transformation collapses many passphrases into one, that reduces the number of attempts accordingly.

Having punctuation does not intrinsically make a password more secure. The advantage of a transformation like the one you describe is that it makes your password shorter to type. Note that it may be harder to type even if it's shorter, especially on mobile devices with on-screen keyboards on which punctuation is somewhat out of reach. The full sentence has at least as much entropy, and removing all of its punctuation, spaces and setting all letters to lowercase does not reduce the entropy significantly.

A good password scheme is to take several random dictionary words and string them together. Note that it is vital that the dictionary words are chosen at random. Using a meaningful sentence aids memorization, but it also helps the attacker. An advantage of using a simple random password generation scheme is that you can evaluate its entropy easily. If your dictionary has 2^D words (D ≈ 10 for Basic English, D ≈ 14 for reasonably common words, D ≈ 19 for the OED) and you pick N words for your password, then your password has N·D bits of entropy. Since the password generation is random, there is no way for the attacker to gain more information; he will have to make 2^(N·D)/2 = 2^(N·D-1) attempts to crack it on average. If some system requires special characters in passwords, stick a 1 at the end or a capital letter at the beginning; the entropy comes from the random words, not from special characters. This password generation mechanism is illustrated in XKCD 936, with a comparison with 1337speak (much worse); the comic has been further discussed on this site.
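The entropy estimate for randomly chosen words can be checked with a few lines. This is a minimal sketch, assuming every word is drawn uniformly and independently from the dictionary; the function names are made up for this illustration.

```python
import math


def passphrase_entropy_bits(dictionary_size: int, num_words: int) -> float:
    """Entropy in bits of num_words words drawn uniformly at random.

    With dictionary_size = 2^D this is simply num_words * D.
    """
    return num_words * math.log2(dictionary_size)


def average_attempts(dictionary_size: int, num_words: int) -> float:
    """On average the attacker searches half of the dictionary_size^num_words space."""
    return dictionary_size ** num_words / 2


# Four words from a dictionary of 2^14 "reasonably common" words.
print(passphrase_entropy_bits(2**14, 4), "bits")
print(f"{average_attempts(2**14, 4):.2e} attempts on average")
```

The figures only hold if the words really are picked at random; a sentence you made up yourself has less entropy than this calculation suggests.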

2

What would you do if you had to crack a password and did not have the time to try every possible combination? You would probably do something like this:

  1. try often used combinations like "admin" or "12345"
  2. try single words from a dictionary
  3. try random combinations until about 7 characters
  4. try pairs of words from a dictionary
  5. try well-known quotes

Using several words instead of a single password is actually a good idea, but only because there are so many possible words available. You can calculate the possible combinations yourself:

Random characters:
26 characters in alphabet ^ 8 places = 2.0E11 combinations
52 case sensitive characters ^ 8 places = 5.3E13 combinations

Sentence with words:
135'000 words in dictionary ^ 4 places = 3.3E20 combinations

So using passphrases is good, as long as they are not often used (well known). The fewer words you use, the easier it is to crack; 4 words seems a minimum to me.

Using only the first letters of the words of a passphrase is more comfortable, because you have to type less. It is good as long as this leads to using more words. If your passphrase uses too few words, your password will fall into category 3 (random combinations until ? characters).
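The figures above are plain exponentiation, so they are easy to reproduce. A short sketch (the helper name `keyspace` is just a label chosen here):

```python
def keyspace(symbols: int, length: int) -> int:
    """Number of possible strings of `length` positions over `symbols` choices."""
    return symbols ** length


lower8 = keyspace(26, 8)        # 8 lowercase letters
mixed8 = keyspace(52, 8)        # 8 case-sensitive letters
words4 = keyspace(135_000, 4)   # 4 words from a 135'000-word dictionary

for label, value in [("26^8", lower8), ("52^8", mixed8), ("135000^4", words4)]:
    print(f"{label:>9} = {value:.1e} combinations")
```

This treats the four words as independent random picks; a grammatical or well-known sentence covers far less of that space.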

2

The problem with cracking such passwords is that you don't know in what context to grab sentences so that they match the context the person used to pick the password.

This becomes terribly difficult when using texts that have different sources such as PDFs, TXT files, Word Docs etc.

For example, say I have a book and it has the following two sentences:
Mary had a little lamb who's fleece was white
as snow. Mary then decided one day to make lamb
stew and so Mary no longer had a little lamb.

So, for my password I choose sentence 1 "Mary had a little lamb who's fleece was white"

But now as someone trying to crack that, I build my wordlist by obtaining the "BOOK" in TXT format. The text file looks like this:
Mary had a little lamb who's fleece was white as snow. Mary then decided
to make lamb stew and so Mary no longer had a little lamb.

See the problem here? My candidate is going to be "Mary had a little lamb who's fleece was white as snow. Mary then decided", and it doesn't matter what I do to that sentence: when hashed and compared, it will never match the "real" one used initially.

So how would one know where to truncate sentences to build them into a dictionary? The possibilities are endless - sentences with 1 word, 3 words, 10 words, 9 words?

Obviously, it would be easy to simply go from one punctuation point to another, like a "period" to a "period", but that's assuming people choose sentences for passwords based on that. If they do, I would be surprised (and concerned); if they don't, then again it makes the possibilities rather difficult.

Perhaps with some coding one can generate sentence lists from a source based on numerous iterations of words+"x" in the sentence and then pipe that into a password cracker. Not sure of the effectiveness or speed though.
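A rough sketch of that idea: slide a window of 1 to N words over the source text, ignoring line breaks, and emit every run of consecutive words as a candidate. The function name and the window limits are assumptions made for this illustration, and nothing here has been measured against a real cracker.

```python
from typing import Iterator


def word_windows(text: str, min_words: int = 1, max_words: int = 10) -> Iterator[str]:
    """Yield every run of min_words..max_words consecutive words in text.

    Line breaks are treated like any other whitespace, so a sentence that
    wraps across lines in the source file is still produced as one candidate.
    """
    words = text.split()
    for start in range(len(words)):
        for count in range(min_words, max_words + 1):
            if start + count > len(words):
                break
            yield " ".join(words[start:start + count])


book = ("Mary had a little lamb who's fleece was white\n"
        "as snow. Mary then decided one day to make lamb\n"
        "stew and so Mary no longer had a little lamb.")

# One candidate per line, ready to be piped into a password cracker.
for candidate in word_windows(book, min_words=3, max_words=12):
    print(candidate)
```

The number of candidates grows roughly as (words in the source) × (largest window), so a whole library produces a very large list, and that is before any mangling of case or punctuation.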

2

Assume the hash algorithm and method (salted or not) are the same. Assume you can build a hash list and use a GPU cracker like oclHashcat against it (I have seen benchmarks claiming 59 Gps for guessing; that's billion, as in 55,077,000,000 guesses per second. That covers the entire space of an 8-char password using 76 possible characters in under 6 hours, or about 3 hours on average.). Assume you have 3,000 common words in use. Assume that 300 of those are verbs (this is based on Internet-published lexical analysis; ymmv). Assume that verbs normally occur in the first three positions of a sentence. Assume common words like "to, two, for, four, one, won, five," etc. are swappable with their numerical counterparts, and assume your punctuation will be spaces between words and periods, exclamation marks, and question marks (or nothing) at the end. Assume no capitalization except the first letter. Assume five-word sentences (a nice average; even security-minded people seldom go higher than 12, in my guess).

You can crack that offline in about two months by using the words as chunks.
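For reference, the 8-character figure quoted above can be reproduced with a few lines. This is only arithmetic on the numbers given (76 symbols, 8 positions, 55,077,000,000 guesses per second); the function name is made up here. At that rate the whole space takes a little under six hours, and the expected time to hit one particular password is about half of that, roughly three hours.

```python
def exhaustive_search_hours(symbols: int, length: int, guesses_per_second: float) -> float:
    """Hours needed to try every string of `length` characters over `symbols` choices."""
    return symbols ** length / guesses_per_second / 3600


full_space_hours = exhaustive_search_hours(76, 8, 55_077_000_000)
average_hours = full_space_hours / 2  # on average the hit comes halfway through

print(f"full space: {full_space_hours:.1f} h, average: {average_hours:.1f} h")
```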

But, sentences chosen with names, uncommon words, intentional misspellings, symbols other than space, randomized capitalization, or nonsense words would be theoretically more secure because the entropy guidelines would break my assumptions.

This is all theory, of course. Many people to whom you give your password still don't hash them. Many still do not use a salt. Many others use proven insecure hashing algorithms. Some technologies don't require a hash be broken in order for it to be abused. Even secure providers fail to protect password hint answers or other channels into the accounts. And many good providers who genuinely try still even use bad captcha to "human limit" guessing.

So, I'd say: non-reuse (of hints, answers, or passwords) is more important, length does matter, and pressure your password custodians to use better practices. It's a better use of your time than a debate like this.

1
  • 1
    Assume that the "OP" is really interested in security and does all the things right, then it is a good question. Commented Oct 17, 2012 at 14:56
2

How about passphrases generated with diceware? From the discussion there, the argument is made that 5 or 6 words chosen at random from a list of 7776 words (the key is that they be truly random) have as much entropy as the numbers from 5 or 6 rolls of 5 dice.
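The mechanics can be sketched as follows. This is an illustration, not the official diceware procedure: the software dice (via Python's `secrets` module) stand in for physical dice, the word list is a placeholder you would have to supply yourself, and all function names are made up here. The list size 7776 is 6^5, so five dice pick exactly one word.

```python
import math
import secrets

DICE_PER_WORD = 5
LIST_SIZE = 6 ** DICE_PER_WORD  # 7776 entries, one per outcome of five dice


def roll_die() -> int:
    """One fair die roll (1..6) from the OS random source; real dice work too."""
    return secrets.randbelow(6) + 1


def rolls_to_index(rolls) -> int:
    """Read the rolls as a base-6 number: 1-1-1-1-1 -> 0, 6-6-6-6-6 -> 7775."""
    index = 0
    for roll in rolls:
        index = index * 6 + (roll - 1)
    return index


def diceware_passphrase(wordlist, num_words: int = 5) -> str:
    """Pick num_words entries from a 7776-word list, five dice per word."""
    if len(wordlist) != LIST_SIZE:
        raise ValueError("word list must have exactly 7776 entries")
    picked = []
    for _ in range(num_words):
        rolls = [roll_die() for _ in range(DICE_PER_WORD)]
        picked.append(wordlist[rolls_to_index(rolls)])
    return " ".join(picked)


def entropy_bits(num_words: int) -> float:
    """Each word contributes log2(7776), about 12.9 bits."""
    return num_words * math.log2(LIST_SIZE)


# Placeholder list for demonstration only; substitute a real 7776-word list.
placeholder_words = [f"word{i}" for i in range(LIST_SIZE)]
print(diceware_passphrase(placeholder_words, 5))
print(f"{entropy_bits(5):.1f} bits for 5 words, {entropy_bits(6):.1f} bits for 6")
```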

1

If the initial letters of your sentence result in a password long enough to withstand almost all brute force attacks, in my opinion using the whole sentence can actually reduce the password security. My reasons are:

  1. The password might be truncated, in the worst case I know resulting in 8 characters with the low entropy of English.

  2. You will give observers more chance to observe you successfully while typing the password. Due to the redundancy of English, single missed keys in an observation are easily corrected. If you are like me, you will also enter your password incorrectly more often due to the sheer length. This again results in bonus observations for the attacker.

  3. This is far-fetched, but anyway: The attacker might be someone who was able to observe you typing non-passwords for a long time. He can now know the rhythm with which you type English words. He cannot know the rhythm of a seemingly random password, as you would not type it in a non-secure (i.e. non-password-dialog) context. So, using English sentences could make an attack using solely a microphone possible. Or perhaps just observing the timing of the network packets transporting your password could be enough. This is more difficult than using a microphone because of things like Nagle's Algorithm in TCP and the need for some kind of network eavesdropping capability. It could even be impossible if the password is only transmitted as a whole or, as recommended, as a hash.

1

Password entropy, and the complexity of the keyspace are really interesting to calculate, and are useful in understanding the password hash's likely survival duration under a brute force attack. While that is only one type of attack, and entropy is only one aspect of what may or may not be a "strong" password, it merits exploration. Here are a couple of calculations I did recently. Full disclosure: the following comes from my own blog, so if the moderators want to reject this answer, I won't take offense.

It is weird to think that a 7 character all lowercase password is better than a 5 character alphanumeric with punctuation:

95^5 = 7,737,809,375
26^7 = 8,031,810,176

The seven-character lowercase password has slightly more (294,000,801 more) possible combinations. But if you increase each type of password by one more character, the lowercase password has about three and a half times fewer combinations.

95^6 = 735,091,890,625
26^8 = 208,827,064,576

I wonder if they go back and forth like that as you add more characters to your passwords. I wonder if the hash cracking times are correspondent to this.
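The back-and-forth question can be checked directly. The sketch below assumes the same setup as above, a lowercase password that is always two characters longer than the full-character-set one; the names are made up for this illustration. Under that assumption the lead does not alternate: lowercase is ahead up to 5 versus 7 characters, and from 6 versus 8 onwards the 95-symbol password stays ahead, because every added character multiplies one side by 95 and the other by only 26.

```python
def keyspace(symbols: int, length: int) -> int:
    """Number of possible passwords of the given length."""
    return symbols ** length


def compare(max_length: int = 12):
    """Rows of (full_len, lower_len, 95^full_len, 26^lower_len), lower_len = full_len + 2."""
    rows = []
    for full_len in range(1, max_length + 1):
        lower_len = full_len + 2
        rows.append((full_len, lower_len,
                     keyspace(95, full_len), keyspace(26, lower_len)))
    return rows


for full_len, lower_len, full_space, lower_space in compare():
    leader = "95-symbol" if full_space > lower_space else "lowercase"
    print(f"95^{full_len} vs 26^{lower_len}: {leader} is larger")
```

This only compares keyspace sizes under brute force; it says nothing about dictionary attacks or about how fast a given hash can be tested.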

0

(size of the password character set) ^ (password length) does not always equal the number of possible combinations,

because the security of a password is greatly affected by the algorithms that process it.

For example, the Windows operating system stores passwords of 14 characters or fewer with the LM hash algorithm. The LM hash algorithm pads each password of fewer than 14 characters with null characters to extend it to 14 characters. The result is then split into two 7-character parts, each of which is encrypted separately. Along with a predictable parity value, the results are hashed, concatenated and stored. So from the attacker's point of view the worst-case number of combinations is: (size of the password character set) ^ (password length / 2) + (size of the password character set) ^ (password length / 2)

This is very much less than the calculation above. I hope this gives you another point of view when deciding whether a password is strong enough or not.
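To put numbers on the difference, here is a small sketch of the two calculations. The character-set size of 95 is an assumption used only for illustration (LM also converts letters to uppercase, which shrinks the effective set further), and the function names are made up here.

```python
def naive_keyspace(charset_size: int, length: int) -> int:
    """Combinations if the whole password had to be guessed at once."""
    return charset_size ** length


def lm_split_keyspace(charset_size: int, length: int = 14, half: int = 7) -> int:
    """Combinations when the two 7-character halves can be attacked separately."""
    first_half = min(length, half)
    second_half = max(length - half, 0)
    # An empty second half costs a single check (charset_size ** 0 == 1).
    return charset_size ** first_half + charset_size ** second_half


naive = naive_keyspace(95, 14)
split = lm_split_keyspace(95, 14)
print(f"whole password: {naive:.2e}")
print(f"two halves:     {split:.2e}")
print(f"reduction:      {naive / split:.2e} times fewer guesses")
```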

1
  • Although obviously LM is deprecated and no longer used (if you are still using NT or XP in compatible mode, you'd best look at updating...)
    – Rory Alsop
    Commented Oct 17, 2012 at 14:30
-3

Using a sentence instead of a single word gives you a passphrase, and as it will be longer than a password it is inherently harder to crack than a password from a brute-force perspective. BUT, without any added complexity, passphrases aren't as strong as you might think, as all an attacker needs to do is use words as building blocks instead of letters. Take the password Kafka57* as an example. It is complex and has 8 characters. Then take the passphrase "kafka wrote some great plays but strange", which is 40 characters in 7 words separated by spaces. It may seem stronger, but I would consider it weaker than the 8-character password in some ways:

  • It is based on dictionary words. If an attacker assumes that the passphrase will be structured that way it's a simple matter to build a rainbow table based on long strings of dictionary words. OK, the English dictionary is much bigger than a character set so there's more possibilities, but I would strongly suspect that an attacker could use a 500 word dictionary with great success.
  • It's low complexity. Without complexity of character set an attacker could use plain dictionary words.
  • It follows a predictable pattern. It's all lower-case and separated by spaces

I'm not saying it would be trivial to break the passphrase, it certainly wouldn't be, but there's ways to make it much, much harder. You need to use a non-standard or no word separator, and increase the character set. Turn "kafka wrote some great plays but strange" into "Kafk4_wr0t35umGr8 plays But*str4ng3" and you have something really complex and extremely strong, but complicated to type. Something like "Kafka wr0t Gr8-pl4yz!" is a good middle-ground as it is complex, doesn't have an obvious pattern, and is reasonably easy to remember and type.

3
  • Suppose you have 52 upper- and lower-case letters and 10 digits. Then there are 62^8 possibilities to try to crack Kafka57. But for your sentence you can assume that you need, say, a dictionary with 5000 words. Then you need to try 5000^7 combinations, which is much more than 62^8. By the way, Kafka57 seems to be pretty easy to attack by a wordlist attack with added numbers...
    – student
    Commented Oct 17, 2012 at 11:52
  • @student, that's not the point I was making, which is that by using a set of dictionary words without modification you are simply replacing one building block with another. The important thing is to increase the complexity of the words in the passphrase, otherwise you can use a dictionary to attack it.
    – GdD
    Commented Oct 17, 2012 at 11:58
  • 4
    Wrong. Do the math — by changing from characters to words, the alphabet becomes a lot larger. See XKCD 936 and XKCD #936: Short complex password, or long dictionary passphrase? Commented Oct 17, 2012 at 12:59
