dr jimbob

EDIT: I just noticed you listed £ as one of your common symbols. Unfortunately, that's not a standard ASCII character. In ISO-8859-1 (Latin-1) it's the single byte A3. In UTF-8 it's the two bytes C2 A3. In UTF-7 it's the ASCII characters +AKM-. In UTF-16 it's 00 A3 (big-endian; A3 00 in little-endian). Because the hash function operates on bytes, the same password can produce different hashes if it is not encoded consistently everywhere. Granted, the application should be able to handle encoding properly, but it could fail on some subset of devices. Furthermore, the character may not be available on foreign keyboards.
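To make the encoding point concrete, here is a minimal Python sketch. The sample password and the use of plain SHA-256 are my own illustration choices (a real system should use a dedicated password-hashing function); the point is only that the same text yields different bytes, and therefore different hashes, under each encoding.

```python
import hashlib

# Hypothetical sample password containing the pound sign (U+00A3).
password = "pa\u00a3sword"

digests = {}
for codec in ("latin-1", "utf-8", "utf-7", "utf-16-be"):
    raw = password.encode(codec)
    # SHA-256 is used here only to show that the digests differ;
    # it is not a suitable password hash on its own.
    digests[codec] = hashlib.sha256(raw).hexdigest()
    print(codec, raw.hex(), digests[codec][:16])
```

If the login form on one device submits Latin-1 and another submits UTF-8, the stored hash will only match one of them, so the application has to pick one encoding and enforce it.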

I agree with your analysis that allowing symbols adds security, but generally not that much, especially compared to moving to slightly longer passwords (assuming the password consists of completely randomly chosen characters). Using any of the 95 printable ASCII characters:

0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|} ~

(a few more if you count characters like tab or line break as printable), an 8-character password has 95^8 ~ 6.6 x 10^15 possibilities, while with only (case-sensitive) letters and numbers an 8-character password has 62^8 ~ 2.2 x 10^14 possibilities, which is about 30 times weaker.

However, a 9-character password with just numbers+lowercase+uppercase is about twice as strong as an 8-character password allowing special characters (62^9 ~ 1.4 x 10^16). Thus, unless there is a hard limit on the length of a password (and there really never needs to be one, at least for passwords shorter than a few hundred characters), it is easy to move to slightly longer passwords even with a limited character set.
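The arithmetic is easy to check yourself. A quick Python sketch (the function name `keyspace` is just my label for alphabet size raised to the password length, assuming every character is chosen uniformly at random):

```python
import math

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords for a uniformly random choice."""
    return alphabet_size ** length

full_8 = keyspace(95, 8)    # all printable ASCII, 8 characters
alnum_8 = keyspace(62, 8)   # letters and digits only, 8 characters
alnum_9 = keyspace(62, 9)   # letters and digits only, 9 characters

print(f"95^8 = {full_8:.2e}, 62^8 = {alnum_8:.2e}, 62^9 = {alnum_9:.2e}")
print(f"symbols gain at length 8: x{full_8 / alnum_8:.1f}")
print(f"one extra character instead: x{alnum_9 / full_8:.2f}")
print(f"bits: {math.log2(full_8):.1f} vs {math.log2(alnum_9):.1f}")
```

One extra alphanumeric character multiplies the search space by 62, which more than makes up for the roughly 30x gained from the extra 33 symbols.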

The biggest potential security concern is that the site's developers believe allowing special characters in passwords could cause problems in their application or database. There is a legitimate reason to exclude non-printable ASCII characters (e.g., ASCII control characters such as NUL (\0) or backspace (\b)) that could cause problems, but well-designed applications should be able to handle regular special characters like ' or - without being vulnerable to injection attacks. An application already has to handle these characters because they appear, for example, in names with apostrophes or hyphens (e.g., Conan O'Brien, Daniel Day-Lewis). Furthermore, since a plaintext password should never be saved to the database or given back to the user, and should instead be hashed immediately, allowing printable ASCII special characters shouldn't matter.
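As a rough sketch of what "well-designed" means here, the following uses Python's standard `sqlite3` and `hashlib` modules. The table layout, the function names, and the iteration count are all assumptions I made up for the example, not a recommendation for a specific configuration. The user name goes through a bound parameter, and the password is hashed before it gets anywhere near the database, so quotes and hyphens in either one are harmless.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 100_000  # assumed work factor for the demo; tune for real use

def hash_password(password: str, salt: bytes) -> bytes:
    # Encode consistently, then run a salted, slow key-derivation function.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")

def register(name: str, password: str) -> None:
    salt = os.urandom(16)
    # Bound parameters (?) keep quotes in the name from being read as SQL.
    conn.execute("INSERT INTO users VALUES (?, ?, ?)",
                 (name, salt, hash_password(password, salt)))

def verify(name: str, password: str) -> bool:
    row = conn.execute("SELECT salt, pw_hash FROM users WHERE name = ?",
                       (name,)).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, hash_password(password, salt))

register("Conan O'Brien", "it's \"quoted\"'; DROP TABLE users;--")
print(verify("Conan O'Brien", "it's \"quoted\"'; DROP TABLE users;--"))
```

Only the salt and the derived hash are stored, so the special characters in the password never reach the SQL layer as text at all.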

Granted, there are some usability concerns with non-ASCII characters from the wider Unicode range, and it may be a good idea to either forbid these characters or normalize them in a standard way. Hash functions typically expect a string of bytes, and passwords with characters beyond ASCII can be encoded in different ways at the byte level (e.g., UTF-8, UTF-7, UTF-16, ISO-8859-1). Furthermore, besides different encodings (which you could keep consistent at the application level), you also have to worry about identical-looking letters having different values in Unicode. For example, the character Å is Unicode U+00C5, but this identical-looking character Å is Unicode U+212B, while this Å is actually two characters: an ASCII A followed by the combining character U+030A, which adds a ring over the A.
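Normalizing "in a standard way" could look something like this in Python, using the standard `unicodedata` module. The helper name `normalize_password` and the choice of NFC are my assumptions for the sketch; some systems use a compatibility form such as NFKC instead. Whichever form you choose, apply it both when the password is set and when it is checked.

```python
import unicodedata

precomposed = "\u00c5"   # LATIN CAPITAL LETTER A WITH RING ABOVE
angstrom = "\u212b"      # ANGSTROM SIGN
combining = "A\u030a"    # 'A' followed by COMBINING RING ABOVE

def normalize_password(pw: str) -> bytes:
    """Map equivalent Unicode spellings to one byte sequence before hashing."""
    return unicodedata.normalize("NFC", pw).encode("utf-8")

# The three strings look identical but compare unequal as typed...
print(precomposed == angstrom, precomposed == combining)

# ...and become the same bytes once normalized.
print({normalize_password(s) for s in (precomposed, angstrom, combining)})
```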

There also may be usability issues with characters like ' or " that in some applications/platforms may be converted to smart quotes ‘’“” (though this should never be done in a password context).

Finally, there's one additional rationale for having unique password rules: making it harder for users to have one remembered password that is reused everywhere, which is a horrific security practice. If the user's "normal" password doesn't meet one site's unique rules, then when that normal password is compromised on some random other site (one that, say, stores passwords in plaintext), their account at the site with the unique rules isn't compromised.