
Given:

  • H is a good hash function with block size L.
  • K is a key of length >= L (recommended by RFC 2104).
  • Khex and Kbase64 are ASCII encodings of K.

In the HMAC algorithm, is there a good reason to prefer K over the other encodings?

Obviously the HMAC algorithm would consider these distinct keys of different lengths, and keys over length L are hashed, but the way HMAC works I don't see why using K would increase the security of the MAC.
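
To make the premise concrete, here is a small Python sketch (the key and message are arbitrary placeholders) showing that HMAC has no notion of "encoding" and simply treats the three representations as distinct keys:

```python
import base64
import hashlib
import hmac
import os

K = os.urandom(32)              # the raw key
Khex = K.hex().encode("ascii")  # 64-byte ASCII hex encoding
Kbase64 = base64.b64encode(K)   # 44-byte ASCII base64 encoding

msg = b"an arbitrary message"

# Each byte string is just a different key, so each yields a different MAC.
macs = {hmac.new(k, msg, hashlib.sha256).hexdigest()
        for k in (K, Khex, Kbase64)}
assert len(macs) == 3  # three distinct keys, three distinct MACs
```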

  • IMO using text representations of keys is silly, but it doesn't cause any security issues. – Commented Jun 9, 2013 at 7:39
  • From @fgrieu's answer I see I was naively conflating block length (B) with output size (L) of the hash. So now I see a key of length B could theoretically provide more entropy than the hashed key (length L) resulting from the encoding operation. – Steve Clay, Commented Jun 10, 2013 at 14:05

2 Answers


Yes, using K might be safer than using Khex or Kbase64, for a down-to-earth reason distinct from the valid theoretical reason pointed out in Henrick Hellström's answer.

RFC 2104 says:

The authentication key K can be of any length up to B, the block length of the hash function.
Applications that use keys longer than B bytes will first hash the key using H and then use the resultant L byte string as the actual key to HMAC.

If K is B bytes (or somewhat less), it is used as the effective key to HMAC, but Khex or Kbase64, being longer than B bytes, will first be hashed down to L bytes. Knowledge of that L-byte value H(Khex) or H(Kbase64) allows forgery! Whereas, arguably, when K is used directly, the entropy entering HMAC is the length of K or 2L bytes, whichever is smaller (notice that K XOR opad and K XOR ipad are separately compressed to L bytes during the first round of each of the two distinct invocations of H).

Update: I initially got B and L mixed up in the first sentence of the above paragraph; that's fixed, thanks to Henrick Hellström's comment. And I missed the reduction of entropy to at most 2L bytes; that's fixed too, thanks to CodesInChaos's comment.

When H is MD5, B = 64 bytes and L = 16 bytes. When K is a random 64-byte key, that's 512 bits of entropy for K, of which arguably 256 bits are used, versus 128 bits when using Khex or Kbase64.
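
This is easy to check with Python's hmac module, which implements RFC 2104's key handling (a sketch; the key and message are arbitrary):

```python
import hashlib
import hmac
import os

B, L = 64, 16                   # MD5 block size and output size, in bytes
K = os.urandom(B)               # 64-byte K: used directly as the HMAC key
Khex = K.hex().encode("ascii")  # 128 bytes > B: hashed down to L bytes first

msg = b"message to authenticate"
tag = hmac.new(Khex, msg, hashlib.md5).digest()

# Anyone who learns only the 16-byte value H(Khex) can compute the same tag:
forged = hmac.new(hashlib.md5(Khex).digest(), msg, hashlib.md5).digest()
assert forged == tag
```

So with Khex as the key, the effective key material is only L = 16 bytes, no matter how much entropy K originally had.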

While one 128-bit key seems safe for one or two decades even accounting for foreseeable technology improvements (depending on the confidence level required), there are plausible applications of HMAC-MD5 where this matters right now: should the adversary be able to obtain the HMAC-MD5 of the same message under $n=2^{30}$ random keys, a brute-force attack using a feasible amount of memory is expected to find one of the keys after only $2^{130}/n=2^{100}$ MD5 rounds, and that's uncomfortably close to attackable with a sizable chance of success.

  • You can't get 512 bits of entropy with HMAC-MD5. The two keys get compressed before use, so they are reduced to 128 bits each. In some situations that might give a combined strength of 256 bits, but in others it won't. There might be meet-in-the-middle attacks, or collisions after the first step found with $2^{64}$ queries, which would enable a $2^{128}$ brute-force search against the first key while disregarding the second. In short, I wouldn't rely on HMAC having a larger security level than the hash size. – Commented Jun 10, 2013 at 10:07
  • You got the parameters L and B mixed up in the third paragraph. B is the block length of the to-be-hashed input before it is passed to the compression function (64 bytes for MD5), and L is the hash output size (16 bytes for MD5). – Commented Jun 10, 2013 at 10:26
  • @CodesInChaos: you are 100% right! – fgrieu, Commented Jun 10, 2013 at 12:36
  • From this answer follows the question: wouldn't it be wise to truncate any key down to size B rather than let HMAC hash it down to L? – Steve Clay, Commented Jun 10, 2013 at 14:13
  • @Steve Clay: truncating a fully random key to B bytes is fine, but when that key is low-entropy (e.g. random hexadecimal encoded in UTF-32, or worse, a password), losing entropy is unwise. A fine solution would have been that when the key of HMAC exceeds B bytes, whatever exceeds B−L bytes is replaced by the hash of that, making the key exactly B bytes. But that's no longer HMAC. – fgrieu, Commented Jun 10, 2013 at 14:33

The theoretical problem with using text-encoded keys is that it doesn't necessarily conform to how keys are assumed to be formatted in security proofs such as this one. If you are using HMAC with an underlying hash function that is still believed to meet the requirements of a cryptographically secure hash function, such as the SHA-2 family of hashes, this shouldn't be a problem: the collision resistance assumption on those hashes should be enough to make your alternative scheme no less secure than conventional HMAC.

However, technically, using keys with special formatting might be a problem, e.g. if you are using HMAC-h with a broken underlying hash function h, such as SHA-1 or MD5. Both of these algorithms are built from simple operations: bit-wise logical operators, addition modulo $2^{32}$, and shifts and rotates. Considering that both hexadecimal encoding and base64 encoding result in octet strings with the high bit of each octet consistently zero, it wouldn't be completely unrealistic for HMAC-h with $K_{hex}$ or $K_{base64}$ to get broken before (or after) HMAC-h with $K$ selected uniformly from $\{0,1\}^k$.
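
That structural bias is easy to verify: every octet of a hex or base64 encoding is a printable ASCII character, so its high bit is always clear (a quick Python check over a random key):

```python
import base64
import os

K = os.urandom(32)
Khex = K.hex().encode("ascii")
Kbase64 = base64.b64encode(K)

# Hex uses 0-9a-f; base64 uses A-Z, a-z, 0-9, +, /, =.
# All of these are below 0x80, so the high bit of every key octet is zero.
assert all(b < 0x80 for b in Khex)
assert all(b < 0x80 for b in Kbase64)
```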
