
I am configuring a non-sticky, load-balanced cluster of HTTPS servers. To enable TLS session resumption when a returning client reconnects to a different server in the cluster, I will configure shared session ticket keys across all servers in the cluster, as per RFC 5077.

In the RFC, section "5.5. Ticket Protection Key Management" recommends that:

The keys should be changed regularly.

My research so far has not revealed any consensus on what "regularly" should mean. A few references mention daily rotation, but without justification. Further, it appears that ticket keys on standalone (i.e. non-clustered) servers in popular implementations (e.g. Apache, nginx) are only rotated on process restart, which could be very infrequent.

So, my questions are essentially:

  1. Is there a rotation schedule for ticket keys that is considered secure? How is that schedule derived?
  2. If there is no recommended schedule, are there at least other aspects of TLS behaviour that define sensible upper and lower bounds for rotation frequency (eg client session cache times, certificate validity period)?

2 Answers

7

RFC 4346 is the source of the "daily" recommendation, although it speaks of session ID lifetimes rather than ticket keys:

An upper limit of 24 hours is suggested for session ID lifetimes, since an attacker who obtains a master_secret may be able to impersonate the compromised party until the corresponding session ID is retired.

Lacking any better guidance, you're probably fine going with 24 hours. I'm not aware of any research or experience suggesting that shorter times are necessary or appropriate for current traffic.

You can also go a lot lower. Session resumption can be a big deal when a client is hammering you with hundreds of connections within a few minutes, but having to do a full handshake once an hour, say, is not onerous to the client or to the server. Resumption saves you the CPU hit of constantly doing full handshakes, and there's a huge gap between "every single connection" and once an hour, or every 30 minutes, or even every 5 minutes in some cases.

To specifically answer your points:

  1. 24 hours is the upper limit because it's specified in RFC 4346. (Obviously, "because it's in the RFC" doesn't reflect real-world pressures).
  2. Within the 24 hour window, only your traffic can guide you. How often do the same clients reconnect? Do they take advantage of resumption if available? How long do their sessions last on average?
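One rough way to answer "how often do the same clients reconnect?" is to look at the gaps between consecutive connections from the same client and see what fraction would fall inside a candidate ticket lifetime. The sketch below assumes you can extract (client identifier, Unix timestamp) pairs from your own logs; the function names and the choice of client identifier are illustrative, and a gap inside the lifetime only means resumption was possible, not that the client attempted it.

```python
from collections import defaultdict


def reconnect_gaps(events):
    """Return the gaps (seconds) between consecutive connections per client.

    events: iterable of (client_id, unix_timestamp) pairs, in any order.
    """
    by_client = defaultdict(list)
    for client_id, timestamp in events:
        by_client[client_id].append(timestamp)

    gaps = []
    for stamps in by_client.values():
        stamps.sort()
        # Pair each connection with the next one from the same client.
        gaps.extend(later - earlier for earlier, later in zip(stamps, stamps[1:]))
    return gaps


def coverage(gaps, lifetime_seconds):
    """Fraction of reconnects arriving within lifetime_seconds of the previous one."""
    if not gaps:
        return 0.0
    return sum(1 for gap in gaps if gap <= lifetime_seconds) / len(gaps)


if __name__ == "__main__":
    sample = [("a", 0), ("a", 60), ("a", 7200), ("b", 10), ("b", 100000)]
    gaps = reconnect_gaps(sample)
    for hours in (1, 12, 24):
        print(hours, "h:", round(coverage(gaps, hours * 3600), 2))
```

Running this over a day or two of real traffic for several candidate lifetimes shows where the curve flattens, which is the point past which a longer key lifetime buys little extra resumption.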

Certificate validity periods don't come into it (as 24 hours is far below their concern horizon). Client session cache times are interesting, but alas, probably not available to you - which goes back to what I said about seeing how often your clients take advantage of resumption when available.

My personal advice: try periods from 1 to 12 hours. You'll probably find they all give you the performance boost you need, with diminishing returns (in resumption use) showing up at some point. Find a way to log resumed sessions versus full handshakes, and build your case on your own traffic's statistics.
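As a minimal sketch of the logging idea, suppose the server writes one access-log line per request and the last whitespace-separated field is a resumption flag: "r" for a resumed session and "." for a full handshake. nginx exposes such a flag as `$ssl_session_reused`; placing it last in the log format is an assumption made here for simplicity, and the function name is illustrative.

```python
def resumption_rate(lines):
    """Return the fraction of logged connections that resumed a TLS session.

    Assumes the last field of each line is "r" (resumed) or "." (full
    handshake). Lines that are blank or end in anything else are ignored.
    """
    reused = 0
    total = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        flag = fields[-1]
        if flag == "r":
            reused += 1
            total += 1
        elif flag == ".":
            total += 1
    return reused / total if total else 0.0


if __name__ == "__main__":
    sample = [
        "203.0.113.7 GET /index.html r",
        "203.0.113.7 GET /style.css .",
        "198.51.100.2 GET /index.html r",
    ]
    print(f"resumption rate: {resumption_rate(sample):.2f}")
```

Comparing this rate across different rotation periods gives the traffic-based evidence described above.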

  • Thank you for a great, detailed answer. I couldn't agree more regarding the measurement of the actual performance benefits. Unfortunately I find it harder to measure the actual security trade off. :) Commented Aug 20, 2015 at 2:02
5

At CloudFlare, we rotate them every hour.*

With respect to your second question, you need to consider user agent (UA) support. With long-lived session tickets, e.g. 24 hours, you're far less likely to encounter broken client-side implementations. With shorter ones, as we've discovered (with some terrific assistance from customers and users alike), you start to see operating systems that have incomplete or incorrect RFC 5077 support.

As this bug describes, both IE and .NET balk when their underlying crypto library (SChannel) tries to process the NewSessionTicket message sent by our edge. We've been in close contact with Microsoft for a fix, but sometimes you have to work around situations like this when you aggressively implement newish security protocols and procedures.

In summary, go as low as you want, but keep an eye out for client-side issues. If I had to guess, they could be more widespread than just Microsoft.
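A common way to rotate frequently without invalidating tickets issued just before a rotation is to keep the previous key or keys around for decryption only. The following is a minimal sketch of that idea, not a description of any particular vendor's implementation: the class name, the `keep` parameter, and the local key generation are all assumptions for illustration. The 16-byte key name follows the ticket layout recommended in RFC 5077. In a cluster, the key material would have to come from a shared source so that every server holds the same ring.

```python
import secrets
from collections import deque


class TicketKeyRing:
    """Holds the newest ticket key for encryption plus older ones for decryption."""

    def __init__(self, keep=2):
        if keep < 1:
            raise ValueError("keep must be at least 1")
        # Newest key sits at index 0; deque(maxlen=...) drops the oldest.
        self._keys = deque(maxlen=keep)
        self.rotate()

    def rotate(self, name=None, key=None):
        """Install a new encryption key; the oldest key falls off the ring.

        Pass name/key when the material is distributed from a shared source;
        otherwise random values are generated locally (single-server case).
        """
        name = name if name is not None else secrets.token_bytes(16)
        key = key if key is not None else secrets.token_bytes(32)
        self._keys.appendleft((name, key))
        return name

    def current(self):
        """Return (key_name, key) used to encrypt newly issued tickets."""
        return self._keys[0]

    def lookup(self, key_name):
        """Return the key for a presented ticket's key_name, or None if retired."""
        for name, key in self._keys:
            if name == key_name:
                return key
        return None


if __name__ == "__main__":
    ring = TicketKeyRing(keep=2)
    old_name, _ = ring.current()
    ring.rotate()
    print("old key still accepted:", ring.lookup(old_name) is not None)
    ring.rotate()
    print("old key still accepted:", ring.lookup(old_name) is not None)
```

With hourly rotation and `keep=2`, a ticket is accepted for between one and two hours after it is issued, so the rotation interval multiplied by `keep` is the effective upper bound on how long a ticket stays usable.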


* The link discusses our Keyless SSL implementation, but the same holds for our standard HTTPS termination.
