Punctuation: the contraction of "it is" requires an apostrophe.
Source Link
sawdust

TLDR: It's impossible to put a number on average hard drive life, 'cause it's too darned complex.

There's no real measure of average life since it deeply depends on a whole load of different factors. It's a little like asking how long a piece of string is. For a specific drive, a datasheet may have some relevant information, though it's still a rough indication that may need to be interpreted with a pinch of salt and tea leaves.

To start with, a single drive failure when you have one drive is a tragedy; losing one drive of a RAIDed array that's part of a cluster of arrays is a statistic. One cannot look at a specific drive and say "this will certainly last a decade". One can say "this drive ought to last 5 years" and plan to replace it on schedule.

I'd also note that Backblaze and Google, and most of the industry, are concerned with average failure rates and reliability over the lifespan of a drive under specific conditions. They want to buy a truckload of drives, run them as cheaply and efficiently as possible, and not really worry about them until planned replacement. It's even better to know "these are the signs a drive will die" than to have drives die unexpectedly, and to be able to balance the cost of cooling a place against the hardware cost of toasty hard drives frying.

Practically speaking, hard drives are commodity devices - and most places don't actually keep track of reliability. It's only recently (relatively!) that large companies started deploying gigantic fleets of these drives and started sharing their reliability information.
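
As a rough illustration of the kind of fleet statistic those reports deal in, a failure rate is commonly expressed as failures per drive-year of operation. The sketch below assumes that definition; the function name and the fleet numbers are made up for the example, not taken from any published report.

```python
def annualized_failure_rate(failures: int, drive_days: float) -> float:
    """Failures per drive-year of operation, returned as a fraction.

    drive_days is the total number of days all drives in the fleet
    were in service (1000 drives running 90 days = 90,000 drive-days).
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365.0
    return failures / drive_years


# Hypothetical fleet: 1000 drives observed for 90 days, 5 of them failed.
rate = annualized_failure_rate(failures=5, drive_days=1000 * 90)
print(f"Fleet AFR: {rate:.2%}")  # about 2.03% per drive-year
```

Counting drive-days rather than drives is what makes the number comparable between a model deployed all year and one added last month. It still says nothing about when any one drive will die.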

There's a good reason there's a focus on predictive failure analysis and on picking reliable models, rather than on long-term reliability. Simply put, all hardware dies, and it's 'cheaper' in terms of manpower, downtime, and even in some cases accounting, to replace drives before they tend to die of mechanical failure.

Specific drives may have issues - the Seagate 7200.11, for example, was known for randomly dying due to bad firmware, which was fixed later. Other drive brands and models may have ridiculous levels of reliability. I've literally never had an HGST desktop drive fail, ever.

You could look up the mean time to failure (MTTF) for the model - which should correlate to the average life of the drive, but modern literature seems to consider it a load of horse hockey. Seagate's switched to AFR (annualized failure rate) anyway.
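
To see why a datasheet MTTF doesn't read as "average life", it helps to convert one into a yearly failure probability. This is a minimal sketch assuming a constant failure rate (an exponential model), which ignores early-life and wear-out failures; the 1,000,000-hour figure is only an illustrative input, not a quote from any particular datasheet.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 power-on hours, assuming 24/7 operation


def afr_from_mttf(mttf_hours: float) -> float:
    """Annualized failure rate implied by an MTTF.

    Assumes a constant failure rate, so the chance a drive survives
    t hours is exp(-t / MTTF).
    """
    if mttf_hours <= 0:
        raise ValueError("mttf_hours must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mttf_hours)


# Illustrative value only: a 1,000,000-hour MTTF is about 114 years,
# yet it corresponds to an AFR of roughly 0.87% per year.
afr = afr_from_mttf(1_000_000)
print(f"AFR: {afr:.2%}")
print(f"Expected failures per year in a 1000-drive fleet: {1000 * afr:.1f}")
```

So the MTTF is a statement about failure rates across a large population during service life, not a promise that a single drive lasts a century.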

While looking this up - I came across this great set of slides by someone from WD. Not sure whether the associated lecture is anywhere online.

The slides give an excellent indication of the minimum reliability/lifetime that a major hard drive maker expects:

Avoid an un-manageable catastrophe midway (or beyond) through a product’s warranty life

The typical warranty for an enterprise device, and for older consumer hard drives, is 5 years. It's 3 years for newer drives. So, your hard drive maker assumes that their drives will not fail within the warranty period, 'cause it'll cost them money. As such, they assume you'd either accept the risk or replace the drive after that time.

The rest of the presentation is a good read, but I'm skipping most of the physics.

This is a simple little graphic showing all the elements involved in hard drive reliability, taken from the same set of slides:

[Image: elements involved in hard drive reliability, from the WD slides]

And while the classic bathtub curve is what people talk about with drive reliability, things like the actual duty cycle, when writes happen to a drive, and temperature matter, in addition to all these design and environmental factors. It's just too complex to guess.

Source Link
Journeyman Geek