
Drive failure should be taken into consideration here.

Imagine for a second that each of our drives has a 1/1000 chance of failing on any particular day. Imagine then that we have 20 drives in each of our 3 arrays.

The chance of at least one drive failing in an array on a given day is therefore roughly 20/1000 = 1/50, and a single failure is enough to kill a RAID 0 array. The chance of two drives failing within the same array, which is what it takes to kill a RAID 5 array, is something close to 20/1000 * 20/1000 / 2 = 200/1000000 = 1/5000. So by switching from RAID 0 to RAID 5 we're already significantly less likely to lose one of our arrays.
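As a rough check on the arithmetic above, here is a minimal Python sketch that computes the same two probabilities from the binomial distribution instead of the shortcut. It assumes drive failures are independent and uses the made-up 1/1000 daily rate from this answer; the names `p_at_least`, `P_DRIVE` and `DRIVES` are purely illustrative.

```python
from math import comb


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent drives fail,
    each failing with probability p."""
    # 1 minus the probability that fewer than k drives fail.
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


P_DRIVE = 1 / 1000  # assumed daily failure chance per drive (illustrative)
DRIVES = 20         # drives per array, as in the example above

# RAID 0 is lost as soon as any one drive fails.
raid0_loss = p_at_least(1, DRIVES, P_DRIVE)
# RAID 5 is lost once two drives have failed.
raid5_loss = p_at_least(2, DRIVES, P_DRIVE)

print(f"RAID 0 array loss per day: about 1 in {1 / raid0_loss:.0f}")
print(f"RAID 5 array loss per day: about 1 in {1 / raid5_loss:.0f}")
```

The exact figures come out slightly lower than the 1/50 and 1/5000 estimates, but the gap between the two RAID levels stays at around two orders of magnitude.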

We can take this further: if the chance of a RAID 0 array failing on a given day is 1/50, then the chance of two such arrays failing on the same day is 1/(50*50) = 1/2500. That makes two identical RAID 0 arrays failing together twice as likely as one RAID 5 array failing (1/5000), assuming the same disk set. This roughly hundredfold increase in the per-array failure chance (1/50 against 1/5000) should concern you, as it massively increases the chance that more than one array fails at once.
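The multi-array case can be sketched the same way. This is only an illustration: it assumes the three arrays fail independently and takes the simplified per-array daily figures of 1/50 (RAID 0) and 1/5000 (RAID 5) from this answer as inputs. The helper name `p_two_or_more_of_three` is made up for this example.

```python
def p_two_or_more_of_three(p_array: float) -> float:
    """Chance that at least two of three independent arrays fail
    on the same day, given the per-array failure chance."""
    exactly_two = 3 * p_array**2 * (1 - p_array)
    all_three = p_array**3
    return exactly_two + all_three


P_RAID0 = 1 / 50    # simplified per-array figure from the text
P_RAID5 = 1 / 5000  # simplified per-array figure from the text

# Two specific arrays both failing, as in the 1/(50*50) estimate.
both_specific_raid0 = P_RAID0**2

# Any two (or all three) of the arrays failing on the same day.
multi_raid0 = p_two_or_more_of_three(P_RAID0)
multi_raid5 = p_two_or_more_of_three(P_RAID5)

print(f"Two given RAID 0 arrays: about 1 in {1 / both_specific_raid0:.0f}")
print(f"Any 2+ of 3 RAID 0 arrays: about 1 in {1 / multi_raid0:.0f}")
print(f"Any 2+ of 3 RAID 5 arrays: about 1 in {1 / multi_raid5:.0f}")
```

Counting any pair out of the three arrays, rather than one specific pair, makes the RAID 0 figure roughly three times worse than 1/2500.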

As these disks are likely to have a long lifetime, you can run the numbers as above with their real failure rates and see directly what effect this will have on reliability. If you can post the drive specifications I can add that calculation to this post. Whether the risk is then acceptable or not is for your organisation to decide.

Another item to note is that the likelihood of drive failure can be increased by using SSDs manufactured within the same batch (same factory, same time). If you are not careful, you could end up with all 3 nodes going down because of this issue.

Disclaimer: The above calculations have been simplified, but they are still relatively accurate.
