I have a small Ubuntu server running at home, with 2 hard drives. There are two software raids (raid1) on the disks, managed by mdadm, which I believe is irrelevant, but mentioning it anyway.

Both of the hard drives are Western Digital, and had been in use for around 2 years when one of them started making clicking noises and died. I figured that maybe it's natural after 2 years, so I bought a new one, and resynced the raid arrays. After about a month, the other drive also died.

I didn't get suspicious: since both drives had been bought at the same time, it's not that surprising to see them fail close to each other, so I bought another one.

So far, 2 old drives failed, and 2 brand new in the system. After one month, one of the new drives died. This is when it started getting suspicious. Since the PC was put together from some really old parts (think AthlonXP), I figured that maybe the motherboard's SATA controller is the culprit. Of course you can't switch parts easily in an old PC like this, so I bought a whole system, new MB, new CPU, new RAM. Took the just failed drive back, since it was under warranty, and got it replaced.

So it is up to 2 failed drives from the old ones, and 1 failed drive from the new ones. No problems, for 1 month. After that errors were creeping up again in /var/log/messages, and mdadm was reporting raid array failures. I started tearing my hair out. Everything is new in the system, it's up to the third brand new hard drive, it's simply not possible that all of the new drives that I bought were faulty.

Let's see what is still common... the cables. Okay, long shot, let's replace the SATA cables. Take hard drive back, smile to the guy at the counter and say that I'm really unlucky. He replaces the hard drive. I come home, one month passes and one of hard drives fails, again. I'm not joking.

Two of the brand new hard drives have failed. Maybe it's a bug in the OS. Let's see what the manufacturer's testing tool says. Download testing tool, burn it to a CD, reboot, leave hard drive testing overnight. Test says that the drive is faulty, and I should back up everything, if I still can. I don't know what's happening, but it does not look like a software problem, something is definitely thrashing the hard drives.

I should mention now that the whole system is in a shoebox. Since there is a load of "build your own IKEA case" stuff around, I thought there shouldn't be any problems throwing the thing in a box, and stuffing it away somewhere. The box is well ventilated, but I thought that just maybe the drives were overheating. There is no other possible answer to this. So I took the hard drive back, and got it replaced (for the 3rd time), and bought hard drive coolers.

And just now, I have heard the sound of doom. click click whizzzzzzzzz. SSH into the box:

You have new mail!
r 1
DegradedArrayEvent on /dev/md0 ...

dmesg output:

[47128.000051] ata3: lost interrupt (Status 0x50)
[47128.000097] end_request: I/O error, dev sda, sector 58588863
[47128.000134] md: super_written gets error=-5, uptodate=0
[48043.976054] ata3: lost interrupt (Status 0x50)
[48043.976086] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[48043.976132] ata3.00: cmd c8/00:18:bf:40:52/00:00:00:00:00/e1 tag 0 dma 12288 in
[48043.976135] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[48043.976208] ata3.00: status: { DRDY }
[48043.976241] ata3: soft resetting link
[48044.148446] ata3.00: configured for UDMA/133
[48044.148457] ata3.00: device reported invalid CHS sector 0
[48044.148477] ata3: EH complete
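For anyone triaging similar logs, here is a rough sketch of how the I/O errors above could be tallied per device. The regular expression and the name `count_io_errors` are illustrative assumptions, matched only against the `end_request: I/O error, dev sda, sector N` wording shown in this dmesg output; other kernel versions phrase the message differently.

```python
import re
from collections import Counter

# Matches lines such as:
# [47128.000097] end_request: I/O error, dev sda, sector 58588863
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")

def count_io_errors(log_text):
    """Return a Counter mapping device name -> number of I/O error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Three lines copied from the dmesg output above.
sample = """\
[47128.000051] ata3: lost interrupt (Status 0x50)
[47128.000097] end_request: I/O error, dev sda, sector 58588863
[47128.000134] md: super_written gets error=-5, uptodate=0
"""
print(count_io_errors(sample))
```

A count that keeps climbing on one device points at that drive or its port; errors spread across several devices would fit a shared cause such as power better.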


  1. No possibility of overheating
  2. 6 drives have failed, 4 of those have been brand new. I'm no longer sure whether the original two were actually faulty, or suffered the same fate as the new ones.
  3. There is nothing common in the system, apart from the OS which is Ubuntu Karmic now (started with Jaunty). New MB, new CPU, new RAM, new SATA cables.
  4. No, the little holes on the hard drive are not covered

I'm crying. Really. I don't have the face to return to the store now, it's not possible for 4 drives to fail in under 4 months.

A few ideas that I have been thinking: Is it possible that I mess up something when I partition and resync the drives? Can it be so bad that it physically wrecks the drive? (since the vendor supplied tool says that the drive is damaged) I do the partitioning with fdisk, and use the same block size for the raid1 partitions (I check the exact block sizes with fdisk -lu)

Is it possible that the Linux kernel or mdadm, or something is not compatible with this exact brand of hard drives, and thrashes them?

Is it possible that it may be the shoebox? Try placing it somewhere else? It's under a shelf now, so humidity is not a problem either. Is it possible that a normal PC case will solve my problem (I'm going to shoot myself then)? I will get a picture tomorrow.

Am I just simply cursed?

Any help or speculation is greatly appreciated.

Edit: The power strip is guarded against overvoltage.

Edit2: I moved house during these 4 months, so the possibility of the cause being "dirty" electricity in both places is very low.

Edit3: I have checked the voltages in the BIOS (couldn't borrow a multimeter), and they all seem correct. The biggest discrepancy is on the 12V rail, which is supplying 11.3V. Should I be worried about that?

Edit4: I put my desktop PC's PSU into the server. The BIOS reported much more accurate voltage readings, and it also successfully rebuilt the raid1 array, which took some 3-4 hours, so I feel a little positive now. Will get a new PSU tomorrow to test with that. Also, attaching a picture of the box: (disregard the 3rd drive)

picture of box of doom

  • 8
    why do you hate hard drives so much?! Commented Jan 21, 2010 at 1:14
  • 4
    It's the opposite, they hate me. With passion.
    – K. Norbert
    Commented Jan 21, 2010 at 8:35
  • 5
    WishCow, if drive testing occurred with a flaky power supply then it only reflects drive operation with flaky power. Many times hardware that fails with poor power supplied to it will work fine when supplied with proper power. Frankly, bad power constitutes a HUGE fraction of all hardware problems. My first action when I suspect a bad hardware component is to try a known-good power supply...
    – Richard T
    Commented Jan 21, 2010 at 15:45
  • 3
    A power strip will only protect you from overvoltage; it will not protect you against undervoltage. As indicated by others, a UPS (at least any worth its salt) will 'clean' dirty power because it will run from battery, instead of direct-from-the-outlet power. Commented Jan 21, 2010 at 21:04
  • 3
    Hi WishCow, you ground the components by connecting them all together with any conducting material. Traditionally, people use a "case", but you can use wires. The disk drives have lots of threaded holes for screws - these are perfect. The mother board may be a bit more tricky because it was intended to be grounded through the mounting studs in a case. They make "stand-off" fasteners that have a screw on one end and have threads in the other. You can use one of these, a screw and a nut to attach to one of the board's mounting holes, keeping your wire attachment off the board itself. -cont-
    – Richard T
    Commented Feb 7, 2010 at 17:02

13 Answers


Is your power supply old too? Perhaps it's under/overpowering the drive, which is causing the failure. If you have a multimeter, I would try measuring the voltage supplied to your hard drives and watch it over a period of time. Another culprit may be 'dirty' electricity, so a UPS may be in order so that it will 'clean' the power going into the PSU.

  • The PSU! That's old too, yes, will try to get a multimeter. I forgot to mention, but the power strip is guarded against overvoltage, at least it's some special type. Thanks for the suggestion.
    – K. Norbert
    Commented Jan 20, 2010 at 23:01
  • A dodgy power supply could cause failure of electrical components such as hard drives. The PSU was the first thing I thought of when I read your posting. Commented Jan 22, 2010 at 9:23
  • Going to mark this accepted, until the hdds give up again, and will look into grounding the components. Thanks for the tip!
    – K. Norbert
    Commented Feb 5, 2010 at 20:52
  • WishCow, I hope you realize by now that this is not the correct answer. The problem is / was that you did not provide any ground for the components.
    – Richard T
    Commented Mar 23, 2010 at 5:22
  • 2
    Odds are it's the PSU plus the absence of grounding. The +12V voltage you quote is way low (actually out of ATX spec) and I know from experience how vulnerable HDDs are to low voltage - they produce all kinds of weird errors so that you think your MB, CPU or memory is at fault. For anyone who works with PCs it's actually worth keeping a known-good PSU around just so you can check that a problem isn't power-related.
    – raw_noob
    Commented May 4, 2010 at 12:15
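To put a number on the "out of ATX spec" remark, here is a minimal sketch that checks a reading against the ±5% tolerance the ATX12V design guide gives for the positive rails. The function name `rail_in_spec` is made up for illustration, and BIOS sensor readings are only approximate, so treat the result as a hint rather than a measurement.

```python
# Nominal ATX positive rails; +/-5% is the tolerance from the ATX12V design guide.
TOLERANCE = 0.05
RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

def rail_in_spec(rail, measured):
    """Return True if the measured voltage lies within +/-5% of nominal."""
    nominal = RAILS[rail]
    low = nominal * (1 - TOLERANCE)
    high = nominal * (1 + TOLERANCE)
    return low <= measured <= high

# 11.3 V is the BIOS reading quoted in the question's Edit3.
print(rail_in_spec("+12V", 11.3))  # → False
```

The lower bound for +12V works out to 11.4 V, so an 11.3 V reading falls just under it. A multimeter on a drive power connector would be the way to confirm.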

I agree with others: power.

However, with a twist.

ALL the components need to have a COMMON ground - the chassis is typical, but in your case, who knows! A "drifting ground" would cause this, I'm sure.

You want all the components tied to a single ground AND that ground tied to the grounding from your facility's "power grid" ground. This is IMPORTANT.

BTW, it is possible that all your old hardware is actually still OK! I have found that equipment that was served with a flaky power supply sometimes survives it OK when a proper supply is provided.

I hope this helps.


  • Oh god, I hope the old hardware is not working, since I have thrown it out. The tester tool said that the HDDs are broken. Will try replacing the PSU.
    – K. Norbert
    Commented Jan 21, 2010 at 8:29
  • 6
    I had ground problems running a "caseless" system (all the parts were mounted on plexy and hung on the wall.) The solution was to run a single ground wire from the power supply case to the case of each device and the motherboard's ground.
    – Chris Nava
    Commented Jan 25, 2010 at 6:08

This is an old post, and the original question may no longer be relevant to the person who asked it. However, for the benefit of people building a budget PC in future: power is not an all-encompassing explanation for disk drive failures. In my professional opinion as an EMC-certified implementation engineer, it is misleading to blame the power supply as the sole culprit given that the computer is inside a cardboard box.

Hard disks vibrate. Although no particular orientation, vertical or horizontal, increases or decreases the longevity of a disk, a drive with spinning platters does create vibration. The drives shown here are just lying in a cardboard box. This is an example of budget engineering, and the vibrating drives are sitting on their sides, further increasing the resonance on the platter. Though this isn't an answer in itself, improperly mounted hard disks MAY lead to a disk fault because a vibrating platter keeps the read and write heads from tracking the platter correctly.

Power: cheap power supplies are always bad for computers in general. However, it is unlikely that this PSU killed the hard drives without also killing other, more sensitive components on the board. This system is in a cardboard box, so the engineering and power could have led to a more catastrophic failure, but not necessarily this disk fault. It's possible, but not proven in this case.

Heat: heat can destroy a disk. However, if the drive wasn't hot to the touch at the time of failure, heat is not the culprit. A cardboard box is not a good feat of engineering for a PC or server. You are better off bolting your parts onto a computer desk or workbench; at least they would be grounded.

Soft RAID and cheap drives: given the cardboard box and old parts visible in the photo, you appear to be using standard desktop drives and a soft RAID. Desktop drives can be placed on a RAID controller; however, with the increased I/O on the disk, the chance of a disk fault increases. The disks pictured in this case are not on a hardware RAID controller, but are being grouped together with a software component on the motherboard. This is not ideal for hard drives. It increases the workload on your CPU, and soft RAIDs have been known to have errors and kill hard drives prematurely. It is likely that the soft RAID killed these drives above all else.

Prevention for future builds, if you are reading this old scenario after arriving via a Google search or whatnot:

-Ensure that your disks are properly mounted in a stable hard drive chassis. Bolt in your disks with at least 4 hard drive screws, or use a special disk sled that goes with your chassis.

-Ensure that you have adequate airflow in your case. Hard disks in a RAID tend to see more I/O, and will run much hotter than if the physical volume is mounted individually.

-Do not use a cheap power supply. Dirty power is a killer of expensive computer parts. Also ensure that your power supply provides enough wattage to handle the desired workload.

-Use a RAID controller card! Never use the soft RAID on your motherboard. Soft RAIDs reduce disk performance and increase the chance of disk failures more than a RAID controller card does.

-RAID in general increases the chance of disk failure because of the increased I/O across all of your volumes. The larger the pool of disks being joined, the higher the chance of failed drives. If you RAID your drives, always make use of parity drives and hot spares. You may lose your data if you RAID 0 2-3 disks. If you have 3 disks, use RAID 5! 6 disks on RAID 5 (4+1) with a hot spare is ideal if your drives are covered under a warranty. If you can't afford more disks or your disks are out of warranty, don't use RAID.

-Desktop drives are not enterprise drives. Desktop drives are similar to enterprise drives, but are not designed to handle the huge workloads brought about by RAID controllers. If you buy desktop drives from Newegg and RAID them on your motherboard, you are likely to see at least one drive failure in your first year. The longer you operate your machine on a RAID, the more I/O is written to disk and the higher the likelihood that your volume will have failures. Combine cheap drives with cheap motherboard soft RAID and you will be hurting.

It is likely that this user experienced all of these factors in his shoe box server. Cheap power, bad air flow, old cheap drives not properly mounted in a chassis, and a motherboard soft RAID...this all increases the chances of a disk fault.


I can't imagine how you have good ventilation and cooling in a shoebox. You really should shell out the 50 or 60 bucks for a real computer case.

Power strips only guard against power surges; other common problems for electronic equipment are undervoltage (brownouts) and overvoltage (spiking). Also common is EMI noise - we had an unstable computer a while back which turned out to be caused by having a treadmill on the same circuit (I personally verified this beyond doubt). It would kick the modem offline, and cause the system to just freeze up from time to time.

Also, continual exposure to noise and fluctuations in the power supply will eventually damage the PSU over time, decreasing the quality of power delivered to the electronics.

EDIT: Electric power fluctuations can be isolated to specific circuits. More importantly, high-draw appliances such as microwaves, refrigerators, treadmills, stoves and similar can have a significant impact on the power quality on that circuit. And things like fridges also have a continual on/off cycle of operation, which alternately browns and spikes power on the line as the motor kicks in and out.

Also, if you are being served by the same power company, they may be having ongoing trouble supplying voltage across the board. Constantly fluctuating between 105V and 125V will have a negative effect on electronics (as I understand it).

  • The box is not covered, and the HDDs have coolers on them. Good ventilation may not be the correct term here, but it's definitely not overheating, I have checked the temperatures with smartmontools. But if the problem is with electricity, wouldn't the other computers in the household show some symptoms? Also I'm adding to the question now that I moved to a new place during the 4 months, so it's unlikely that there are electricity problems at both places.
    – K. Norbert
    Commented Jan 21, 2010 at 8:25
  • Having moved, you may still have the same appliance on the same circuit as your computer; also your PSU may already be shot, so the damage may already have been done. I think I would start with obtaining an inexpensive power filtering UPS (about $100) and then immediately replacing the PSU (about $60) on the computer. Commented Jan 22, 2010 at 18:53

It really sounds like power problems.

If you do have power surges, many cheap power strips will only work once - and there's usually no indication that they're no longer protecting.

A good UPS might help - some of the higher-end ones actually generate power from the batteries, and are continuously recharging, providing completely isolated power. The only drawback is that they can be noisy.

  • Couldn't it be a problem from the outlet he is "stuffing it away" on? I would tend to first try it somewhere in the house, safe from the volt guzzlers and stripped wires.
    – mtone
    Commented Jan 20, 2010 at 23:46
  • I actually picked up a power conditioning UPS from Costco for 100 bucks; the battery's not big, providing only enough to keep my internet modem and telephone box up and running, but I bought it primarily to condition the power supply to my computer. Commented Jan 21, 2010 at 2:26
  • The box is basically under a shelf, it's not covered, and it's not near any other electrical appliances. (apart from a ps2 which is not even plugged in atm). If it is the electricity, wouldn't it cause some problems in the other PCs too?
    – K. Norbert
    Commented Jan 21, 2010 at 8:27
  • It could be the power coming into the building, it could be something else in the house - proximity to a malfunctioning appliance isn't needed to affect the power. It may be that the power supply is marginal, so it's more affected by the interference than the other computers
    – chris
    Commented Jan 23, 2010 at 18:01

Actually HDD manufacturers do not print the information regarding working positions on their drives, but standing the hard drives on their sides is perfectly OK. The last time I checked that information, the drives could be positioned lying flat or on their sides, and up to a 5 or 10 degree angle from these positions. Mounting them upside down or with the connectors facing up or down was not a permitted position. Connectors facing up or down used to be the best position for transportation around 15 years ago. This is the latest information I have about this.

I am having the same kind of error on a brand new 500GB WD green hard drive, and your SATA cables look just like mine, and I am suspecting them badly.

The missing grounding isn't necessarily a bad thing. The components should be grounded by correct mounting in a metal case, but not doing that should not be an issue if all connectors and cables are 100% OK.

Of course a bad power supply can do lots of bad things to the entire system, I would test with a new PSU ASAP, preferably with everything mounted on a decent chassis.

Good luck


I agree that bad ground is the likely culprit. However, consider overheating as a possible cause. If the drives are hot to the touch then they are too hot. Put a fan on them.

  • There are fans on the drives.
    – K. Norbert
    Commented Jan 25, 2010 at 8:19

You can check if they've been overheated by looking at the S.M.A.R.T. values. Grounding the case isn't necessary, as many hot swap carriers are plastic and not grounded. Grounding through the SATA cable should be sufficient. Having them firmly mounted MAY help with vibration issues. The head does not touch the platter but rides slightly above it, and an impact on the platter can dislodge tiny particles, which can eventually result in head crashes.
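As a rough illustration of reading those values, the sketch below pulls raw attribute values out of the attribute table that `smartctl -A /dev/sdX` (from smartmontools) prints. The function name `parse_smart_attributes` and the sample rows are my own illustrative assumptions, not output from the drives in the question. It also only keeps attributes whose raw value is a plain integer, which not every drive reports.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value (int) from `smartctl -A` style table text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the 10th column is the start of RAW_VALUE.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            if raw.isdigit():
                attrs[fields[1]] = int(raw)
    return attrs

# Illustrative sample only, not taken from the drives in the question.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1234
194 Temperature_Celsius     0x0022   112   099   000    Old_age   Always       -       38
"""
attrs = parse_smart_attributes(sample)
print(attrs.get("Temperature_Celsius"))
```

Note that the raw value shows the current temperature; whether a drive also records a historical maximum varies by model.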


You should probably update your Ubuntu install. A couple of months (years?) ago, a bug was found which causes increased hard drive wear and tear in Ubuntu installs.

Check out this link about this problem/bug: High frequency of load/unload cycles on some hard disks may shorten lifetime

  • From a first glance this seems like a different problem, but will read through it, thank you.
    – K. Norbert
    Commented Jan 21, 2010 at 8:35
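To judge whether the load/unload issue applies to a given drive, one rough approach is to compare the SMART `Load_Cycle_Count` against `Power_On_Hours` and project forward. The sketch below is only arithmetic. The function name, the sample numbers, and the 300,000-cycle rating are all assumptions for illustration, so check the drive's datasheet for its actual rated figure.

```python
def hours_until_limit(load_cycles, power_on_hours, rated_cycles):
    """Estimate remaining power-on hours before rated_cycles is reached,
    assuming the average cycling rate seen so far stays constant."""
    if power_on_hours <= 0 or load_cycles <= 0:
        return None  # not enough data to estimate a rate
    rate = load_cycles / power_on_hours  # cycles per power-on hour
    remaining = max(rated_cycles - load_cycles, 0)
    return remaining / rate

# Hypothetical numbers: 90,000 cycles after 1,000 hours,
# against an assumed rating of 300,000 cycles.
print(hours_until_limit(90_000, 1_000, 300_000))
```

A drive that would reach its rating within months is worth a closer look; a drive that dies after one month with a low cycle count, as described in the question, points elsewhere.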

Might there be any large speakers, fridges, air conditioners, electric motors or other magnetic sources next to your (completely un-shielded) shoebox?

  • Unfortunately no, nothing.
    – K. Norbert
    Commented Jan 24, 2010 at 18:14

I agree that bad ground may be the cause of your storage tragedy. However, I would also "fix" the hard disk drives more tightly, because vibrations can induce permanent damage.


Check the power splitters that split power for the drive fans. An intermittent connector can cause your drive to lose power at a critical moment and crash it. You definitely need a case for a solid ground between MB, PSU, and HD.


I think standing the hard drives on their sides might have contributed to their failure, because in most cases hard drives are mounted lying flat in their computer cases.

  • 2
    This is not the cause. Hard drives don't really care about their orientation.
    – Dan D.
    Commented Mar 20, 2012 at 21:02
