
A machine is running Windows 7 x64 with several other systems running on VMware Workstation. I built the machine myself, using only quality parts and ran memtest86+ several times without error. Everything worked perfectly for months.

Then I decided to replace the main hard disk, because it was the only part that wasn't new when I built the computer. Since this HDD only holds the OS (there are other HDDs for data), I don't need much space, and I wanted to try something new, so I bought a 120 GB solid-state drive: an OCZ-VERTEX2 3.5.

Some time later, the system crashed for the first time: it froze and, as far as I remember, the screen showed gray stripes. A while after that, VMware crashed as well, and VMware has never crashed on any of my machines before. At some point (long before the VMware crash, probably last year), I wanted to check the SSD's status, so I ran HDD Health, which reported the drive's health as 86%. That value hasn't changed to this day. Since SSDs are supposed to be more stable, I'm wondering whether that information is simply inaccurate.

Here's my question:

Are those two crashes just unfortunate incidents that cost me some work (I tend to reinstall crashed systems), or could there be a connection to the SSD? Is there a way to find out (an SSD stability test or something)?

  • Asus M4A78 Pro
  • 4x 4 GB = 16 GB RAM DDR2
  • AMD Phenom 9650 Quad 2.3 GHz

3 Answers


The only way to know for sure would be to run the equivalent of memtest86, but for hard disks. I don't know whether such a tool exists, because it would of course wipe all the data on the disk.
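For what it's worth, Linux's `badblocks -w` does exactly this kind of destructive write-mode test. The basic idea can be sketched in a few lines of Python: write seeded pseudo-random blocks, read them back, and report any that don't match. This is a minimal illustration, not a real disk tester; the function name and parameters are made up, and it runs on a scratch file here because pointing it at a raw device would destroy its contents.

```python
# Hypothetical sketch of a memtest-style surface verify: write pseudo-random
# blocks, read them back, compare. Run against a scratch FILE only; pointing
# `path` at a raw device (e.g. /dev/sdb) would be destructive.
import os
import random

def verify_surface(path, blocks=64, block_size=4096, seed=1234):
    """Write seeded pseudo-random blocks, re-read them, and compare.

    Returns a list of block indices that failed verification."""
    rng = random.Random(seed)
    data = [bytes(rng.randrange(256) for _ in range(block_size))
            for _ in range(blocks)]
    with open(path, "wb") as f:
        for block in data:
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually hits the disk
    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(data):
            if f.read(block_size) != expected:
                bad.append(i)
    return bad

if __name__ == "__main__":
    print(verify_surface("scratch.bin"))  # empty list means all blocks intact
```

A real tester would also vary the patterns across passes and write directly to the block device, but the read-back-and-compare loop is the core of it.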

Reading corrupted data is a known problem with disks, though. They're generally pretty reliable, but filesystems like ZFS include checksums precisely so that read errors don't go undetected, because they do happen.

Have you re-run memtest86 since the problem happened? Two crashes on their own are pretty ordinary, especially running non-trivial things like a VM under Windows. Hardware also does deteriorate over time, and it could have even been something as simple as a brownout that didn't cause your power supply to shut off - assuming you're not connected to a high quality UPS.

I'm also not sure what you mean by SSDs being more 'stable' - they are widely known to have a significantly shorter lifespan than hard disks - people choose them for speed, not reliability. Even Intel said at the last Linux Australia conference they wouldn't use SSDs for mission-critical data just yet (that was a very interesting talk, go watch the video if you weren't there!)

  • Interesting idea - I know a tool called "H2testw" that I have successfully used to check whether a new USB drive is OK. I've never thought of using it for hard drives, but I'll let it check my SSD.
    – basic6
    Commented Apr 21, 2011 at 6:46
  • @basic6: Don't forget that the problem could also be introduced in the cable between the drive and the controller. In this case you would see random errors in different places, however I would expect you would have to do some serious testing to pick it up (thousands of read cycles.) I guess as long as you're not writing too much to the SSD you won't wear it out with the testing though.
    – Malvineous
    Commented Apr 21, 2011 at 6:55
  • I've already taken the SSD out for testing; the cable wasn't loose at all. I'll probably replace it with a new HDD anyway, to be safe... In response to your other questions: yes, I've run memtest86+ once after the freeze (before the VMware crash) without error. I'm not connected to a UPS, but the computer has a 500W Tagan power supply. By stable I meant physically (shock-)resistant (and resistant to magnetic fields).
    – basic6
    Commented Apr 21, 2011 at 13:40
  • @basic6: Yes, in that manner SSDs are more resilient against physical forces. Also just to be clear I wasn't referring to a loose cable, but electrical noise interfering with signals travelling over the cable (which you can get with very long cables for instance, or if there is a poorly shielded device inside your PC case.)
    – Malvineous
    Commented Apr 23, 2011 at 8:09
  • I'm not using any special cables (all standard length). There are 4 disk drives, though, but I somehow don't think that's the reason. Some days ago, I replaced the SSD with a new HDD (WD) and reinstalled the host system. This fresh, clean system froze two days ago (black screen) while I was working with 3 VMs. As far as I remember, I had this freeze problem long ago, when the computer had only 8 GB RAM with mixed modules (memtest without error, though). After I changed them (2x 4 GB, same model), no more freezes. The additional 2 modules are now different ones again... Maybe I should match them again...
    – basic6
    Commented Apr 29, 2011 at 20:01

For future reference:

I'm now convinced that the SSD wasn't the problem; it was the combination of memory and mainboard. I'd probably have run into SSD-related issues sooner or later if I were still using it, given their limited lifespan compared to HDDs, but my SSD wasn't old enough for that to be very likely.

This mainboard (Asus M4A78 Pro) apparently has issues working with different memory modules. A long time ago, the machine used to freeze occasionally after running for about a week straight, with 2x 2 GB and 1x 4 GB modules installed. When I replaced the 2 GB modules with another 4 GB module (same type as the one already in use), it didn't freeze anymore. The freezes came back when I added yet another 4 GB module (the 2 identical 4 GB modules still in use, plus a new, different one), but that was also when I started using the SSD.

Since the system froze again after I replaced the SSD with a new HDD, I bought another two 4 GB modules of the same type as before and removed the two old 4 GB modules (none of which produced any error in memtest86+, running at least 24 hours). The system is now using 4 identical 4 GB modules (slightly different label, but same product number, so they should be identical). Since then (several months have passed), the system has been running perfectly fine: no freezes or crashes.

I bought the modules from crucial.com (nice and informative site, very fast shipping).


The Vertex 2 had numerous compatibility problems when it was introduced, and many people had to upgrade their SATA hardware or BIOS in order for it to work correctly.

In theory, you should be fine, since that was some time ago, and they're now dealing with Vertex 3 (SATA III) problems instead.

However, it's very likely to be the same issues:

  • Make sure your BIOS is updated
  • Make sure the Vertex 2 firmware is updated
  • Make sure you're using known good SATA cables (if the vertex came with one, use it)
  • Make sure it's plugged into a SATA II or III port on your motherboard (some MB come with a variety of SATA ports, and they aren't always the same)
  • Make sure you're using the latest motherboard/chipset drivers in your various operating systems

Memtest is one part of a good burn-in test suite. If you search for burn-in test software, you'll find many packages that stress test your system repeatedly (and take the OS and software out of the loop, since these tests don't run on top of them).

A complete test would erase the hard drive, but you can choose a lesser test that only checks the unused portions of the drive.
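That "unused portions only" approach is what H2testw does for USB sticks: fill the free space with checksummable files, verify them, then delete them. A rough Python sketch of the same idea, assuming an illustrative directory on the drive under test (function name and parameters are made up for this example):

```python
# Hedged sketch of a non-destructive free-space check in the spirit of
# H2testw: fill otherwise-unused space with pattern files, verify their
# checksums, then clean up. `target_dir` should live on the drive under test.
import hashlib
import os

def check_free_space(target_dir, total_bytes=1 << 20, chunk_size=256 * 1024):
    """Write pattern files into free space, verify each SHA-256, delete them.

    Returns the number of files that failed verification."""
    written = []  # (path, expected digest) pairs
    remaining, idx = total_bytes, 0
    while remaining > 0:
        size = min(remaining, chunk_size)
        # Derive a repeatable payload from the file index
        seed = hashlib.sha256(str(idx).encode()).digest()
        payload = (seed * (size // len(seed) + 1))[:size]
        path = os.path.join(target_dir, f"h2check_{idx:04d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data out to the disk
        written.append((path, hashlib.sha256(payload).hexdigest()))
        remaining -= size
        idx += 1
    failures = 0
    for path, digest in written:
        with open(path, "rb") as f:
            if hashlib.sha256(f.read()).hexdigest() != digest:
                failures += 1
        os.remove(path)  # clean up so the space is free again
    return failures
```

A real run would set `total_bytes` close to the drive's reported free space; the test data never touches existing files, so it's safe on a live system (aside from temporarily filling the disk).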

If the stress test shows the hard drive is bad, have it replaced or contact tech support and see what they say about hardware compatibility with your system. It may well be that you have a bad drive, but doing some or all of the above should give you the best chance at getting the drive to work on your system.
