My PC has been constantly crashing with disk i/o errors recently and I'm trying to diagnose what the cause is before I order any new parts.

I have Windows 7 installed on an SSD (with TrueCrypt) and I started getting BSODs related to the disk. I can boot into Windows but it crashes within a few minutes, sooner if I try to do things that write to the disk such as installing a program.

Thinking my disk had died (possibly due to ssd+truecrypt), I've borrowed one from a friend but I'm unable to install windows or ubuntu on this disk due to similar issues (disk i/o errors). (I disconnected all other drives etc whilst doing this.)

Whilst installing windows it says it can't verify the files during the expanding files stage.

Whilst installing ubuntu I get errors like "the installer encountered an error copying files to the hard disk. errno 5 input/output error"

Live ubuntu seems to be working without any issues, although if I try to install ubuntu to the disk this way, it says it can't verify a file (different each time) and stops the installation. Whilst running cat /dev/urandom > /dev/null, everything is fine; however, cat /dev/urandom > /dev/sda causes ubuntu to hang (no error messages) within a couple of minutes.
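
For a check that reports corruption instead of just hanging, data can be written to a scratch file and read back against checksums. This is only a minimal Python sketch, not something that was run in this case: the function name write_and_verify and the scratch file are hypothetical, and it writes to an ordinary file on a mounted filesystem rather than to the raw /dev/sda device. The read-back may be served from the page cache rather than the disk, and a mismatch cannot by itself tell faulty RAM from a faulty disk.

```python
import hashlib
import os
import tempfile

CHUNK = 1024 * 1024  # 1 MiB per chunk (arbitrary choice)


def write_and_verify(path, chunks=16):
    """Write random chunks to `path`, read them back, and return the
    indices of chunks whose SHA-256 digest no longer matches."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(chunks):
            data = os.urandom(CHUNK)
            digests.append(hashlib.sha256(data).digest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    bad = []
    with open(path, "rb") as f:
        for i, expected in enumerate(digests):
            if hashlib.sha256(f.read(CHUNK)).digest() != expected:
                bad.append(i)
    return bad


if __name__ == "__main__":
    # Hypothetical scratch location; point this at the disk under test.
    with tempfile.TemporaryDirectory() as d:
        print(write_and_verify(os.path.join(d, "scratch.bin")))
```

On a healthy machine this should print an empty list; any indices listed are chunks that came back different from what was written.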

I have tried each ram stick on its own whilst booting into windows and this seems to cause it to crash more often.

I have tried resetting the CMOS, using the different sata ports (8 of them) and different sata cables; none of this made any difference.

My motherboard has 2 different disk controllers, an intel one and a marvell one and they both seem to have this issue.

I have heard about issues with my motherboard chipset (p67) and the 3gbps sata ports but I have issues with the 6gbps too so I don't think it's a related problem.

My PC specs are:
MSI P67A-GD65 Intel P67 (REV B3)
Intel 2500k
8gb mushkin ram
BeQuiet! 650W psu
Samsung 840 ssd (mine)
OCZ ssd (friend's)

Does this sound like a broken motherboard? What more can I do to diagnose the issue? Also why would removing one stick of ram at a time cause more frequent crashes?

Edit: Thanks for the comments. I did forget to mention I ran the windows memtest which passed. I've downloaded memtest86 now and it's currently running, I'll update again once that's finished. Also, I left the pc running cat /dev/urandom > /dev/null for a few hours and nothing happened. Switched to cat /dev/urandom > /dev/sda and the whole thing locked up in less than 5 mins.

Edit: Even though windows memory test said there were no errors, memtest86 found ~200k errors on one ram module but 0 on the other. I've removed the faulty module and installed ubuntu and then windows on my friend's ssd and it seems to be working so far. I still can't boot into my own ssd though, it just bsods. But I think the original bsoding must have caused write errors on the ssd, which has caused this. Hopefully I can mount my ssd and recover the data. I think my raid config may have taken some damage during this whole process too so hopefully that recovers ok. Is there anything I should be aware of when recovering these disks?

I'll select an answer once I'm sure the faulty ram module was the only problem.

Edit: Yep, seems like the ram was the only problem and that caused the I/O issues. Thanks for the help!

  • @Moab he said he tried different SATA cables. Nicholas, run memtest86+ first to rule that out. Also, make sure your processor isn't overheating and the heatsink/fan is installed properly.
    – Bigbio2002
    Commented Jul 7, 2015 at 13:49

1 Answer

You need to explicitly run a memory test first and foremost: either use the Windows built-in memory tester or, ideally, memtest86+ as suggested by Bigbio2002.

This sounds like a memory issue from start to finish and other than CPU and MB, that's also the only thing you haven't ruled out. Continuing to use your system with faulty memory will likely result in worsening corruption of the data already on your disk(s).
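
To illustrate why bad memory can look like a disk fault: installers typically verify copied files against checksums, and a single flipped bit in a buffer on its way to the disk is enough to make that check fail. A minimal Python sketch, where the flip_bit helper is hypothetical and simply stands in for a faulty memory cell (the buffer contents and bit position are arbitrary assumptions):

```python
import hashlib


def flip_bit(buf: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of `buf` with one bit inverted, mimicking a bad RAM cell."""
    out = bytearray(buf)
    out[byte_index] ^= 1 << bit
    return bytes(out)


# A 4 KiB buffer standing in for file data held in memory before a write.
original = bytes(range(256)) * 16
expected = hashlib.sha256(original).hexdigest()

# Corrupt a single bit, as a faulty module might do to the I/O buffer.
corrupted = flip_bit(original, 1000, 3)

# The checksum no longer matches, so verification of the "file" fails
# even though the disk itself stored exactly what it was given.
print(hashlib.sha256(corrupted).hexdigest() == expected)  # False
```

The disk faithfully stores whatever it is handed, so the error surfaces as a failed file verification or an I/O error rather than as an obvious memory fault.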

If you do rule out the memory, then the remaining components are the CPU and MB, and it's quite possible the known issue with Intel 6-series SATA ports is contributing; you may just coincidentally be getting issues with the 6Gbps ports for other reasons.

  • Pretty much what I was going to advise - but I would strongly recommend memtest86+ - write to a CD, then boot from the disk and run.
    – CJM
    Commented Jul 7, 2015 at 14:41
  • 12 minutes into running memtest86 so far and it says "Errors: 168921". Is this more likely to be caused by the memory itself or the memory modules/caches on the motherboard? Why would faulty ram cause only disk related problems?
    – nmpolo
    Commented Jul 7, 2015 at 16:59
  • @NicholasMasters: Far more likely the memory, because modern motherboards do not have memory caches and CPU caches all have parity and/or ECC. The reason it might only cause disk related problems is possibly due to only part of the memory being faulty, and that part being used for DMA by the disk driver. Kernel and driver ASLR is relatively limited, so they are more likely to use the same addresses over and over, and memory errors tend to be localised.
    – qasdfdsaq
    Commented Jul 20, 2015 at 11:36
  • Cheers for the info! You're absolutely right about it only being part of the memory that was faulty, memtest86+ showed that one of the memory addresses was consistently wrong and the rest all correct.
    – nmpolo
    Commented Jul 21, 2015 at 11:57
