I have a 2-month-old Ubuntu build (described at the bottom of the post) that has started to reboot itself within an hour of powering on. It was stable and gave me no issues until these reboots started about a week ago. I've begun to narrow down the possible causes, but I'm posting here for...
- validation of the assumptions I'm making while troubleshooting
- suggestions for the most promising next steps
This post is split into a description of the reboots and a set of questions based on the testing I've done so far. I've tried to be thorough, but let me know if I can give any more useful info.
Thanks in advance for the assistance!
Nature of the Reboots
The reboots are very sudden: there is no shutdown screen, no BSOD-style error screen, and no other notice. There is also never any hanging or freezing beforehand. The system cuts to black out of nowhere and immediately begins to try to reboot itself. Sometimes it reboots successfully, and sometimes it gets caught in a loop of about 2 seconds of attempting to start up followed by another cut. When this happens, I have to manually power it down before it will boot successfully.
When it comes back (either immediately or after my intervention), it gives no indication of anything having gone wrong. I've pinpointed the cuts and reboots by timestamp, and see no clues in the kernel log or syslog.
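For anyone who wants to repeat the log check, here is a minimal sketch of one way to pull out the last few lines written before each boot. It assumes the kernel's "Linux version" banner at timestamp 0.000000 is written to the log at every startup and that the log lives at a path like /var/log/syslog; the function names are made up for illustration.

```python
import re

# Assumption: every boot starts with a kernel banner such as
# "... kernel: [    0.000000] Linux version 4.2.0-..." in the log.
BOOT_MARKER = re.compile(r"kernel: \[\s*0\.000000\] Linux version")


def lines_before_boots(lines, context=5):
    """For each boot banner, return the `context` lines logged just before it."""
    results = []
    for i, line in enumerate(lines):
        if BOOT_MARKER.search(line):
            start = max(0, i - context)
            results.append([entry.rstrip("\n") for entry in lines[start:i]])
    return results


def scan_file(path, context=5):
    """Print the lines preceding every boot found in the log at `path`.

    The path is an assumption (e.g. /var/log/syslog); rotated logs are not read.
    """
    with open(path, errors="replace") as f:
        for block in lines_before_boots(f.readlines(), context):
            print("--- last lines before a boot ---")
            print("\n".join(block))
```

Calling scan_file("/var/log/syslog") prints the entries leading up to each startup. After a sudden power cut those entries are usually just ordinary activity, which matches what I see. Note that the very first boot in the file has nothing (or only unrelated lines) before it.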
It has occurred in multiple contexts. The general theme seems to be a higher-than-normal workload, though this is probably a red herring. I first noticed it while playing Steam games and then separately while remotely running scientific Python programs in parallel. Since beginning to troubleshoot, it has happened while running a graphics stress test (GpuTest), a processor stress test (mprime), and during MemTest86+ trials, and continues to occur in my normal gaming and SciPy usage.
Are these conclusions/assumptions correct?
(ordered from least to most certain)
- The PSU is not causing the problem because the system attempts to reboot despite my BIOS 'power back' setting being set to 'remain off'. Further, this is a high-quality, brand new PSU with plenty of wattage for the system's components.
- This is a hardware issue, not a software issue, because a) there are no clues in the logs and b) it happens during MemTest86+ as well as during regular Ubuntu usage.
- It is likely not the RAM because the problem has been observed with every combination of individual memory module and motherboard memory channel. If it were the RAM, I would have two defective sticks. Additionally, when the system fails during MemTest86+, it shows no errors or problems prior to the sudden reboot.
- It is extremely unlikely to be caused by the CPU, and I have updated my BIOS firmware to account for the known Skylake bug.
- Temperatures are not the issue. I have monitored CPU temp and it is normal before the reboots. Additionally, the whole system remains cool to the touch during normal usage and just before the reboots.
- The CMOS battery is fine because the BIOS shows an accurate date and time.
- My hard drives should be fine. The SSD and WD Blue are new, and the problem persists when I remove the older 2.5" HDD.
- My video card is not the culprit because the problem occurs with or without the video card in the system.
- It is not a power outlet issue since a) the system was stable for over a month in the same place (with no new devices plugged into the same circuit) and b) the problem occurs plugged into various circuits around my apartment.
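On the temperature point above: one way to capture the last reading before a cut is to log temperatures to disk and force each line out with fsync. The following is a rough sketch, assuming the kernel exposes sensors as thermal zones under /sys/class/thermal (on some boards the CPU sensors appear only under /sys/class/hwmon instead). The function names and log path are hypothetical.

```python
import glob
import os
import time


def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under `base`."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # sysfs thermal zones report millidegrees Celsius
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return temps


def log_temps(logfile, interval=1.0, samples=None, base="/sys/class/thermal"):
    """Append one timestamped line per sample and fsync after every write.

    With samples=None the loop runs until interrupted (or the machine dies).
    """
    count = 0
    with open(logfile, "a") as out:
        while samples is None or count < samples:
            reading = read_temps(base)
            fields = " ".join("%s=%.1f" % (z, t) for z, t in sorted(reading.items()))
            out.write("%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), fields))
            out.flush()
            os.fsync(out.fileno())  # push the line to disk so it survives a power cut
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Running log_temps("temps.log") in a terminal before starting a stress test leaves a file whose final line shows the temperatures about one second before the reboot. This only covers sensors the kernel reports, so it says nothing about components without a sensor.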
Next Steps
If the above are safe assumptions and conclusions, my next step will be to rule out the memory modules by borrowing my friend's working DDR4 memory and trying to reproduce the problem in my system and, if necessary, by putting my RAM in his system and seeing how it behaves.
- Are there other things I should try, or other environments in which I should try to recreate the problem?
- If these tests point to the motherboard, what will I need to do to get Gigabyte to replace the board? It is still under warranty.
System Components
Everything is currently set to the BIOS's optimized defaults.
- CPU: Intel Core i5-6600
- Motherboard: Gigabyte GA-Z170XP-SLI ATX LGA1151
- Memory: G.Skill Ripjaws V Series 16GB (2 x 8GB) DDR4-2400
- Storage: 1 SSD, 1 WD Blue, 1 older 2.5" HDD
- Video Card: EVGA GeForce GTX 750 Ti 2GB SC
- Power Supply: EVGA SuperNOVA G2 550W 80+ Gold (The system's top wattage, according to PCPP, should be somewhere around 260W.)
- OS: Ubuntu 15.10