
Update 7 - 01/02/2018

I changed the CPU. It was stable for approximately 6 hours and then the usual symptoms.

In this update, I will reiterate the things worth remembering, and what steps I have completed to try to resolve the problem.

Symptoms

The machine locks up and freezes completely, seemingly randomly. This is not just a Windows 10 issue, unless Windows 10 has managed to affect low-level hardware: I have dual booted with Linux and also tried a live USB stick with an OS on it, and they all froze.

The system is more stable after leaving it off overnight. It will last approximately 30m to 1h. After you experience the first freeze, it can happen every 20 minutes.
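Since a hard freeze leaves no log entry of its own, one way to pin down the intervals described above is a small heartbeat logger. This is only a minimal sketch, not something used in the original troubleshooting: the file name, the 10 second interval and the function names are all hypothetical. It appends a timestamp to a file and forces it to disk, so the last line written before a freeze gives the approximate freeze time.

```python
import os
import time
from datetime import datetime, timezone


def write_heartbeat(path: str) -> str:
    """Append the current UTC time to `path` and force it to disk."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to flush so the line survives a hard reset
    return stamp


def last_heartbeat(path: str):
    """Return the last timestamp written, or None if there is none yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            lines = [line.strip() for line in fh if line.strip()]
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run(path: str = "heartbeat.log", interval_s: float = 10.0) -> None:
    """Write a heartbeat every `interval_s` seconds until interrupted."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_s)
```

Calling run() in the background and reading last_heartbeat() after the next hard reset gives the time of the freeze to within one interval.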

When running memtest86, the problem would cause the utility itself to freeze around the 19-20 minute mark every time. This was around "Test 10: Sleeping", just before the hammer test.

I purchased another stick of RAM and ran memtest86 again. It got further, but froze again on the 2nd pass. No errors were ever detected with either stick of RAM.

Suspicions and Potential Root Cause

Every time my computer would freeze, I would notice that my BIOS settings would change. Only the memory profiles. The overclocking would be enabled, and anything relating to voltages would change to 1.2V.

No matter how many times I would save them, they would seemingly corrupt, or revert.

I tried P3.00, P3.30 and P4.50. All versions did the same thing.

When I would load Windows long enough to view A-Tuning utility (I don't overclock by the way, I used it for diagnostic purposes), I would notice that the DRAM voltage wouldn't have a reading. The value was not set.

Therefore I suspect the issue is surrounding memory management, and memory profiles at the lowest level. I am sure there are issues with operating systems and this board/CPU, but this is clearly not one of them unless somehow Windows is always running some processes very early... somehow.

My board is set to be sent to the Netherlands, and further to Taiwan. I don't suspect this will be fixed soon. Though, I am set to receive my third board in two weeks, and this will be my second CPU, and second stick of RAM.

It is worth mentioning that I have removed all other components and peripherals in order to diagnose this. Only the essentials were used. Especially in the case of the live USB crashing, I didn't have any SSD or HDD connected. It would load, and freeze after some usage.

Finally, it is important to note that I cleared CMOS regularly between BIOS flashes in order to definitely determine the corruption of BIOS data after freezing.

Update 6

The new board made little difference. I suspect the CPU must be changed.

Update 5 - 26/01/2018, 15:42

Away for the weekend, I stopped the machine from going to sleep so that I could remote desktop in to it.

This was working fine up until 00:22 according to its online status. I can no longer connect to the machine, and I am unsure of the particular reason until I return home. I worry that this defect may be causing the machine to heat up too much and when I return it will be overheating.

It could be a case of Windows Updates, but usually the machine would restart and reconnect to the network.
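To tell "machine is frozen" apart from "Remote Desktop is just not answering", one option is to probe the Remote Desktop port directly from the remote side. A minimal sketch, assuming the default RDP port 3389; the function name is hypothetical and the host must be supplied by the caller:

```python
import socket


def tcp_port_open(host: str, port: int = 3389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name resolution failures
        return False
```

A False result only says that nothing accepted a connection on that port; it cannot distinguish a frozen machine from a stopped service or a firewall, so it is best combined with a ping.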

Update 4

I have replaced the board and thus far there are no problems. I noticed in the Windows 10 power saving settings that now I have an option that says 'AMD Ryzen Balanced'. I am very sure that this was not there before.

I have not changed the BIOS from the version that it shipped with, which is P3.00, though I may try this in the future.

I ran a GPU stress test with OCCT and it got to 40 minutes with no problems. Previously it froze at 08:29. That doesn't prove the GPU was the cause, but loading games would also make it freeze more often, even after pulling the GPU and reseating it in the slot.

Again, this still may not be resolved as the issues previously didn't manifest until around 3 days in to usage.

Update 3 - 12:27, 23/01/2018

I have noticed that when I load a game, it tends to lockup/freeze more. That doesn't mean it is the cause, but it might indicate something.

I decided to run some mining software to try and stress the GPU a bit more, seeing as how OCCT froze around 08:29 minutes in to a test.

I notice that when I terminate the mining software, for some reason the system locks up completely. This might be worth exploring further.

Update 2 - 23:57, 22/01/2018

The previous update steps did not work. I was also using OCCT and it appears to have frozen 08:29 in to a 1 hour GPU test.

Update 1 - 19:10, 22/01/2018

The system is stable since turning it on, after having it powered off all day. I do not know why. I have installed Windows 10 updates via USB, and I am currently downloading some more through the usual Windows 10 method.

  • I will proceed to download AMD chipset driver updates.
  • Surprised the Windows 10 installation did not freeze, as it did last night. I have read previously that this can fix things.
  • Despite the attempts above, I have requested a return of the board. If the system remains stable, I am unsure whether to return the current board or not. I have not yet tried any of the other methods that were suggested to me.

On to download and installation...

Components

Motherboard: ASRock AB350M Pro4

Processor: AMD Ryzen 5 1600 w/ stock cooling (not overclocked)

RAM: Vengeance LPX DDR4 2400MHz 8GB

SSD: Crucial MX300 275GB

Network Card: Gigabyte GC-WB867D-I

PSU: Corsair TXM550M 550W

GPU: EVGA Nvidia GTX 1060 3GB S Gaming

Describe your problem. List any error messages and symptoms. Be descriptive.

The issue itself is the PC locking/freezing up but with power remaining on. Sometimes the screens will switch off; sometimes they do not. The mouse and keyboard are no longer responsive in this state. I built this machine five days ago, and it ran without issue for the first 3 days; the problems began 2 days ago.

There is no set time for this; it will happen whether idle or performing a task. It has happened when attempting to load a live USB stick with an operating system, or when the OS is loaded. But I have not experienced this when in the BIOS, before attempting to load an OS. This is on both Windows 10 and Linux Mint in a dual boot using the GNU GRUB boot loader.

When this happens, I must hard reset the machine.

List anything you've done in attempt to diagnose or fix the problem.

  • At first I thought it was software, or driver conflicts. I uninstalled drivers, and it still remained.

  • I've tried ensuring all my PSU cables are in properly, and no loose seating of components.

  • I have updated BIOS firmware from P3.00 -> P3.40 -> P4.50.

  • I attempted to run memtest86, and for 3 passes, that worked. I restarted the machine and ran the test overnight, only for it to freeze on the 8th pass with no errors detected.

  • I have run the Windows memory test and chkdsk, both without error.

  • Attempted to run the Linux Mint Live USB but this no longer loads, despite loading a few days ago.

Future plans include plugging in an old HDD, and installing an OS on there, whilst the SSD is unplugged. If this works, then it would indicate there is an issue with the SSD or the way the dual boot is setup for Windows and Linux.

Provide any additional details you wish below.

Memtest lockup image -- no errors

  • It certainly is possible. Back on an old Vista laptop years ago, I had problems with dual-boot; Windows would crash on the "Starting Windows" progress bar animation, which, if I remember correctly, had something to do with the wireless driver.
    – user487867
    Commented Jan 22, 2018 at 12:11
  • @Sonickyle27, interestingly, it wasn't an issue, or not that I noticed. I am unsure if Windows and Linux are fighting over each other's space and somehow freezing.
    – mpw
    Commented Jan 22, 2018 at 12:13
  • The AMD Ryzen Balanced plan is an optimized power plan for Ryzen CPUs that comes with newer chipset drivers. It doesn't depend on the board; it depends on the OS. Windows 10 must have downloaded an update that included newer chipset drivers. Alternatively, you could have downloaded them from AMD's website.
    – miravalls
    Commented Jan 28, 2018 at 19:01
  • That is correct @miravalls. I did download them, but previously they were not visible. Not until the new board. Maybe there was a Windows update.
    – mpw
    Commented Jan 29, 2018 at 19:35

2 Answers


Memtest freezing may indicate a fault in any of the motherboard, CPU or RAM. Note that memtest has to store a little bit of data in memory in order to run, so any of those components may be the culprit. Note also that some HW problems only arise under load or prolonged use (due to heat and/or insufficient voltage).

My first approach would be to test with memtest each RAM stick individually.

Have you considered that you are suffering from the Ryzen Bug?

The first batches of Ryzen CPUs had a HW bug that was easily triggered under heavy loads (as in 100% usage on all/most cores), but it could happen randomly depending on the workload and programs. I myself experienced it in my setup, which is very similar to yours. I experienced random crashes, both in Windows 10 (while gaming) and Ubuntu (while working), and memtest never detected errors. After I found out about the bug, I RMA'd both the motherboard and CPU (the vendor suggested it; I was only going to RMA the CPU).

The replacements work pretty well and I haven't had any problems since.

Have you tried running the kill-ryzen script from GitHub? If this script crashes or outputs "build failed", you certainly have a bad CPU.

  • Hi, thank you for your response. I actually only have the one stick of 8GB DDR4. I am unsure if I can rule that out now. 7 passes seems sufficient. 10 in total. I will try to run the kill-ryzen script, but as the freezing is very inconsistent, I do not know if I will be able to. If I detect an error and cannot figure out what it is, do you believe RMAing the board and CPU is best? Thank you
    – mpw
    Commented Jan 22, 2018 at 12:50
  • @mpw I've never run more than 3-4 passes of memtest; experiencing an error after that is very weird AFAIK. However, a memtest crash doesn't mean the memory is bad; it could still be the Ryzen Bug. The RMA will depend on your vendor. I would contact them, explain the random crashes and see what they suggest. Maybe they prefer to send the PC to their technicians for an official report first.
    – miravalls
    Commented Jan 22, 2018 at 13:17
  • @miravalls -- okay, thank you. I built this machine from individual parts, so I am unsure who would take precedence, or take responsibility for testing it. Parts came from multiple sources.
    – mpw
    Commented Jan 22, 2018 at 13:25
  • @mpw then try running the script. If it fails, try to RMA the CPU and maybe motherboard, I can't tell you if there is an error there too. HW errors are very hard to debug if you don't have spare parts to swap out and narrow down the sole culprit. Best of luck!
    – miravalls
    Commented Jan 22, 2018 at 13:29
  • I've been in contact with ASRock. They state that the issue is to do with Ryzen CPUs. The kill script didn't appear to even function properly. It got stuck on loop 11, but didn't even show any error message. ASRock are aware of the problem and are attempting to replicate it. I potentially fixed it with a new board, but as you can see in the update, I am away for now and it has potentially frozen. I can, however, ping the machine, just not remote desktop to it.
    – mpw
    Commented Jan 27, 2018 at 13:58

I have the same motherboard and have had strange freezing issues myself. I thought my issues were different because I could get them to stop, but now I'm thinking I may have actually found the solution (at least in Linux). If you still have this motherboard, please try adding iommu=off to your kernel parameters at startup and get back to me on whether the freezing stops. I use my system for GPU passthrough, so I explicitly need IOMMU for what I do, and I only came across this because it affected my workflow. Of course, if I'm right then this is only a workaround, because it would indicate a defect.
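On Ubuntu and Mint style systems, a kernel parameter like this is usually added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub and rebooting. To confirm after the reboot that the parameter actually reached the kernel, something like the following can be used. It is a minimal sketch: the function names are hypothetical, and it only assumes that the running kernel exposes its command line at /proc/cmdline, as Linux does.

```python
from pathlib import Path


def kernel_param(cmdline: str, name: str):
    """Return the value of `name=value` on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name and sep:
            value = val  # in this sketch the last occurrence is the one reported
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text()
```

For example, kernel_param(read_cmdline(), "iommu") should return "off" once the workaround is active, and None if the parameter is absent.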

  • I used to regularly see PCI-E issues in the Linux log when I had the problem. I have replaced the motherboard 4 times and it works now. Finally.
    – mpw
    Commented May 18, 2018 at 16:10
