First Gen AMD Ryzen Kernel Freezing Bug
Image by enriquelopezgarre from Pixabay
The Problem #
I have been experiencing hard freezes on my first gen Ryzen 7 1700X system, which has been very annoying. This is the kind of freeze where I can’t even drop into a virtual terminal or use Alt-SysRq.
This happens on both my Debian 10 and Ubuntu 19.04 installs, running kernel versions 4.19 and 5.0.0 respectively. The freezes seem almost random, but they happen most often when the CPU is idle and not doing much.
I assumed this was a bug in the firmware/BIOS of my mobo or in the kernel that was causing such a horrible lockup, so I did some digging online. I found this post on the Level1Techs forum that described more or less the same problem I was facing.
I also found posts on AMD’s website, AMD’s subreddit, and kernel.org.
It seemed that I was not alone in experiencing this bug related to CPU power states.
Possible Fixes / Workarounds #
Fix 1: Disable C-States With Kernel Parameters #
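On a distro that uses GRUB, edit
/etc/default/grub
and add the following parameters to the GRUB_CMDLINE_LINUX line:
GRUB_CMDLINE_LINUX="processor.max_cstate=1 idle=nomwait rcu_nocbs=0-n"
Here “n” is the number of logical CPUs (threads) minus 1, so a 16-thread Ryzen 7 1700X would use rcu_nocbs=0-15. If you are unsure how many CPUs your system has, a tool such as nproc or lscpu will report it.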
After that run
grub-mkconfig -o /boot/grub/grub.cfg
as root (on Debian and Ubuntu, update-grub does the same thing), then reboot your system for the changes to take effect.
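If you would rather not count CPUs by hand, here is a minimal sketch for working out the rcu_nocbs value. The helper name rcu_nocbs_range is my own invention for illustration, and the script assumes that nproc (or getconf as a fallback) reports the number of logical CPUs on your system.

```shell
#!/bin/sh
# Print the "0-n" range for rcu_nocbs given a logical CPU count.
# rcu_nocbs_range is a hypothetical helper name, not part of any tool.
rcu_nocbs_range() {
    count="$1"
    echo "0-$((count - 1))"
}

# nproc (GNU coreutils) prints the number of processing units available
# to the current process; getconf is used as a fallback if nproc is missing.
cpus="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"

# Print the parameter ready to paste into GRUB_CMDLINE_LINUX.
echo "rcu_nocbs=$(rcu_nocbs_range "$cpus")"
```

The script only prints the value; you still paste it into /etc/default/grub and regenerate the GRUB config yourself.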
Adding ‘processor.max_cstate=1’ keeps your CPU from entering idle states deeper than C1; those deeper sleep states seem to be the cause of the lockups on these early Ryzen CPUs. I have had success with this option. There are no guarantees it will work for you, but I recommend trying it.
Adding ‘rcu_nocbs=0-n’ offloads RCU callback processing from the listed CPUs to dedicated kernel threads, so the callbacks no longer run in softirq (software interrupt) context on those CPUs. Note that this only works if your kernel was compiled with CONFIG_RCU_NOCB_CPU; I think this is the case with most distros. Read more about RCU here.
Adding ‘idle=nomwait’ will “Disable mwait for CPU C-states” as a possible mitigation for the issue.
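After rebooting, it is worth checking that the kernel actually booted with these parameters. The following is a small sketch that reads /proc/cmdline, which holds the command line of the running kernel. The helper name has_param is hypothetical, and the check is a plain whole-word match.

```shell
#!/bin/sh
# has_param CMDLINE PARAM: succeed if PARAM appears as a whole,
# space-separated word in CMDLINE. Hypothetical helper for illustration.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Command line of the currently running kernel (empty if unreadable).
cmdline="$(cat /proc/cmdline 2>/dev/null)"

for param in processor.max_cstate=1 idle=nomwait; do
    if has_param "$cmdline" "$param"; then
        echo "present: $param"
    else
        echo "missing: $param"
    fi
done

# rcu_nocbs takes a CPU range, so only check for the parameter name.
case " $cmdline " in
    *" rcu_nocbs="*) echo "present: rcu_nocbs" ;;
    *) echo "missing: rcu_nocbs" ;;
esac
```

If a parameter shows up as missing, the GRUB config was probably not regenerated, or the parameters went onto a line that your distro does not use.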
Fix 2: Disable C-States With A Python Script #
GitHub user r4m0n made a handy Python script that will let you check or change the C-States on your CPU. This is a good option if you don’t want to disable C-States in the BIOS, or if you want to confirm that CPU C-States are actually disabled.
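Separately from r4m0n's script (this is not it, nor a replacement for it), the kernel's cpuidle sysfs interface gives a rough view of which idle states are exposed. The sketch below assumes your kernel exposes /sys/devices/system/cpu/cpu0/cpuidle; the function name list_cstates is my own.

```shell
#!/bin/sh
# list_cstates [CPU_DIR]: print each idle state's name and its "disable" flag.
# CPU_DIR defaults to cpu0's sysfs directory; list_cstates is a hypothetical name.
list_cstates() {
    cpu_dir="${1:-/sys/devices/system/cpu/cpu0}"
    for state in "$cpu_dir"/cpuidle/state*; do
        # Skip silently if the glob matched nothing or the entry is unreadable.
        [ -r "$state/name" ] || continue
        name="$(cat "$state/name")"
        disabled="$(cat "$state/disable" 2>/dev/null || echo "?")"
        echo "${state##*/}: $name (disable=$disabled)"
    done
}

list_cstates
```

With processor.max_cstate=1 in effect I would expect only the shallow states to be listed, but that is an assumption about how the idle driver reports them, so treat the output as a hint and not as proof.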
Fix 3: Disable C-States and Idle Power Settings In BIOS/UEFI #
Other suggested fixes include tweaking settings in the BIOS, such as turning off C-States (power states) for the CPU and changing the idle power setting for the PSU, if applicable (on many boards this is labeled “Power Supply Idle Control”).
They should be under a menu called “AMD CBS”.
System Specs #
For reference, here are the specs of my machine:
System:
  Kernel: 4.19.0-5-amd64 x86_64
  Distro: Debian GNU/Linux 10 (buster)
  Mobo: ROG STRIX B350-F GAMING
  CPU: AMD Ryzen 7 1700X
  MEM: 32GB
  Graphics: AMD Radeon RX 480 Graphics (POLARIS10 DRM 3.27.0 4.19.0-5-amd64 LLVM 7.0.1) v: 4.5 Mesa 18.3.6
  Driver: amdgpu
If none of these workarounds help, some users have reported success after opting to RMA their CPUs for newer ones.
November 2020 Update #
Looking at recent kernel bug reports, this still seems to be an ongoing CPU firmware bug that affects first gen Ryzen CPUs under Linux.
I have been running Arch on the same hardware, on kernel version 5.9.4 as of writing, with processor.max_cstate=1 and have had no lockups in a heck of a long time, but I have seen others say that this doesn’t work for them.
I can also confirm that there is no issue with the CPU on Microsoft Windows 10. It seems like a bug somewhere in Linux’s microcode/firmware handling for power management on AMD CPUs. Unfortunately, AMD has not given a definitive answer or fix yet.