10

I'm running Linux Mint, version 19 Tara.

My battery life is really bad right now and my fan is always on because my computer is constantly at 70% CPU usage on this kworker thread. It's really starting to annoy me. I run top as soon as I boot up and before I even open a single program (other than the terminal), this process is already taking up 70% CPU.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    5 root      20   0       0      0      0 I  66.1   0.0   1:27.86 kworker/0:0-kac

When I run htop, it shows the kworker thread's name switching back and forth between kacpi_notify and kacpid.

I tried editing grub to add acpi=off, but then my system boots to a black screen with a blinking underscore and that's it. Won't boot.

I upgraded my kernel, so I'm now running 5.3.0-51-generic. My research so far makes me think I might need to update my BIOS, but my computer manufacturer only provides a BIOS update in .exe form. I've downloaded the exe, but I don't know where to go from here.

Can anybody please help me?

Chris Davies

4 Answers

10

The information provided by others is largely correct, but the solutions are not. The tricky bit is finding the rogue interrupt, as it varies depending on the hardware you have, and may also change whenever you update your firmware/BIOS/etc. Assuming you're running Linux, you can find it by running:

grep -Ev "^[ ]*0" /sys/firmware/acpi/interrupts/gpe?? | sort --field-separator=: --key=2 --numeric --reverse | head -1

The above snippet looks at all the GPE interrupt counters, skips those that are still at zero, sorts the rest numerically, and shows you the one with the highest count. On my system the output looked like so:

/sys/firmware/acpi/interrupts/gpe69: 7802639     STS enabled      unmasked

You can increase the number of interrupts shown by adjusting the value passed to the head command, but given the CPU usage involved, it will almost always be the first line (after sorting), outside of a few very rare scenarios. Once you have identified the offending interrupt, you can disable it (as root) like so:

echo disable > /sys/firmware/acpi/interrupts/gpe69

If you want to automate this process you can simply create a script with the following:

echo disable > $(grep -Ev '^[ ]*0'  /sys/firmware/acpi/interrupts/gpe?? | sort --field-separator=: --key=2 --numeric --reverse | head -1 | awk -F: '{print $1}')

The above version requires you to run it as root. To run it from a regular account via sudo instead, you could use:

sudo sh -c "echo disable > $(grep -Ev '^[ ]*0'  /sys/firmware/acpi/interrupts/gpe?? | sort --field-separator=: --key=2 --numeric --reverse | head -1 | awk -F: '{print $1}')"

Please note that if the interrupt with the highest value has already been disabled, you will get an error. This might happen if you run the snippet above more than once. If it does, you will see an error that looks like the following:

sh: line 1: echo: write error: Invalid argument

It is easy to adjust the grep expression to filter out interrupts that are already disabled. DO NOT DO THIS. If you do, running the command above again would pick the next-busiest interrupt, and could disable something important.
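The warning above can be sketched as a small guard: look up the busiest GPE and, if it is already disabled, stop instead of moving on to the next one. This is only an illustration, not part of the original commands. The function names and the DRY_RUN switch are made up for the example, and it assumes the status text in the file contains the word "disabled" once a GPE has been turned off.

```shell
#!/bin/sh
# Sketch only: function names and the DRY_RUN switch are illustrative.

# Print "path:count flags" for the GPE file with the highest non-zero counter.
top_gpe() {
    grep -HEv '^[ ]*0' "$1"/gpe?? 2>/dev/null | sort -t: -k2 -n -r | head -1
}

# Disable the busiest GPE in directory $1, but stop if it is already disabled
# rather than moving on to the next-busiest one.
disable_top_gpe() {
    line=$(top_gpe "$1")
    if [ -z "$line" ]; then
        echo "no GPE with a non-zero counter found" >&2
        return 1
    fi
    path=${line%%:*}     # text before the first colon
    flags=${line#*:}     # counter and status flags
    case "$flags" in
        *disabled*)
            echo "$path is already disabled; leaving everything else alone" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would disable $path"
    else
        echo disable > "$path"   # needs root on the real sysfs tree
    fi
}
```

With the default DRY_RUN=1, calling disable_top_gpe /sys/firmware/acpi/interrupts only prints which file it would write to. Setting DRY_RUN=0 and running it as root performs the write.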

Finally, it might be wise to confirm this is your issue before disabling any of the interrupts. You can use the one-liner below to confirm the problem. It will print the counter value every second. If this is your problem, you will see the value rise rapidly:

while true ; do grep -Ev '^[ ]*0' /sys/firmware/acpi/interrupts/gpe?? | sort --field-separator=: --key=2 --numeric --reverse | head -1 ; sleep 1 ; done

On my system this looked like so:

/sys/firmware/acpi/interrupts/gpe69: 7921836     STS enabled      unmasked
/sys/firmware/acpi/interrupts/gpe69: 7925137     STS enabled      unmasked
/sys/firmware/acpi/interrupts/gpe69: 7928459  EN STS enabled      unmasked
/sys/firmware/acpi/interrupts/gpe69: 7931766     STS enabled      unmasked
/sys/firmware/acpi/interrupts/gpe69: 7935122     STS enabled      unmasked
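In the sample above the counter grows by roughly 3,300 per second (for example 7925137 - 7921836 = 3301). If you would rather see that rate directly, a rough sketch follows. The function names are hypothetical, and it assumes the counter is the first field in the file and that the interval is a whole number of seconds.

```shell
#!/bin/sh
# Sketch only: the function names are illustrative.

# Print the counter (first field) stored in a GPE file.
gpe_count() {
    awk '{ print $1; exit }' "$1"
}

# Interrupts per second between two counter samples taken $3 seconds apart.
gpe_delta_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Sample file $1 twice, $2 seconds apart (default 1), and print the rate.
gpe_rate() {
    interval=${2:-1}
    before=$(gpe_count "$1")
    sleep "$interval"
    after=$(gpe_count "$1")
    gpe_delta_rate "$before" "$after" "$interval"
}
```

For example, gpe_rate /sys/firmware/acpi/interrupts/gpe69 5 would average the rate over five seconds. A healthy GPE should stay at or near zero.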
ladar
5

I've been researching this problem too. I've tried changing the BIOS settings and all kinds of tweaks. I finally came across this link (https://forum.manjaro.org/t/kworker-kacpid-cpu-100/131532) and it worked for a while. I have been switching between Ubuntu, Mint and Win10, and once the problem happens, it persists no matter which OS I boot into. Once I applied the above fix while in Ubuntu 20, it went away on every OS I booted into.

Well, the problem came back today when I booted up Mint 19.3. I figured that since the problem comes from the interrupt handling in the ACPI area, maybe I could trigger an ACPI event in the hope of "resetting" it. I decided to try putting the machine into "Suspend" mode, waiting for it to complete, then hitting the mouse/keyboard to wake it up, to see if that would correct or re-initialize the ACPI handling. Bingo! When it woke up, the CPU usage dropped right back down to the less-than-5% range.

This is not just a Linux issue: when it happens, it also happens when I boot into Windows. It does not seem to be a manufacturer-specific issue either. This might be a basic PC architecture/design issue. I suspect it may be the ACPI init routine that causes the CPU spike. There might be timing issues in setting up the ISR that handles the ACPI interrupts, so when the interrupts do occur, nothing handles or resets the INT, and the INT keeps occurring. I hope this info gives the developers some new ideas for a fix.

I have not tested it long enough to say this works all the time, but it's worth trying.

Best regards, Jim C

My setup: HP Z220, i5-3470, 16G DDR3, nVidia Quadro K1200, Adata 960G SSD + WD 160G ATA HD, APC UPS connected to USB port, IBM Model M keyboard (1989) and HP optical mouse on PS/2 input. Not the greatest, not for gaming, but an old reliable. ;-)

Jim C
  • bless you for this answer, sir. suspending the system did not work, but the advice at the link you posted did: it appears that gpe13 was my problem.

    my computer is very old (bought it in 2012) so I'm assuming it's just an incompatibility with the BIOS and the new version of Mint I'm using.

    – Tica Sloth Jun 19 '20 at 22:57
  • It worked for my Dell Precision 3620 tower, thanks for the answer! – Samuel Li Oct 28 '20 at 18:33
  • Link in post seems to be old, I found a similar link (I don't think it was the original) that was helpful: https://forum.manjaro.org/t/kworker-kacpid-over-70-of-cpu-dual-boot-mac-manjaro/61981 – oliverseal May 02 '21 at 13:37
1

To quickly find which interrupts are making noise:

awk '$NF=="unmasked"&&$1>1000{print FILENAME,$0}' /sys/firmware/acpi/interrupts/*

Output will probably look something like this:

/sys/firmware/acpi/interrupts/gpe0F 616214841     STS enabled      unmasked  
/sys/firmware/acpi/interrupts/gpe2C 616214418     STS enabled      unmasked  
/sys/firmware/acpi/interrupts/gpe39 616179116     STS enabled      unmasked

To quickly mask them so you can reclaim some CPU for a better fix:

for F in $(awk '$NF=="unmasked"&&$1>1000{print FILENAME}' /sys/firmware/acpi/interrupts/*)
do sudo tee "$F" <<<mask; done

Phew! Watch those load averages plummet! Now on to the real issues. First and easiest is to ensure your distro-provided microcode is being installed. For example, in Arch this is likely the intel-ucode or amd-ucode package, depending on your processor. Ubuntu/Debian package names are intel-microcode and amd64-microcode. Look for microcode_ctl and linux-firmware if you are in the CentOS/RHEL family. If you don't know which processor brand you have, run grep vendor_id /proc/cpuinfo to find out. If your distro has specific instructions for loading microcode, be sure to follow those too. Without the right microcode loaded, you can expect to have several noisy ACPI interrupts that are not properly handled. Reboot for the microcode and any bootloader configuration changes to take effect.
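As a small illustration of the vendor check, the sketch below maps the vendor_id line to the Ubuntu/Debian package names mentioned above. The function name microcode_package is made up for this example, and it assumes the usual GenuineIntel and AuthenticAMD vendor strings. Other distros use different package names, as noted.

```shell
#!/bin/sh
# Sketch only: microcode_package is an illustrative helper, not a real tool.

# Print the Debian/Ubuntu microcode package for the CPU described in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
microcode_package() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Take the value after "vendor_id : " from the first matching line.
    vendor=$(awk -F': *' '/^vendor_id/ { print $2; exit }' "$cpuinfo")
    case "$vendor" in
        GenuineIntel) echo intel-microcode ;;
        AuthenticAMD) echo amd64-microcode ;;
        *)
            echo "unrecognised vendor_id: $vendor" >&2
            return 1
            ;;
    esac
}
```

On Mint or Ubuntu you could then install the result with something like sudo apt install "$(microcode_package)".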

If you still have an interrupt or two with counters that just keep climbing, there is probably some unsupported or faulty hardware lurking. Look for drivers specific to any unusual hardware you have. Masking on boot may be your best bet until those hardware issues can be sorted out.
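One possible way to mask on boot, assuming your kernel supports the acpi_mask_gpe boot parameter (check the kernel parameter documentation for your version), is to pass the GPE number on the kernel command line. The hypothetical helper below only converts a sysfs file name such as gpe69 into that parameter string. It relies on the two characters after "gpe" being the hexadecimal GPE number, and it changes nothing on the system.

```shell
#!/bin/sh
# Sketch only: gpe_boot_param is an illustrative helper, not a real tool.

# Turn a sysfs GPE path or name (e.g. gpe69, gpe0F) into a kernel
# command-line parameter string.
gpe_boot_param() {
    name=$(basename "$1")
    hex=${name#gpe}      # strip the "gpe" prefix, leaving the hex digits
    case "$hex" in
        [0-9A-Fa-f][0-9A-Fa-f])
            echo "acpi_mask_gpe=0x$hex"
            ;;
        *)
            echo "not a GPE file name: $name" >&2
            return 1
            ;;
    esac
}
```

On Mint/Ubuntu the resulting string would typically be added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by sudo update-grub and a reboot. Treat that as a starting point and verify it against your distro's documentation.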

Kenny
-1

Thanks Jim, great to understand this fault and the solution, in my case:

root@HP-6300:/# echo "disable" > /sys/firmware/acpi/interrupts/gpe08

On another ageing but reliable, HP i5 workhorse. I had been culling Firefox..!

AdminBee
Rik T