This has been reported as a bug to Debian.

I have an i5 machine whose load average, as shown in top, stays around 2.00 all the time, even though the system is idle (just sshd and 2 sessions). The machine hosts a fresh Debian 9 installation, and the two have not been a perfect match out of the box: I've already had to deal with a kworker eating 80% of one core all the time, the same issue as described here (with Ubuntu 16.04).

I've installed non-free firmware from Debian:

  • firmware-realtek
  • firmware-iwlwifi

But I have also tested with Debian Live without installing these drivers, and there's no difference.

The full top header looks like this:

top - 13:42:33 up  1:33,  3 users,  load average: 1.83, 2.01, 2.01
Tasks: 230 total,   1 running, 229 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  3955660 total,  2123712 free,   657580 used,  1174368 buff/cache
KiB Swap:  4095996 total,  4095996 free,        0 used.  2888300 avail Mem 

iostat:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.04    0.00    0.08    0.04    0.00   99.83

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               4.00         0.00        20.00          0        120
sdb               0.00         0.00         0.00          0          0
dm-0              5.17         0.00        20.00          0        120
dm-1              3.50         0.00        14.00          0         84
dm-2              1.50         0.00         6.00          0         36
dm-3              0.00         0.00         0.00          0          0
dm-4              0.00         0.00         0.00          0          0
dm-5              0.00         0.00         0.00          0          0

nload shows very low values:

  • incoming avg.: 1.14 kBit/s
  • outgoing avg.: 9.27 kBit/s

Altogether, the system looks idle, yet the load is still reported. The temperatures also seem slightly high, I guess:

$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +27.8°C  (crit = +105.0°C)
temp2:        +29.8°C  (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +50.0°C  (high = +84.0°C, crit = +100.0°C)
Core 0:         +47.0°C  (high = +84.0°C, crit = +100.0°C)
Core 1:         +50.0°C  (high = +84.0°C, crit = +100.0°C)

Here are the top processes:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 5017 root      20   0   45052   3860   3200 R   1.0  0.1   0:00.10 top
  165 root      20   0       0      0      0 D   0.3  0.0   0:07.94 kworker/3:3
 1259 tomasz    20   0 1306660  41600  32768 S   0.3  1.1   0:03.08 gnome-settings-
    1 root      20   0  139492   7252   5268 S   0.0  0.2   0:00.90 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd

Following the hint in this answer, here's the list of processes in states D or R:

# ps -e v | perl -nalE 'say $_ if $F[2] =~ /R|D/'
   47 ?        D      0:14      0     0     0     0  0.0 [kworker/3:1]
  165 ?        D      0:14      0     0     0     0  0.0 [kworker/3:3]
  393 ?        D      0:00      0     0     0     0  0.0 [rtsx_usb_ms_1]
 5640 pts/0    R+     0:00      0   106 29757  1564  0.0 ps -e v
 5641 pts/0    R+     0:00      0  1940 15691  3448  0.0 perl -nalE say $_ if $F[2] =~ /R|D/ 

This set of two kworkers and rtsx_usb_ms_1 in state D is always present, after each reboot.
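On Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, which is why a machine with idle CPUs can still report a load near 2. As a rough sketch of how one might narrow this down further, the snippet below lists D-state tasks together with the kernel function they are sleeping in. It assumes the procps version of ps (for the `wchan:32` column width); the helper names `filter_dstate` and `list_dstate` are made up for illustration.

```shell
# Keep only lines whose second column (the process state) starts with D.
# Expects "PID STAT WCHAN COMMAND" rows on stdin; the header row is dropped
# because "STAT" does not start with D.
filter_dstate() {
    awk '$2 ~ /^D/ { print }'
}

# List every task currently in uninterruptible sleep, with a wide WCHAN
# column so the kernel function name is not truncated.
# Assumption: ps comes from procps, which accepts "wchan:32".
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}
```

After pasting this into a shell, running `list_dstate` a few times shows whether the same tasks stay blocked in the same place. As root, `cat /proc/<PID>/stack` for one of the listed PIDs may give a fuller kernel stack, if the kernel exposes that file.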

I've been experimenting with different BIOS configs and kernel parameters, and now with acpi_osi=Linux the load might have diminished, but only a bit; it still sits close to 2.00 avg.

I'm wondering whether I should file this as a bug. Who would be the addressee though? Debian? Kernel?

Machine details:

  • Motherboard: Fujitsu FJNBB35
  • CPU: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
  • RAM: 4G, SODIMM DDR3 Synchronous 1600 MHz (0.6 ns), Samsung M471B5173QH0-YK0
  • OS: 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux

1 Answer

That rtsx_usb_ms_1 process looks like the probable culprit to me. It belongs to the driver for a Realtek memory stick/SD card reader device. You can try blacklisting the driver with something like

echo blacklist rtsx_usb_ms >> /etc/modprobe.d/99-local.conf

...and then rebooting to see if preventing the driver from loading works around the problem. Simply running rmmod rtsx_usb_ms might work too. You'll have to manually load the kernel module or remove the blacklist and reboot to use the reader, though.
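Several rtsx modules may be loaded at once (a comment below reports that removing rtsx_usb_sdmmc, not rtsx_usb_ms, made the load drop), so it can help to check what is loaded before deciding what to blacklist. The following is only a sketch: the function names are hypothetical, and it assumes the usual /proc/modules layout where the first field is the module name.

```shell
# Print the names of loaded modules that start with "rtsx".
# Reads /proc/modules by default; a different file can be passed for testing.
loaded_rtsx_modules() {
    awk '$1 ~ /^rtsx/ { print $1 }' "${1:-/proc/modules}"
}

# Print one modprobe.d "blacklist" line per module name given as an argument.
blacklist_lines() {
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done
}
```

For example, as root, `blacklist_lines rtsx_usb_ms rtsx_usb_sdmmc >> /etc/modprobe.d/99-local.conf` would append both entries in one go. Which modules to list depends on what `loaded_rtsx_modules` shows on the machine in question.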

This may be a regression, as this patch (https://lkml.org/lkml/2014/11/5/905) was used to fix Debian bug #765717. Perhaps it never made it into the mainline kernel.

If removing/blacklisting the module fixes the problem, I'd file a bug report with Debian.

mulad
  • rmmod rtsx_usb_ms didn't help, but I next did rmmod rtsx_usb_sdmmc and that was it. Load 0.00. I'm worried about the temperatures though. Is 47 and 48 normal for an idle CPU? (That's from coretemp-isa-0000.) –  Mar 24 '18 at 02:38
  • Report submitted: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=893935 Thanks! –  Mar 24 '18 at 04:17