Last Friday I upgraded my Ubuntu server to 11.10, which now runs with a 3.0.0-12-server kernel. Since then the overall performance has dropped dramatically. Before the upgrade the system load was about 0.3, but currently it is at 22-30 on an 8 core CPU system with 16GB of RAM (10GB free, no swap used).
I was going to blame the BTRFS file system driver and the underlaying MD array, because [md1_raid1] and [btrfs-transacti] consumed a lot of resources. But all the [kworker/*:*] consume a lot more.
sar
has outputted something similar to this constantly since Friday:
11:25:01 CPU %user %nice %system %iowait %steal %idle
11:35:01 all 1,55 0,00 70,98 8,99 0,00 18,48
11:45:01 all 1,51 0,00 68,29 10,67 0,00 19,53
11:55:01 all 1,40 0,00 65,52 13,53 0,00 19,55
12:05:01 all 0,95 0,00 66,23 10,73 0,00 22,10
And iostat
confirms a very poor write rate:
sda 129,26 3059,12 614,31 258226022 51855269
sdb 98,78 24,28 3495,05 2049471 295023077
md1 191,96 202,63 611,95 17104003 51656068
md0 0,01 0,02 0,00 1980 109
The question is: How can I track down why the kworker threads consume so many resources (and which one)? Or better: Is this a known issue with the 3.0 kernel, and can I tweak it with kernel parameters?
Edit:
I updated the Kernel to the brand new version 3.1 as recommended by the BTRFS developers. But unfortunately this didn't change anything.
pcie_ports=compat
orpcie_ports=native
. (Try 'native' first. It's less likely to fix the problem but less likely to cause other problems.) – David Schwartz Oct 18 '11 at 22:02