I've got a problem with my PC that makes the GUI almost unusable: it gets very sluggish and eventually stops responding.
My analysis so far suggests that it is caused by the cache/buff growing and forcing too much swapping. Is there any way to fine-tune these settings?
Use case: simply read or write tons of data from/to any hard drive (not SSD), say using dd or f3read/f3write. After about a minute the cache or buff gets so large that Linux starts swapping heavily.
In this atop snippet you can see this in the PAG row:
MEM | tot 15.5G | free 3.5G | cache 7.8G | buff 96.1M | slab 394.5M | vmbal 0.0M | hptot 0.0M |
SWP | tot 1.0G | free 634.7M | | | | vmcom 8.5G | vmlim 8.8G |
PAG | scan 156637 | steal 156616 | stall 0 | | | swin 0 | swout 11814 |
PSI | cs 0/0/2 | ms 5/2/2 | mf 5/2/1 | is 50/24/15 | if 50/24/15 | | |
DSK | sdb | busy 56% | read 61 | write 1312 | MBr/s 0.0 | MBw/s 147.8 | avio 3.95 ms |
DSK | sda | busy 24% | read 100 | write 11803 | MBr/s 0.2 | MBw/s 4.6 | avio 0.20 ms |
I don't fully understand the meaning of the fields, but I tried the same on my laptop: everything is similar except that the SWOUT stat is far lower and the system does not suffer.
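To compare the two machines without atop, the swap-out rate can also be sampled from the kernel directly. This is only a minimal sketch, assuming that atop's swout figure is derived from the pswpout counter in /proc/vmstat (which counts pages, not bytes). The function names vmstat_counter and swapout_delta are made up for this example.

```shell
#!/bin/sh
# Print the value of one counter from a vmstat-style file ("name value" per line).
# The file defaults to /proc/vmstat; the optional second argument only exists
# so the function can be pointed at a saved copy.
vmstat_counter() {
    awk -v key="$1" '$1 == key { print $2; found = 1 } END { exit !found }' \
        "${2:-/proc/vmstat}"
}

# Report how many pages were swapped out during an interval (default 10 s).
swapout_delta() {
    interval="${1:-10}"
    before=$(vmstat_counter pswpout) || return 1
    sleep "$interval"
    after=$(vmstat_counter pswpout) || return 1
    echo "pages swapped out in ${interval}s: $((after - before))"
}
```

Running `swapout_delta 10` on both machines while the same dd job is active should give two numbers that can be compared directly.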
Ubuntu 19.10 Kernel 5.3.0-19-generic on both computers.
Swap is on an SSD. According to atop, the SSD is between 20 and 50% busy, mostly from swapping.
I already tried lowering /proc/sys/vm/swappiness from 60 to 10, which does not help.
I also lowered vfs_cache_pressure from 100 to 50, but this did not help either.
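For reference, this is roughly how the two settings can be checked and changed. It is a sketch rather than a recommendation: show_vm_tunables and the VM_DIR override are hypothetical names introduced here, and the values in the comments are simply the ones tried above.

```shell
#!/bin/sh
# Print "vm.<name> = <value>" for each VM tunable named on the command line.
# The directory defaults to /proc/sys/vm; VM_DIR is a hypothetical override
# that makes it possible to try the function against a copy.
show_vm_tunables() {
    dir="${VM_DIR:-/proc/sys/vm}"
    for name in "$@"; do
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'vm.%s = (not readable)\n' "$name"
        fi
    done
}

# Changing a value needs root and only lasts until the next reboot, e.g.:
#   sudo sysctl -w vm.swappiness=10
#   sudo sysctl -w vm.vfs_cache_pressure=50
```

Calling `show_vm_tunables swappiness vfs_cache_pressure` before and after a change confirms that the new values are actually in effect.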
Could it be that the cause lies somewhere else? I did have problems with SATA, which should be solved now. And I had a GPU HANG once (on Intel), which I believe was caused by the swapping problem...
When I started to see this problem (before I did a thorough analysis) I added swap (I did not have any before) because kswapd always ran amok. Adding swap at least prevents kswapd from using 100% CPU.
Any ideas?
When I ran dd if=/dev/sda of=/dev/null bs=512k, let it fill 1/2 of RAM with cache, and then restarted the command, I got an unusable GUI. I think this is because cache that is read a second time is moved from the "inactive" list to the "active" list, and then the kernel starts trimming the active list. So my Q: do you have the same problem if you make sure to use drop_caches before you start the dd command? https://unix.stackexchange.com/questions/518868/during-disk-read-tests-gui-becomes-unresponsive-for-10s-of-seconds-this-includ – sourcejedi Nov 01 '19 at 18:08
With dd, there is an option to avoid the cache: iflag=direct / oflag=direct. – sourcejedi Nov 01 '19 at 18:10
iflag=direct actually helps, but this is a workaround that cannot be used everywhere. – JPT Nov 01 '19 at 18:50
[...] drop_caches; it was not to avoid filling the cache. – sourcejedi Nov 01 '19 at 19:48
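The two suggestions from the comments can be sketched as follows. The function name read_uncached is made up for this example, and the fallback to a buffered read is an assumption added so the function still works on filesystems that reject direct I/O.

```shell
#!/bin/sh
# Read a file or block device with dd while bypassing the page cache.
# Falls back to a normal buffered read if opening with O_DIRECT fails
# (some filesystems do not support it).
read_uncached() {
    src="$1"
    dd if="$src" of=/dev/null bs=512k iflag=direct 2>/dev/null ||
        dd if="$src" of=/dev/null bs=512k 2>/dev/null
}

# Dropping the existing page cache before a test run needs root:
#   sync
#   echo 3 | sudo tee /proc/sys/vm/drop_caches
```

For example, `read_uncached /dev/sda` (run as root) repeats the read test from the first comment without filling the cache.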