We are debugging a situation where cached/shared memory keeps increasing until the system hits the OOM killer.
We have set shmmax and shmall in sysctl.conf, but without any visible effect. Do we need to enable something more for shmmax/shmall to work? Or can some part of the system go beyond this limit, and how strictly is it enforced? Can a buggy user-space application cause this, or only bugs in the kernel/drivers? The application that we are debugging uses graphics and video decoding. Can drivers go beyond the max limits?
kernel.shmmax = 2147483648
kernel.shmall = 524288
Linux kernel is 5.15.71 (from Yocto meta-intel). Our system has 4 GB RAM and no swap (we tried enabling swap, but it did not help with the stability of the system). We use Wayland/Weston but not systemd. We set the values in sysctl.conf and rebooted for them to take effect. We also confirmed the values with ipcs. We tried to limit shared memory to a maximum of 2 GB.
ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 2097152
max total shared memory (kbytes) = 2097152
min seg size (bytes) = 1
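As a sanity check on the units, here is a minimal sketch of how the two sysctl values map to the figures ipcs prints. kernel.shmmax is given in bytes and kernel.shmall in pages; the 4096-byte page size and the helper name shm_limits_kib are assumptions made for illustration only.

```python
# Assumption: 4 KiB pages (check with `getconf PAGESIZE` on the target).
PAGE_SIZE = 4096


def shm_limits_kib(shmmax_bytes, shmall_pages, page_size=PAGE_SIZE):
    """Convert kernel.shmmax (bytes) and kernel.shmall (pages) to KiB."""
    max_seg_kib = shmmax_bytes // 1024
    max_total_kib = shmall_pages * page_size // 1024
    return max_seg_kib, max_total_kib


# Values from our sysctl.conf.
max_seg_kib, max_total_kib = shm_limits_kib(2147483648, 524288)
print("max seg size (kbytes) =", max_seg_kib)
print("max total shared memory (kbytes) =", max_total_kib)
```

With these assumptions both values come out as 2097152, matching the ipcs -l output above, so the settings themselves appear to have been applied.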
Here is some example output from free, meminfo, smem, etc. a few minutes before it reaches OOM.
free -w
              total        used        free      shared     buffers       cache   available
Mem:        3844036      479428      263444     2711864       11324     3089840      585716
Swap:             0           0           0
cat /proc/meminfo
MemTotal: 3844036 kB
MemFree: 262680 kB
MemAvailable: 584940 kB
Buffers: 11324 kB
Cached: 3055620 kB
SwapCached: 0 kB
Active: 98764 kB
Inactive: 645792 kB
Active(anon): 732 kB
Inactive(anon): 394288 kB
Active(file): 98032 kB
Inactive(file): 251504 kB
Unevictable: 2708620 kB
Mlocked: 100 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 12 kB
Writeback: 0 kB
AnonPages: 386388 kB
Mapped: 162732 kB
Shmem: 2711864 kB
KReclaimable: 34208 kB
Slab: 68656 kB
SReclaimable: 34208 kB
SUnreclaim: 34448 kB
KernelStack: 4640 kB
PageTables: 5904 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1922016 kB
Committed_AS: 4068728 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 15104 kB
VmallocChunk: 0 kB
Percpu: 1040 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 72236 kB
DirectMap2M: 3938304 kB
DirectMap1G: 2097152 kB
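To put a number on the mismatch, the following sketch parses meminfo-style text and compares Shmem with the 2 GiB total we configured. The helper parse_meminfo and the constant LIMIT_KIB are hypothetical names, and the sample values are copied from the output above; on the target the text would come from /proc/meminfo instead.

```python
def parse_meminfo(text):
    """Return a dict of field name -> integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


# Sample copied from the meminfo output above.
SAMPLE = """\
MemTotal: 3844036 kB
Cached: 3055620 kB
Unevictable: 2708620 kB
Shmem: 2711864 kB
"""

LIMIT_KIB = 2097152  # the 2 GiB total that ipcs -l reports

info = parse_meminfo(SAMPLE)
excess_kib = info["Shmem"] - LIMIT_KIB
print("Shmem above configured total (kB):", excess_kib)
print("Shmem minus Unevictable (kB):", info["Shmem"] - info["Unevictable"])
```

For this sample, Shmem is 614712 kB above the configured total and within 3244 kB of Unevictable, which is the discrepancy we are asking about.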
smem
  PID User     Command                         Swap      USS      PSS      RSS
…..
 1306 weston   /usr/libexec/wpe-webkit-1.1        0    27192    51419    98928
 1379 weston   /usr/libexec/wpe-webkit-1.1        0   190268   214958   266040

Area                           Used      Cache   Noncache
firmware/hardware                 0          0          0
kernel image                      0          0          0
kernel dynamic memory       3030848    2938432      92416
userspace memory             555656     162732     392924
free memory                  257532     257532          0

Map                                       PIDs   AVGPSS      PSS
……
/usr/lib/libcrypto.so.3                     20      527    10544
/usr/lib/dri/iris_dri.so                     5     2196    10982
/usr/lib/dri/iHD_drv_video.so                1    20356    20356
/usr/lib/libWPEWebKit-1.1.so.0.2.6           5    14539    72697
[heap]                                      45     2060    92700
<anonymous>                                 45     5970   268688
Edit: Added df info for tmpfs. The tmpfs mounts shown by df do not show any extraordinary size increase.
/# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 9.8G 1.9G 7.4G 21% /
devtmpfs 1.9G 2.1M 1.9G 1% /dev
tmpfs 1.9G 636K 1.9G 1% /run
tmpfs 751M 5.8M 745M 1% /var/volatile
tmpfs 40K 0 40K 0% /mnt/.psplash
SwapTotal: 0 kB? If you need reliability, disable memory overcommit, disable the out-of-fuel, err, OOM killer, and provide your system with enough swap so it can run reliably. – Andrew Henle Feb 20 '23 at 16:29