
We are debugging a situation where cached/shared memory keeps increasing until the system triggers the OOM killer.

We have set shmmax and shmall in sysctl.conf, but without any visible effect. Do we need to enable something more for shmmax/shmall to work? Or can some part of the system go beyond this limit, and how strictly is it enforced? Can a buggy user-space application cause this, or only bugs in the kernel/drivers? The application that we are debugging uses graphics and video decoding. Can drivers go beyond the max limits?

kernel.shmmax = 2147483648
kernel.shmall = 524288

The Linux kernel is 5.15.71 (from Yocto meta-intel). Our system has 4 GB of RAM and no swap (we tried enabling swap, but it did not help with the stability of the system). We use Wayland/Weston but not systemd. We set the values in sysctl.conf and rebooted for them to take effect. We also confirmed the values with ipcs. We tried to limit shared memory to a maximum of 2 GB.

ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 2097152
max total shared memory (kbytes) = 2097152
min seg size (bytes) = 1
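
As a sanity check on the units: kernel.shmmax is given in bytes while kernel.shmall is counted in pages, so the two sysctl values only agree with the ipcs output if the page size is 4 KiB. Below is a minimal sketch of that arithmetic. The 4096-byte page size is an assumption (typical for x86-64, not confirmed in this post; `getconf PAGE_SIZE` would show the real value), and `shm_limits_kb` is just an illustrative helper name.

```python
# Values copied from the sysctl.conf lines above.
SHMMAX_BYTES = 2147483648   # kernel.shmmax, in bytes
SHMALL_PAGES = 524288       # kernel.shmall, in pages

# Assumption: 4 KiB pages; check with `getconf PAGE_SIZE` on the target.
PAGE_SIZE = 4096


def shm_limits_kb(shmmax_bytes, shmall_pages, page_size):
    """Return (max segment size, max total SysV shared memory) in kB."""
    max_seg_kb = shmmax_bytes // 1024
    max_total_kb = shmall_pages * page_size // 1024
    return max_seg_kb, max_total_kb


max_seg_kb, max_total_kb = shm_limits_kb(SHMMAX_BYTES, SHMALL_PAGES, PAGE_SIZE)
print(max_seg_kb, max_total_kb)
```

With a 4 KiB page both values come out as 2097152 kB, which matches the "max seg size" and "max total shared memory" lines that ipcs reports.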

Here is some example output from free, meminfo, smem, etc., taken a few minutes before the system reaches OOM.

free -w                                                                    
               total        used        free      shared     buffers       cache   available
Mem:         3844036      479428      263444     2711864       11324     3089840      585716
Swap:              0           0           0

cat /proc/meminfo

MemTotal: 3844036 kB
MemFree: 262680 kB
MemAvailable: 584940 kB
Buffers: 11324 kB
Cached: 3055620 kB
SwapCached: 0 kB
Active: 98764 kB
Inactive: 645792 kB
Active(anon): 732 kB
Inactive(anon): 394288 kB
Active(file): 98032 kB
Inactive(file): 251504 kB
Unevictable: 2708620 kB
Mlocked: 100 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 12 kB
Writeback: 0 kB
AnonPages: 386388 kB
Mapped: 162732 kB
Shmem: 2711864 kB
KReclaimable: 34208 kB
Slab: 68656 kB
SReclaimable: 34208 kB
SUnreclaim: 34448 kB
KernelStack: 4640 kB
PageTables: 5904 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1922016 kB
Committed_AS: 4068728 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 15104 kB
VmallocChunk: 0 kB
Percpu: 1040 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 72236 kB
DirectMap2M: 3938304 kB
DirectMap1G: 2097152 kB
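
To make the meminfo figures easier to compare, here is a small sketch that parses `/proc/meminfo`-style text and reports how much of RAM is Shmem and how much of "Cached" is left once Shmem is subtracted (Cached includes shmem/tmpfs pages). The sample text holds a few values copied from the dump above; the function names are only illustrative.

```python
import re

# A few fields copied from the /proc/meminfo dump above (values in kB).
SAMPLE = """MemTotal: 3844036 kB
Cached: 3055620 kB
Unevictable: 2708620 kB
Shmem: 2711864 kB
"""


def parse_meminfo(text):
    """Parse 'Name: value [kB]' lines into a dict of integers."""
    fields = {}
    for match in re.finditer(r"^(\S+):\s+(\d+)", text, flags=re.MULTILINE):
        fields[match.group(1)] = int(match.group(2))
    return fields


def shmem_summary(info):
    """Return (Shmem as percent of MemTotal, Cached minus Shmem in kB)."""
    share = 100.0 * info["Shmem"] / info["MemTotal"]
    non_shmem_cache = info["Cached"] - info["Shmem"]
    return round(share, 1), non_shmem_cache


info = parse_meminfo(SAMPLE)
print(shmem_summary(info))
```

On these numbers about 70.5% of RAM is Shmem and only 343756 kB of "Cached" is ordinary file cache. Shmem is also within a few MB of Unevictable, so most of what free reports as cache here is probably not memory the kernel can drop under pressure, especially with no swap.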

smem

PID   User    Command                      Swap  USS     PSS     RSS
…..
1306  weston  /usr/libexec/wpe-webkit-1.1  0     27192   51419   98928
1379  weston  /usr/libexec/wpe-webkit-1.1  0     190268  214958  266040

Area                   Used     Cache    Noncache
firmware/hardware      0        0        0
kernel image           0        0        0
kernel dynamic memory  3030848  2938432  92416
userspace memory       555656   162732   392924
free memory            257532   257532   0

Map                                 PIDs  AVGPSS  PSS
……
/usr/lib/libcrypto.so.3             20    527     10544
/usr/lib/dri/iris_dri.so            5     2196    10982
/usr/lib/dri/iHD_drv_video.so       1     20356   20356
/usr/lib/libWPEWebKit-1.1.so.0.2.6  5     14539   72697
[heap]                              45    2060    92700
<anonymous>                         45    5970    268688

Edit: Added df info for tmpfs. The tmpfs mounts shown by df do not show any extraordinary increase in usage.

/# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       9.8G  1.9G  7.4G  21% /
devtmpfs        1.9G  2.1M  1.9G   1% /dev
tmpfs           1.9G  636K  1.9G   1% /run
tmpfs           751M  5.8M  745M   1% /var/volatile
tmpfs            40K     0   40K   0% /mnt/.psplash
GuzZzt
  • SwapTotal: 0 kB? If you need reliability, disable memory overcommit, disable the out-of-fuel, err, OOM killer, and provide your system with enough swap so it can run reliably. – Andrew Henle Feb 20 '23 at 16:29
  • I tested adding 16 GB of swap; it delayed the crash by a few hours but did not make the system stable. But I have trouble seeing what is now using close to 20 GB of memory... Things seem to be cleared/swapped out. Here are some values from when 2-3 GB of swap was left: MemTotal: 3844024 kB MemFree: 108624 kB MemAvailable: 173040 kB Cached: 3356196 kB SwapCached: 30340 kB Active: 339312 kB Inactive: 3124336 kB Active(anon): 294284 kB Inactive(anon): 3121236 kB SwapTotal: 16777212 kB SwapFree: 2978812 kB Shmem: 3303908 kB – GuzZzt Feb 21 '23 at 15:20

1 Answer


SHMMAX and SHMALL won't constrain the size of your various tmpfs mounts. They only limit System V shared memory segments (those created with shmget).

Since tmpfs lives completely in the page cache and on swap, all tmpfs pages will be shown as “Shmem” in /proc/meminfo and “Shared” in free(1).

Use the df utility to check how many of these filesystems are actually mounted on your system and, if needed, limit their maximum size with the size= option of the corresponding mount.

Of course, if your application uses this kind of filesystem and no swap space is available, the application might well block or stop processing because it finds no space left on the device.
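
A rough sketch of that check in Python, in case df is not convenient to script on the target: it picks the tmpfs and devtmpfs entries out of `/proc/mounts`-style text and measures their usage with `os.statvfs`. The sample text is illustrative only (the mount points come from the df output in the question, but the ext4 type and the mount options are assumptions), and the helper names are made up for this example.

```python
import os

# Illustrative /proc/mounts-style text. Mount points are taken from the df
# output in the question; the ext4 type and the options are assumptions.
SAMPLE_MOUNTS = """\
/dev/root / ext4 rw,relatime 0 0
devtmpfs /dev devtmpfs rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
tmpfs /var/volatile tmpfs rw,relatime 0 0
"""


def tmpfs_mount_points(mounts_text):
    """Return mount points of tmpfs/devtmpfs entries, in file order."""
    points = []
    for line in mounts_text.splitlines():
        parts = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(parts) >= 3 and parts[2] in ("tmpfs", "devtmpfs"):
            points.append(parts[1])
    return points


def used_kb(path):
    """Used space of the filesystem containing path, in kB."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize // 1024


print(tmpfs_mount_points(SAMPLE_MOUNTS))
```

On a live system you would pass the contents of `/proc/mounts` to `tmpfs_mount_points` and call `used_kb` on each result. If the sum stays far below the Shmem value in /proc/meminfo, as the df output in the question suggests, the growth is coming from shared memory that is not visible as a mounted tmpfs.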

MC68020
  • We looked into tmpfs and could not see any increase in usage. The size is large but usage is always very small. Is tmpfs the only shared/cached memory not affected by shmmax/shmall? – GuzZzt Feb 21 '23 at 10:35
  • Not the only one: there are also GEM buffers. See the accepted answer at https://unix.stackexchange.com/questions/482795/can-i-see-the-amount-of-memory-which-is-allocated-as-gem-buffers – MC68020 Feb 21 '23 at 11:48
  • Without swap, we see that GEM objects increase until OOM. With swap, they are cleared/swapped out just before OOM; i915_gem_objects in debugfs goes down a lot at that point. But then something else eats up all the swap and the system still reaches OOM. I don't see anything about limiting GEM in that answer thread. – GuzZzt Feb 21 '23 at 15:36
  • Oh come on! Your question was: "What shared memory is not controlled by SHMMAX/SHMALL". My answer was: tmpfs and GEM buffers. I only gave the link to justify my statement regarding GEM buffers. If you are now looking for a way to limit GEM buffers, then that is IMHO another question. – MC68020 Feb 21 '23 at 15:45
  • I think what I wanted to know was a complete (or near complete) list of things that are outside of shmmax/shmall control, and how to control them. tmpfs seems easy to set a max limit on, but GEM seems to be able to do whatever it wants... But I will take this as a lesson in writing clear questions and accept this answer. – GuzZzt Feb 22 '23 at 15:13
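
For anyone wanting to correlate the Shmem growth with the GEM object count mentioned in the comments, here is a hedged monitoring sketch. It samples Shmem from /proc/meminfo, computes an average growth rate, and prints the first line of the i915 debugfs file next to it. The debugfs path is an assumption (debugfs mounted at /sys/kernel/debug, GPU as DRM card 0, read as root), and the file's content is printed raw rather than parsed because its format depends on the kernel version.

```python
import re
import time

# Assumption: debugfs is mounted at /sys/kernel/debug and the GPU is card 0.
GEM_OBJECTS_PATH = "/sys/kernel/debug/dri/0/i915_gem_objects"


def read_shmem_kb(meminfo_text):
    """Extract the Shmem value (kB) from /proc/meminfo-style text."""
    match = re.search(r"^Shmem:\s+(\d+)\s+kB", meminfo_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no Shmem line found")
    return int(match.group(1))


def growth_kb_per_min(samples):
    """Average growth between the first and last (seconds, kB) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (v1 - v0) * 60.0 / (t1 - t0)


def first_line(path):
    """Return the first line of a file, or a note if it cannot be read."""
    try:
        with open(path) as f:
            return f.readline().strip()
    except OSError as exc:
        return f"unavailable ({exc.__class__.__name__})"


def monitor(interval_s=60, count=10):
    """Print Shmem, its growth rate, and the raw GEM summary line."""
    samples = []
    for _ in range(count):
        with open("/proc/meminfo") as f:
            samples.append((time.monotonic(), read_shmem_kb(f.read())))
        rate = round(growth_kb_per_min(samples), 1)
        print(samples[-1][1], "kB", rate, "kB/min", "|",
              first_line(GEM_OBJECTS_PATH))
        time.sleep(interval_s)
```

Calling `monitor()` while the application runs should show whether the two numbers rise together. This only helps locate the leak; it does not cap GEM memory.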