I have a mystery: what is using 6GB of my swap? My kernel version is 4.15.9-300.fc27.x86_64.
This happened following some crashes. dmesg shows I had a segfault in a gnome-shell process (which belonged to gdm) and later in some firefox processes (Chrome_~dThread, in libxul.so). coredumpctl -r shows no other crashes on my current boot.
1. free and df -t tmpfs
# free -h
              total        used        free      shared  buff/cache   available
Mem:           7.7G        1.2G        290M        5.4G        6.1G        761M
Swap:          7.8G        6.0G        1.8G
# swapoff -a
swapoff: /dev/dm-1: swapoff failed: Cannot allocate memory
# df -h -t tmpfs
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           3.9G   17M  3.9G   1% /dev/shm
tmpfs           3.9G  1.9M  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs           3.9G   40K  3.9G   1% /tmp
tmpfs           786M   20K  786M   1% /run/user/1000
I also manually checked the mount namespace of every process for any extra tmpfs. There was no other mounted tmpfs (or they were the same ones, so only 17M, and there were fewer than 10 different mount namespaces).
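The manual check across mount namespaces could be scripted roughly as follows. This is only a sketch: the function name list_tmpfs_mounts is made up for illustration, and it assumes the /proc/PID/mountinfo layout described in proc(5), where the filesystem type is the field after the lone "-" separator. It collapses duplicates by device number, since the same tmpfs instance shows the same major:minor in every namespace.

```shell
# Print "major:minor mountpoint" once per distinct tmpfs instance found in
# the given mountinfo files (hypothetical helper, not an existing tool).
list_tmpfs_mounts() {
    # cat skips files that vanish when a process exits mid-scan.
    cat "$@" 2>/dev/null | awk '
        {
            # Fields: id parent major:minor root mountpoint options
            #         [optional fields...] - fstype source superoptions
            fstype = ""
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { fstype = $(i + 1); break }
            }
            # $3 is the device number, $5 the mount point.
            if (fstype == "tmpfs" && !seen[$3]++) print $3, $5
        }
    '
}
```

It would be run as root with something like list_tmpfs_mounts /proc/[0-9]*/mountinfo. It only finds the mounts; checking usage of one that is not visible in the current namespace would still need something like nsenter -t PID -m df.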
2. ipcs
# ipcs --human
------ Message Queues --------
key msqid owner perms size messages
------ Shared Memory Segments --------
key shmid owner perms size nattch status
0x00000000 20643840 alan-sysop 600 512K 2 dest
0x00000000 22970369 alan-sysop 600 36K 2 dest
0x00000000 20774914 alan-sysop 600 512K 2 dest
0x00000000 20905987 alan-sysop 600 3.7M 2 dest
0x00000000 23461892 alan-sysop 600 2M 2 dest
0x00000000 20873221 alan-sysop 600 3.7M 2 dest
0x00000000 22511622 alan-sysop 600 2M 2 dest
0x00000000 28278791 alan-sysop 600 60K 2 dest
0x00000000 23003144 alan-sysop 600 36K 2 dest
0x00000000 27394057 alan-sysop 600 60K 2 dest
0x00000000 29622282 alan-sysop 600 156K 2 dest
0x00000000 27426828 alan-sysop 600 60K 2 dest
0x00000000 28246029 alan-sysop 600 60K 2 dest
0x00000000 29655054 alan-sysop 600 156K 2 dest
0x00000000 29687823 alan-sysop 600 512K 2 dest
------ Semaphore Arrays --------
key semid owner perms nsems
0x002fa327 98304 root 600 2
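The segments listed above add up to something on the order of 14 MB, nowhere near 6GB. To get a total without adding up ipcs output by hand, a sketch like the following could read /proc/sysvipc/shm directly. The function name sysv_shm_totals is made up, and the code assumes the header line names its columns "size", "rss" and "swap" with values in bytes, which I believe is the case on recent kernels; a missing column is counted as zero.

```shell
# Sum size, resident and swapped bytes over all SysV shared memory segments.
# Hypothetical helper; the optional argument exists only to allow testing
# against a saved copy of the file.
sysv_shm_totals() {
    awk '
        # Look a column up by its header name; 0 if the kernel lacks it.
        function val(name) { return (name in col) ? $(col[name]) : 0 }
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        { n++; size += val("size"); rss += val("rss"); swap += val("swap") }
        END {
            printf "segments=%d size_kb=%d rss_kb=%d swap_kb=%d\n",
                   n, size / 1024, rss / 1024, swap / 1024
        }
    ' "${1:-/proc/sysvipc/shm}"
}
```

Reading the file as root should cover segments owned by all users, including ones already marked for destruction ("dest") that are still attached.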
3. Process memory
The per-process swap usage script says process memory only accounts for 54MB of swap:
PID=1 swapped 2292 KB (systemd)
PID=605 swapped 4564 KB (systemd-udevd)
PID=791 swapped 324 KB (auditd)
PID=793 swapped 148 KB (audispd)
PID=797 swapped 232 KB (sedispatch)
PID=816 swapped 120 KB (mcelog)
PID=824 swapped 1544 KB (ModemManager)
PID=826 swapped 152 KB (rngd)
PID=827 swapped 300 KB (avahi-daemon)
PID=829 swapped 1688 KB (abrtd)
PID=830 swapped 836 KB (systemd-logind)
PID=831 swapped 432 KB (dbus-daemon)
PID=843 swapped 368 KB (chronyd)
PID=848 swapped 312 KB (avahi-daemon)
PID=854 swapped 476 KB (gssproxy)
PID=871 swapped 1140 KB (abrt-dump-journ)
PID=872 swapped 1280 KB (abrt-dump-journ)
PID=873 swapped 1236 KB (abrt-dump-journ)
PID=874 swapped 14196 KB (firewalld)
PID=911 swapped 592 KB (mbim-proxy)
PID=926 swapped 1356 KB (NetworkManager)
PID=943 swapped 17936 KB (libvirtd)
PID=953 swapped 200 KB (atd)
PID=955 swapped 560 KB (crond)
PID=1267 swapped 284 KB (dnsmasq)
PID=1268 swapped 316 KB (dnsmasq)
PID=10397 swapped 160 KB (gpg-agent)
PID=14862 swapped 552 KB (systemd-journal)
PID=18131 swapped 28 KB (login)
PID=18145 swapped 384 KB (bash)
Overall swap used: 54008 KB
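For reference, a report in this shape can be produced from the VmSwap line of /proc/PID/status. The following is a minimal sketch, not the script used above; proc_swap_report is a made-up name. Note that, per proc(5), VmSwap does not include swapped-out shmem pages, so a small total here cannot rule out tmpfs, SysV shm, memfds or other shared memory as the real swap user.

```shell
# Print swap per process and an overall total, using VmSwap from
# <procroot>/<pid>/status (hypothetical helper; procroot defaults to /proc).
proc_swap_report() (
    root=${1:-/proc}
    total=0
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # Kernel threads have no VmSwap line, so kb stays empty for them.
        kb=$(awk '/^VmSwap:/ { print $2 }' "$status" 2>/dev/null)
        name=$(awk '/^Name:/ { print $2 }' "$status" 2>/dev/null)
        [ -n "$kb" ] && [ "$kb" -gt 0 ] || continue
        pid=${status%/status}
        pid=${pid##*/}
        echo "PID=$pid swapped $kb KB ($name)"
        total=$((total + kb))
    done
    echo "Overall swap used: $total KB"
)
```

The body runs in a subshell so its variables do not leak into the calling shell. Processes are listed in glob order rather than numeric PID order.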
So far I am assuming that there is no negligent program which used umount -l on a full tmpfs. I haven't tried to scrape /proc/*/fd for anyone holding such a hidden tmpfs open. I suppose I am also assuming no-one has constructed a giant memfd and is holding it open... haha why would I even suspect such a thing... sob.
The memfd names attached to processes seem innocent to me:
# ls -l /proc/*/fd/* 2>/dev/null|grep /memfd:
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/37 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/53 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/54 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/55 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/57 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/20889/fd/60 -> /memfd:xshmfence (deleted)
lrwx------. 1 alan-sysop alan-sysop 64 Mar 18 22:52 /proc/21004/fd/6 -> /memfd:pulseaudio (deleted)
These memfds seem innocent because process 20889 is my current Xorg, which post-dates the 6GB of swap. Similarly, process 21004 is indeed my pulseaudio process, and it was created after the 6GB of swap had built up.
In theory the ones I'm worried about could also be in limbo though, attached to a unix socket message and never read.
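For the memfds that are still attached to a process, sizes can at least be read by running stat through the /proc/PID/fd link. Here is a sketch of that, assuming GNU stat (for -c) and using a made-up function name, list_memfds. It prints the apparent size in bytes and the allocated block count as stat reports them; I have not verified how swapped-out pages are reflected in the block count. It cannot see a memfd that exists only inside an unread unix socket message.

```shell
# List memfd-backed file descriptors as: path size_bytes blocks link_target
# (hypothetical helper; the proc root argument defaults to /proc).
list_memfds() (
    root=${1:-/proc}
    for fd in "$root"/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            /memfd:*) ;;
            *) continue ;;
        esac
        # -L follows the magic link to the memfd inode itself;
        # "? ?" is printed if the fd went away or cannot be examined.
        sizes=$(stat -L -c '%s %b' "$fd" 2>/dev/null) || sizes='? ?'
        echo "$fd $sizes $target"
    done
)
```

Run as root so that the fd directories of other users' processes are readable.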
EDIT1
After stopping systemd-logind - which native Xorg responds to by dying - and restarting Xorg, I see the entire 6GB of swap wiped out.
Note I forgot I needed to start logind again. Although Lennart told me logind is not supposed to be bus-activated, logind immediately restarted. This is from journalctl -b, i.e. the system log, with no messages removed in between:
Mar 18 23:14:12 alan-laptop systemd[1]: Stopped Login Service.
Mar 18 23:14:12 alan-laptop dbus-daemon[831]: [system] Activating via systemd: service name='org.freedesktop.login1' unit='dbus-org.freedesktop.login1
Mar 18 23:14:12 alan-laptop systemd[1]: Starting Login Service...
This is relevant in that logind then went through a cycle of a few crashes. This is expected for this version of logind (PRs to fix it have been merged upstream, following my issue reports).
So this doesn't quite isolate an individual cause, and I really should have checked the fds logind was holding before killing it.
Question
Is there any possible swap user I have missed in the above checks? (The non-destructive ones, prior to EDIT1).
Is there a better way to get usage reports for any of the possible users I listed above? That is, either an alternative that corrects some inaccuracy I haven't noticed, or something that is easier to run and gives a quick result when this happens again?
Does anyone have a nice script to check for fds holding open a "hidden" tmpfs (a tmpfs which was detached with umount -l)?
Does anyone have a nice way to check memory usage of memfds?
Is there any way to check for massive memfds having been left in limbo in an unread unix socket message? (Did any of these geniuses think about this at all when implementing memfds, which were explicitly intended for passing over unix sockets?)
EDIT2: Am I right to guess that a file descriptor of a graphics device (DRM) can hold a reference to swappable memory? Note logind holds such file descriptors.
lsof add any value with regard to open memfds? Sometimes I dump its output to a file, delete stuff I know is irrelevant, and review what's left. – Chris Davies Mar 18 '18 at 23:17
stat --dereference /proc/$PID/fd/$FD . – sourcejedi Mar 18 '18 at 23:37
lsof 4.89. https://unix.stackexchange.com/a/431987/29483 – sourcejedi Mar 19 '18 at 00:25