
We have a server that runs a huge server process to cache some data. Sometimes the process crashes, and the memory is not freed unless the machine is rebooted.

How can I free this stuck memory without rebooting?

Before the crash, if I sum the RSS column from ps, the result is close to the used memory reported by free. After the crash there is a huge difference.

  • OS: RHEL 7.1
  • Kernel: 3.10.0-229.el7.x86_64

Here is the situation before the crash:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       22834        4994         191        4185        8679
Swap:          4095          51        4044
$ ps -aux | awk 'NR > 1 { s += $6 } END { print s / 1024 }'
22534.5

"Crash" message in dmesg:

[58306.926623] Out of memory: Kill process 25047 (java) score 383 or sacrifice child
[58306.926727] Killed process 25047 (java) total-vm:32134064kB, anon-rss:10103908kB, file-rss:0kB

Memory usage after the crash:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21188       10378          32         447       10567
Swap:          4095           0        4095
$ ps -aux | awk 'NR > 1 { s += $6 } END { print s / 1024 }'
250.039

I also looked at /proc/meminfo. It doesn't show the 20GB used anywhere! The biggest overall uses are Cached, AnonPages and Slab, but they only add up to 0-2 GB in total.

Note that tmpfs memory and System V IPC shared memory are all counted as "shared" (the Shmem field in /proc/meminfo).

/proc/vmstat has some lower-level nr_* counters, but they tell the same story as meminfo: no explanation for the 20 GB used.
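
For what it's worth, here is a rough cross-check (a sketch only; the chosen fields are an approximation of what should make up "used" memory, and anything allocated directly by a driver, such as ballooned pages, will not appear in any of them):

$ awk '/^MemTotal:/ { total = $2 }
       /^MemFree:/  { free = $2 }
       /^(Buffers|Cached|AnonPages|Slab|KernelStack|PageTables):/ { seen += $2 }
       END { printf "used: %.1f GB  accounted for: %.1f GB\n", (total - free) / 1048576, seen / 1048576 }' /proc/meminfo

With the post-crash /proc/meminfo from the appendix this reports roughly 21 GB used but less than 0.3 GB accounted for.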

Memory chart:

[memory usage chart, hosted on Imgur]

It probably sent some alert e-mails when usage surpassed 90%, but that was after working hours, so no one took any action. The gap near 2 o'clock is due to the metrics exporter service being killed by the lack of memory; it relaunched itself after a while. The chart excludes cache and buffers, so when it reaches 90% that means (used - cache_and_buffer) / total.
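
For reference, the plotted percentage can be reproduced from free (a sketch, assuming the procps-ng column order shown above: total, used, free, shared, buff/cache, available):

$ free -m | awk '/^Mem:/ { printf "%.1f%%\n", ($2 - $4 - $6) * 100 / $2 }'

With the pre-crash numbers above this gives about 71%.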

Update!

Before crash:

[user@hostname ~]$ vmware-toolbox-cmd stat balloon
0 MB

After crash:

[user@hostname ~]$ vmware-toolbox-cmd stat balloon
20809 MB

See Tracking down Linux memory usage when not showing up in cache (https://unix.stackexchange.com/questions/323186/tracking-down-linux-memory-usage-when-not-showing-up-in-cache).

Is VMware kidding me? How can it allow the provider to allocate more memory resources than the host actually has?
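
If the balloon driver really is what is holding this memory, the least invasive thing I can think of to try next time, instead of rebooting, is to unload it (a sketch only; I am assuming the vmw_balloon module backs these allocations, and the hypervisor may simply re-inflate the balloon or start swapping the guest on the host side):

$ lsmod | grep vmw_balloon          # is the balloon driver loaded?
$ vmware-toolbox-cmd stat balloon   # how much guest memory is currently ballooned
$ sudo rmmod vmw_balloon            # unloading the module should deflate the balloon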

Appendix

Details to help back up the above (and maybe provide some more search keywords). After the crash:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21188       10378          32         447       10567
Swap:          4095           0        4095
$ ps -aux | awk 'NR > 1 { s += $6 } END { print s / 1024 }'
250.039
$ sudo ps -aux | awk 'NR > 1 { s += $6 } END { print s / 1024 }'
252.645
$ df -h | grep tmpfs
devtmpfs                       16G     0   16G   0% /dev
tmpfs                          16G     0   16G   0% /dev/shm
tmpfs                          16G   33M   16G   1% /run
tmpfs                          16G     0   16G   0% /sys/fs/cgroup

$ echo 3 | sudo tee /proc/sys/vm/drop_caches
3
$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21186       10685          32         142       10635
Swap:          4095           0        4095

$ cat /proc/meminfo 
MemTotal:       32782584 kB
MemFree:        10932472 kB
MemAvailable:   10886376 kB
Buffers:               0 kB
Cached:            77292 kB
SwapCached:            0 kB
Active:            98160 kB
Inactive:         107692 kB
Active(anon):      71652 kB
Inactive(anon):    90228 kB
Active(file):      26508 kB
Inactive(file):    17464 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4194300 kB
SwapFree:        4194300 kB
Dirty:                20 kB
Writeback:             0 kB
AnonPages:        128604 kB
Mapped:            52428 kB
Shmem:             33320 kB
Slab:              77956 kB
SReclaimable:      32772 kB
SUnreclaim:        45184 kB
KernelStack:        4112 kB
PageTables:         6140 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    20585592 kB
Committed_AS:     414536 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      212300 kB
VmallocChunk:   34359490812 kB
HardwareCorrupted:     0 kB
AnonHugePages:      2048 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       59328 kB
DirectMap2M:    33495040 kB

Another time it crashed, I tried running sync before drop_caches, but it did not help:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21700        9057           6        1256       10043
Swap:          4095          40        4055
$ sync; echo 2 | sudo tee /proc/sys/vm/drop_caches  # drop slabs
2
$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21682        9508           6         822       10089
Swap:          4095          40        4055
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches  # drop page cache too
3
$ free -m
              total        used        free      shared  buff/cache   available
Mem:          32014       21686       10197           6         130       10155
Swap:          4095          40        4055

System V IPC shared memory is only using about 15 kilobytes:

$ sudo ipcs
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages    

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x9f4efb88 0          root       777        88         3                       
0x75ebee7a 98305      root       777        1544       1                       
0xa763f6de 131074     root       777        1544       1                       
0x5ea28805 163843     root       777        1544       1                       
0x6e7496e4 196612     root       777        1544       1                       
0x73e8d447 229381     root       777        1544       1                       
0x056bc027 262150     root       777        1544       1                       
0x9ed89c09 294919     root       777        1544       1                       
0x3af6b86e 327688     root       777        1544       1                       
0x97b75d57 360457     root       777        1544       1                       

------ Semaphore Arrays --------
key        semid      owner      perms      nsems     
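
(As a quick sanity check of that figure, the bytes column can be summed; a sketch, assuming the column layout above. Here it comes to 88 + 9 × 1544 ≈ 14 kB.)

$ sudo ipcs -m | awk '$5 ~ /^[0-9]+$/ { s += $5 } END { printf "%.1f kB\n", s / 1024 }'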

The /proc/meminfo from before the crash and /proc/vmstat are long and not any more helpful, so I put them in a gist instead: https://gist.github.com/tiagoapimenta/e88bfd7ead2437e2862f293cb18bc7cf

  • Can you check the output of ipcs and post it here? If memory is being managed using System V IPCs (shm) that might explain what you're seeing, not freeing it after a crash... In that case, you can try ipcrm to free it too. – filbranden Oct 25 '18 at 14:49
  • @sourcejedi kernel: 3.10.0-229.el7.x86_64 OS: RHEL 7.1 – Tiago Pimenta Oct 25 '18 at 14:56
  • @filipe-brandenburger Thanks for the tip, I already rebooted the machine, next time it crashes I'll provide you this data. – Tiago Pimenta Oct 25 '18 at 14:56
  • Output of ipcs during normal execution is also useful to understand whether your caching server uses it at all... (If it doesn't, I wouldn't expect it to be the source of the leak...) – filbranden Oct 25 '18 at 15:09
  • When you do drop_caches, consider doing sync first. Dirty caches cannot be dropped. I don't know that it's the problem but it would be nice to rule out. Some other users seem to have large caches that are not attributed to anything, but which disappear when you force the slabs caches to be dropped. https://unix.stackexchange.com/questions/442840/slab-cache-affecting-used-memory-much-more-than-proc-meminfo-suggests – sourcejedi Oct 25 '18 at 15:37
  • Actually your /proc/meminfo shows Shmem: 33320 kB which matches the 33M from /run, so it doesn't look like IPC shmem is really involved here... – filbranden Oct 25 '18 at 16:14
  • @sourcejedi On another occasion I already did sync before drop_caches and unfortunately the result was not much different; more than 20G of memory stays stuck. In fact most partitions are xfs. To eliminate any doubt I'll try again next time, maybe with 2 instead of 3 on drop_caches as suggested, then I'll bring the results. – Tiago Pimenta Oct 26 '18 at 10:37
  • @TiagoPimenta Thanks for the update. echo 2 | sudo tee /proc/sys/vm/drop_caches is fine as far as I am concerned, I was only interested in dropping slabs. As far as I know, dropping page cache will not help. (3 = drop the slab caches AND the page cache). – sourcejedi Oct 29 '18 at 08:54
  • ZFS can use a lot of memory, and might not show up in these tools. – muru Oct 29 '18 at 09:19
  • @sourcejedi Unfortunately sync before drop_caches didn't work. – Tiago Pimenta Oct 29 '18 at 17:25
  • @muru I do not have ZFS, but I have XFS; I'll look for some proc flags that could cause something similar to what you said about ZFS. – Tiago Pimenta Oct 29 '18 at 17:26
  • Do you have /proc/vmstat? meminfo is slightly synthetic, I think vmstat shows more about the raw counters. In general it seems very strange behaviour from the kernel, maybe it is a bug. I wonder if ServerFault would have more to say about RHEL servers... at least maybe if it was a known bug fixed in a later version of RHEL. – sourcejedi Nov 21 '18 at 16:04
  • @sourcejedi Can you see the linked file? I extracted it so the question wouldn't be so huge. – Tiago Pimenta Nov 21 '18 at 16:46
  • There's https://www.kernel.org/doc/html/v4.18/dev-tools/kmemleak.html . I think the kernel-debug package can be booted with kmemleak=on, and you could go from there. Probably not a good idea for a production server though :-). – sourcejedi Nov 21 '18 at 17:16
  • I assume you understand the "crash" message (out of memory killer). Seems like something is "leaking" kernel memory, eventually the kernel out of memory killer is called, so it kills your java process. It managed to get rid of the ~10GB your process was actually using, hence why you have ~10GB free afterwards. But it didn't fix the real memory leak. – sourcejedi Nov 21 '18 at 17:20
  • Yeah, it has -Xmx24g. I do not know why it is crashing, but the main problem is that it requires a reboot; I want to simply relaunch the application... – Tiago Pimenta Nov 21 '18 at 17:22
  • I don't use RHEL or memory ballooning, but maybe you can read https://unix.stackexchange.com/questions/323186/tracking-down-linux-memory-usage-when-not-showing-up-in-cache and see if that applies. The linked answer is VMware-specific – sourcejedi Nov 21 '18 at 20:32
  • I tried sudo rmmod vmw_balloon, let's see if it fix the problem – Tiago Pimenta Nov 22 '18 at 11:03
  • I've edited meminfo back in, because I think the question needs to at least include the Slab figure. I suggest we close this story as a duplicate; then if you have a specific followup question about VMware design / your provider / why the ballooned memory mysteriously isn't a problem after you reboot, you can ask a new question and link to this one. – sourcejedi Nov 22 '18 at 11:03
  • @sourcejedi It seems too messy, but if you think it will be better, so be it. – Tiago Pimenta Nov 22 '18 at 11:05
  • I know, sorry... It could be cleaned up more if you want to spend the time, but the quickest way to explain was to put the meminfo back in. I really want to keep that Slab figure in the question. – sourcejedi Nov 22 '18 at 11:07
  • @sourcejedi I would like to try sudo rmmod vmw_balloon before accepting the other answer, since I do not have access to VMware; that post indeed helped to track it down but did not solve my case. – Tiago Pimenta Nov 22 '18 at 12:13
  • Dude, it crashes everywhere! Even after I disabled this module on every server, they didn't stop crashing! I believe the only acceptable answer is to change VMware itself, unfortunately... – Tiago Pimenta Nov 22 '18 at 13:22

0 Answers