
We have some EC2 servers that develop a memory leak over days or weeks. Eventually many GB of memory are in use (according to tools like free and htop), and if we don't restart the server, our processes start getting OOM-killed.

One such server has 15 GB of RAM. Here's the output of free -m:

             total       used       free     shared    buffers     cached
Mem:         15039       3921      11118          0          0          7
-/+ buffers/cache:       3913      11126
Swap:            0          0          0

This server is sitting idle; I've killed most userland processes. No process in htop shows more than 100k VIRT. I recently ran echo 3 > /proc/sys/vm/drop_caches, to no effect (that's why buffers and cached are so small). Additionally (the exact checks are sketched after this list):

  • Poking around in /proc/slabinfo and slabtop doesn't show anything promising
  • There's nothing in /run/shm
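
For reference, here's roughly how to run those checks (a sketch; slabtop's flags are per the procps shipped with 14.04, and the awk field positions assume the standard /proc/slabinfo 2.1 layout):

# Largest slab caches, printed once, sorted by cache size
slabtop -o -s c | head -20

# Same accounting straight from /proc/slabinfo: cache name and approximate bytes
awk 'NR > 2 { print $1, $3 * $4 }' /proc/slabinfo | sort -k2 -nr | head -10

# Anything hiding in the tmpfs mounts?
du -sh /run/shm /run 2>/dev/null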

Here's the output of cat /proc/meminfo:

MemTotal:       15400880 kB
MemFree:        11385688 kB
Buffers:             564 kB
Cached:             7792 kB
SwapCached:            0 kB
Active:            27668 kB
Inactive:           2012 kB
Active(anon):      21368 kB
Inactive(anon):      380 kB
Active(file):       6300 kB
Inactive(file):     1632 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         21380 kB
Mapped:             7208 kB
Shmem:               380 kB
Slab:              39260 kB
SReclaimable:      16456 kB
SUnreclaim:        22804 kB
KernelStack:        1352 kB
PageTables:         2872 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     7700440 kB
Committed_AS:      39072 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       30336 kB
VmallocChunk:   34359691552 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       36864 kB
DirectMap2M:    15822848 kB

You can see that there's a large gap between MemFree and MemTotal: about 4 GB is in use, yet the other meminfo metrics account for only about 70 MB of it.
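
To quantify the gap, you can sum the fields meminfo does account for and compare them against what's in use (a rough sketch; it deliberately ignores a few minor fields and any page-cache overlap):

awk '
  /^(MemTotal|MemFree|Buffers|Cached|AnonPages|Slab|KernelStack|PageTables):/ { m[$1] = $2 }
  END {
    used = m["MemTotal:"] - m["MemFree:"]
    accounted = m["Buffers:"] + m["Cached:"] + m["AnonPages:"] + m["Slab:"] + m["KernelStack:"] + m["PageTables:"]
    printf "used: %d kB, accounted: %d kB, unaccounted: %d kB\n", used, accounted, used - accounted
  }' /proc/meminfo

On the numbers above, that prints roughly 4015192 kB used but only 73220 kB accounted for, i.e. about 3.9 GB missing.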

Any idea where this memory went, or how I can debug further?

More server info, in case it's relevant:

$ lsb_release -d
Description:    Ubuntu 14.04.1 LTS
$ uname -r
3.13.0-36-generic

Update: Here are some more commands and their output:

# dmesg | fgrep 'Memory:'
[    0.000000] Memory: 15389980K/15728244K available (7373K kernel code, 1144K rwdata, 3404K rodata, 1336K init, 1440K bss, 338264K reserved)

# awk '{print $2 " " $1}' /proc/modules  | sort -nr | head -5
106678 psmouse
97812 raid6_pq
86484 raid456
69418 floppy
55624 aesni_intel

# cat /proc/mounts | grep tmp
udev /dev devtmpfs rw,relatime,size=7695004k,nr_inodes=1923751,mode=755 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1540088k,mode=755 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
# df -h /dev /run /sys/fs/cgroup /run/lock /run/shm /run/user
Filesystem      Size  Used Avail Use% Mounted on
udev            7.4G   12K  7.4G   1% /dev
tmpfs           1.5G  368K  1.5G   1% /run
none            4.0K     0  4.0K   0% /sys/fs/cgroup
none            5.0M     0  5.0M   0% /run/lock
none            7.4G     0  7.4G   0% /run/shm
none            100M     0  100M   0% /run/user

Update 2: Here's the entire output of ps aux:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  33636  2368 ?        Ss    2015   0:03 /sbin/init
root         2  0.0  0.0      0     0 ?        S     2015   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S     2015   0:11 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/0:0H]
root         7  0.0  0.0      0     0 ?        S     2015   1:31 [rcu_sched]
root         8  0.0  0.0      0     0 ?        S     2015   0:30 [rcuos/0]
root         9  0.0  0.0      0     0 ?        S     2015   0:25 [rcuos/1]
root        10  0.0  0.0      0     0 ?        S     2015   0:33 [rcuos/2]
root        11  0.0  0.0      0     0 ?        S     2015   0:25 [rcuos/3]
root        12  0.0  0.0      0     0 ?        S     2015   0:14 [rcuos/4]
root        13  0.0  0.0      0     0 ?        S     2015   0:14 [rcuos/5]
root        14  0.0  0.0      0     0 ?        S     2015   0:14 [rcuos/6]
root        15  0.0  0.0      0     0 ?        S     2015   0:33 [rcuos/7]
root        16  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/8]
root        17  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/9]
root        18  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/10]
root        19  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/11]
root        20  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/12]
root        21  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/13]
root        22  0.0  0.0      0     0 ?        S     2015   0:00 [rcuos/14]
root        23  0.0  0.0      0     0 ?        S     2015   0:00 [rcu_bh]
root        24  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/0]
root        25  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/1]
root        26  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/2]
root        27  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/3]
root        28  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/4]
root        29  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/5]
root        30  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/6]
root        31  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/7]
root        32  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/8]
root        33  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/9]
root        34  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/10]
root        35  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/11]
root        36  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/12]
root        37  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/13]
root        38  0.0  0.0      0     0 ?        S     2015   0:00 [rcuob/14]
root        39  0.0  0.0      0     0 ?        S     2015   0:01 [migration/0]
root        40  0.0  0.0      0     0 ?        S     2015   0:06 [watchdog/0]
root        41  0.0  0.0      0     0 ?        S     2015   0:05 [watchdog/1]
root        42  0.0  0.0      0     0 ?        S     2015   0:00 [migration/1]
root        43  0.0  0.0      0     0 ?        S     2015   0:08 [ksoftirqd/1]
root        45  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/1:0H]
root        46  0.0  0.0      0     0 ?        S     2015   0:05 [watchdog/2]
root        47  0.0  0.0      0     0 ?        S     2015   0:01 [migration/2]
root        48  0.0  0.0      0     0 ?        S     2015   0:08 [ksoftirqd/2]
root        50  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/2:0H]
root        51  0.0  0.0      0     0 ?        S     2015   0:06 [watchdog/3]
root        52  0.0  0.0      0     0 ?        S     2015   0:01 [migration/3]
root        53  0.0  0.0      0     0 ?        S     2015   0:17 [ksoftirqd/3]
root        55  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/3:0H]
root        56  0.0  0.0      0     0 ?        S     2015   0:07 [watchdog/4]
root        57  0.0  0.0      0     0 ?        S     2015   0:01 [migration/4]
root        58  0.0  0.0      0     0 ?        S     2015   0:02 [ksoftirqd/4]
root        60  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/4:0H]
root        61  0.0  0.0      0     0 ?        S     2015   0:06 [watchdog/5]
root        62  0.0  0.0      0     0 ?        S     2015   0:01 [migration/5]
root        63  0.0  0.0      0     0 ?        S     2015   0:07 [ksoftirqd/5]
root        65  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/5:0H]
root        66  0.0  0.0      0     0 ?        S     2015   0:06 [watchdog/6]
root        67  0.0  0.0      0     0 ?        S     2015   0:01 [migration/6]
root        68  0.0  0.0      0     0 ?        S     2015   0:04 [ksoftirqd/6]
root        70  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/6:0H]
root        71  0.0  0.0      0     0 ?        S     2015   0:06 [watchdog/7]
root        72  0.0  0.0      0     0 ?        S     2015   0:02 [migration/7]
root        73  0.0  0.0      0     0 ?        S     2015   0:17 [ksoftirqd/7]
root        74  0.0  0.0      0     0 ?        S     2015   0:14 [kworker/7:0]
root        75  0.0  0.0      0     0 ?        S<    2015   0:00 [kworker/7:0H]
root        76  0.0  0.0      0     0 ?        S<    2015   0:00 [khelper]
root        77  0.0  0.0      0     0 ?        S     2015   0:00 [kdevtmpfs]
root        78  0.0  0.0      0     0 ?        S<    2015   0:00 [netns]
root        79  0.0  0.0      0     0 ?        S     2015   0:00 [xenwatch]
root        80  0.0  0.0      0     0 ?        S     2015   0:00 [xenbus]
root        81  0.0  0.0      0     0 ?        S     2015   0:39 [kworker/0:1]
root        82  0.0  0.0      0     0 ?        S<    2015   0:00 [writeback]
root        83  0.0  0.0      0     0 ?        S<    2015   0:00 [kintegrityd]
root        84  0.0  0.0      0     0 ?        S<    2015   0:00 [bioset]
root        86  0.0  0.0      0     0 ?        S<    2015   0:00 [kblockd]
root        88  0.0  0.0      0     0 ?        S<    2015   0:00 [ata_sff]
root        89  0.0  0.0      0     0 ?        S     2015   0:00 [khubd]
root        90  0.0  0.0      0     0 ?        S<    2015   0:00 [md]
root        91  0.0  0.0      0     0 ?        S<    2015   0:00 [devfreq_wq]
root        92  0.0  0.0      0     0 ?        S     2015   0:12 [kworker/1:1]
root        95  0.0  0.0      0     0 ?        S     2015   0:10 [kworker/4:1]
root        97  0.0  0.0      0     0 ?        S     2015   0:11 [kworker/6:1]
root        99  0.0  0.0      0     0 ?        S     2015   0:00 [khungtaskd]
root       100  0.0  0.0      0     0 ?        S     2015   7:26 [kswapd0]
root       101  0.0  0.0      0     0 ?        SN    2015   0:00 [ksmd]
root       102  0.0  0.0      0     0 ?        SN    2015   0:29 [khugepaged]
root       103  0.0  0.0      0     0 ?        S     2015   0:00 [fsnotify_mark]
root       104  0.0  0.0      0     0 ?        S     2015   0:00 [ecryptfs-kthrea]
root       105  0.0  0.0      0     0 ?        S<    2015   0:00 [crypto]
root       117  0.0  0.0      0     0 ?        S<    2015   0:00 [kthrotld]
root       119  0.0  0.0      0     0 ?        S     2015   0:00 [scsi_eh_0]
root       120  0.0  0.0      0     0 ?        S     2015   0:00 [scsi_eh_1]
root       141  0.0  0.0      0     0 ?        S<    2015   0:00 [deferwq]
root       142  0.0  0.0      0     0 ?        S<    2015   0:00 [charger_manager]
root       199  0.0  0.0      0     0 ?        S<    2015   0:00 [kpsmoused]
root       223  0.0  0.0      0     0 ?        S<    2015   0:00 [bioset]
root       265  0.0  0.0      0     0 ?        S<    2015   0:00 [raid5wq]
root       291  0.0  0.0      0     0 ?        S     2015   0:22 [jbd2/xvda1-8]
root       292  0.0  0.0      0     0 ?        S<    2015   0:00 [ext4-rsv-conver]
root       445  0.0  0.0      0     0 ?        S     2015   0:16 [jbd2/md0-8]
root       446  0.0  0.0      0     0 ?        S<    2015   0:00 [ext4-rsv-conver]
root       516  0.0  0.0  19604   564 ?        S     2015   0:00 upstart-udev-bridge --daemon
root       522  0.0  0.0  49864  1048 ?        Ss    2015   0:00 /lib/systemd/systemd-udevd --daemon
root       671  0.0  0.0  15256   408 ?        S     2015   0:00 upstart-socket-bridge --daemon
root       800  0.0  0.0  10220  2900 ?        Ss    2015   0:00 dhclient -1 -v -pf /run/dhclient.eth0.pid -l
message+  1048  0.0  0.0  39224  1048 ?        Ss    2015   0:00 dbus-daemon --system --fork
root      1077  0.0  0.0      0     0 ?        S    Jan04   0:00 [kworker/u30:2]
root      1082  0.0  0.0  43448  1196 ?        Ss    2015   0:00 /lib/systemd/systemd-logind
root      1116  0.0  0.0  15272   512 ?        S     2015   0:00 upstart-file-bridge --daemon
root      1339  0.0  0.0  14536   412 tty4     Ss+   2015   0:00 /sbin/getty -8 38400 tty4
root      1344  0.0  0.0  14536   416 tty5     Ss+   2015   0:00 /sbin/getty -8 38400 tty5
root      1360  0.0  0.0  14536   408 tty2     Ss+   2015   0:00 /sbin/getty -8 38400 tty2
root      1361  0.0  0.0  14536   416 tty3     Ss+   2015   0:00 /sbin/getty -8 38400 tty3
root      1363  0.0  0.0  14536   404 tty6     Ss+   2015   0:00 /sbin/getty -8 38400 tty6
root      1418  0.0  0.0  61364  1296 ?        Ss    2015   0:07 /usr/sbin/sshd -D
root      1432  0.0  0.0  23652   552 ?        Ss    2015   0:02 cron
daemon    1433  0.0  0.0  19136   180 ?        Ss    2015   0:00 atd
root      1461  0.0  0.0  19316   644 ?        Ss    2015   1:57 /usr/sbin/irqbalance
root      1518  0.0  0.0   4364   404 ?        Ss    2015   0:00 acpid -c /etc/acpi/events -s /var/run/acpid.
root      1521  0.0  0.0      0     0 ?        S     2015   0:00 [kworker/5:1]
root      1641  0.0  0.0      0     0 ?        S    Jan04   0:00 [kworker/u30:1]
root      1863  0.0  0.0  14536   404 tty1     Ss+   2015   0:00 /sbin/getty -8 38400 tty1
root      1864  0.0  0.0  12784   388 ttyS0    Ss+   2015   0:00 /sbin/getty -8 38400 ttyS0
ntp       2075  0.0  0.0  31448  1252 ?        Ss    2015   1:17 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 10
root      2087  0.0  0.0      0     0 ?        S     2015   0:00 [kauditd]
ubuntu    2393  0.0  0.0 105628  2028 ?        S    Jan04   0:00 sshd: ubuntu@notty
root      2496  0.0  0.0      0     0 ?        S    Jan04   0:00 [kworker/2:2]
root      2713  0.0  0.0      0     0 ?        S     2015   0:00 [kworker/6:2]
root      2722  0.0  0.0      0     0 ?        S     2015   0:12 [kworker/5:2]
root      3678  0.0  0.0      0     0 ?        S    Jan05   0:01 [kworker/0:0]
root      3716  0.0  0.0      0     0 ?        S    Jan05   0:00 [kworker/3:0]
root      3941  0.0  0.0      0     0 ?        S    Jan05   0:00 [kworker/2:0]
root      4732  0.0  0.0      0     0 ?        S    Jan05   0:00 [kworker/1:2]
root      6896  0.0  0.0 105628  4228 ?        Ss   08:00   0:00 sshd: ubuntu [priv]
ubuntu    7008  0.0  0.0 105628  1876 ?        S    08:00   0:00 sshd: ubuntu@pts/0
ubuntu    7014  0.0  0.0  21308  3908 pts/0    Ss   08:00   0:00 -bash
root      7234  0.0  0.0  63668  2096 pts/0    S    08:10   0:00 sudo su
root      7235  0.0  0.0  63248  1776 pts/0    S    08:10   0:00 su
root      7236  1.0  0.0  21088  3456 pts/0    S    08:10   0:00 bash
root      7248  0.0  0.0  17164  1320 pts/0    R+   08:10   0:00 ps aux
root     13299  0.0  0.0      0     0 ?        S     2015   0:19 [kworker/3:2]
root     19933  0.0  0.0      0     0 ?        S     2015   0:00 [kworker/7:1]
root     20305  0.0  0.0      0     0 ?        S     2015   0:00 [kworker/4:2]
root     29814  0.0  0.0      0     0 ?        S<   Jan04   0:00 [kworker/u31:2]
root     30693  0.0  0.0      0     0 ?        S<   Jan04   0:00 [kworker/u31:1]
  • Do you have tmpfs mounted somewhere? Maybe it got filled and memory usage grew "for no reason". And what does dmesg | fgrep Memory: show? –  Jan 05 '16 at 03:18
  • @siblynx This can be seen with df -h --type=tmpfs. – Hauke Laging Jan 05 '16 at 03:31
  • @siblynx thanks for the suggestions. I do have a tmpfs but it is small: df -h | grep tmpfs shows tmpfs 1.5G 364K 1.5G 1% /run. As for dmesg: dmesg | fgrep 'Memory:' gives [ 0.000000] Memory: 15389980K/15728244K available (7373K kernel code, 1144K rwdata, 3404K rodata, 1336K init, 1440K bss, 338264K reserved) – Caleb Spare Jan 05 '16 at 05:13
  • I believe something in the kernel is eating your memory then, perhaps modules. Check http://unix.stackexchange.com/questions/97261/how-much-ram-does-the-kernel-use for additional memory-debugging techniques, and make sure you have no other tmpfs left mounted and filled, by finding tmpfs entries in /proc/mounts. –  Jan 05 '16 at 05:46
  • @siblynx Nothing in lsmod output shows a Size greater than ~100K. Slabtop says the total size is 38978.69K which roughly matches the Slab row in /proc/meminfo. No tmpfs mounts seem like they have any significant size. I added more command output; see "Update" section. – Caleb Spare Jan 05 '16 at 06:48
  • This is strange; even the amount of address space committed to all processes, shown by Committed_AS, is small. Probably a kernel memory leak. Does the memory actually leak constantly for no reason until you eventually get an OOM? And you haven't tuned the system memory-allocation policies in any way? –  Jan 05 '16 at 07:57
  • @siblynx The memory doesn't seem to leak more while the server is sitting idle. When active, these servers run some Java and C++ programs that must be tickling a bug. (But none of our userland processes were running when I ran all the above commands.) I don't think we've changed any memory allocation policies. – Caleb Spare Jan 05 '16 at 16:30
  • You don't see anything unusual in your ps aux output, really? –  Jan 06 '16 at 07:20
  • @siblynx nope. I added the entire ps aux output as Update 2, if you'd like to take a look. – Caleb Spare Jan 06 '16 at 08:13
  • This sounds like it would be a good use for dtrace (or whatever the native Linux equivalent is): you could add probes for malloc and see what's allocating the memory and whether it's user or kernel space (a perf sketch follows these comments). For example https://wiki.freebsd.org/DTrace/One-Liners#Raw_Kernel_Tracing – ellipse-of-uncertainty Feb 10 '16 at 20:51
  • You should look at http://unix.stackexchange.com/a/97265/167583 as it seems to give a very complete solution to clarify the usage. – Julie Pelletier Apr 28 '16 at 05:41
  • @JuliePelletier I believe I addressed every single item in that answer within my posting. – Caleb Spare Apr 30 '16 at 00:26
  • @Caleb: Have you considered that the 4GB may be what the kernel considers an appropriate inode-cache size based on your total memory? From some verification I did yesterday in the kernel source (fs/inode.c), the amount assigned to the inode cache is proportional to the total memory when the kernel is booted. – Julie Pelletier Apr 30 '16 at 00:33
  • No need to worry about cache usage, though, as the kernel will always prioritize processes if it starts to run out of memory. – Julie Pelletier Apr 30 '16 at 00:36
  • @JuliePelletier slabtop didn't show the inode cache taking significant space. Also I mentioned that echo 3 > /proc/sys/vm/drop_caches had no effect. So it's not the inode cache. (Also the issue DOES cause bad OOM problems, so it's not some benign cache I'm unaware of. It's definitely a bug.) – Caleb Spare Jun 01 '16 at 02:01
  • Look at your output of free: -/+ buffers/cache: 3913 11126, which explains 4GB of memory assigned to the cache and buffers. It doesn't matter how many other commands you run if you don't pay attention to obvious things like that. I even pointed you to the inode-cache source code, which explains in detail that it will assign a fixed amount based on the total RAM present. Even if you flush it, it won't deallocate it. – Julie Pelletier Jun 01 '16 at 03:55
  • @JuliePelletier The 3913 number is in the "used" column, not memory used by cache and buffers. The "-/+ buffers/cache" row of free output indicates that only 8MB of the top-line "used" number is occupied by buffers/cache. – Caleb Spare Jun 02 '16 at 17:44
  • @JuliePelletier further, if your "fixed amount based on total RAM present" theory is correct, (a) why would that amount be such a giant percentage of available ram, (b) why does the mysteriously missing memory grow over time, and (c) why is the problem fixed by rebooting the machine? – Caleb Spare Jun 02 '16 at 17:45
  • You have 7500 programs requesting a page size of 2M (15822848 kB ≈ 15.8 GB, and 15 GB / 2 MB = 7500). See On system memory… specifically the difference between tmpfs, shm, and hugepages…, so my question is: why so many programs? – eyoung100 Jun 11 '16 at 03:19
  • @eyoung100 I don't think that's relevant; I just looked at a similar, healthy machine (without the "missing" memory) and it has DirectMap2M: 15802368 kB. – Caleb Spare Jun 14 '16 at 20:25
  • Since the 2M maps are nearly equal, whats the configuration difference between the machines? – eyoung100 Jun 14 '16 at 20:31
  • Did you read the OOM killer output? It's usually fairly informative. Paste a sample here for review. – Josip Rodin Jul 09 '16 at 14:02
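
Following up on the dtrace comment above: on Linux, perf can play a similar role for kernel allocations (a sketch, assuming a perf build matching the running kernel, e.g. the linux-tools-3.13.0-36-generic package; the tracepoint name is standard but worth verifying with perf list):

# Sample kernel page allocations system-wide for 60s, with call graphs
perf record -e kmem:mm_page_alloc -a -g -- sleep 60
perf report --stdio | head -40

# Alternatively, perf kmem summarizes allocations by caller
perf kmem record -- sleep 60
perf kmem stat --caller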

1 Answer


Memory leaks can be a real pain and quite frustrating to trace on a large-scale system. I'd try to copy the entire server into a test environment and start services one at a time to isolate the problem.

After you check each of the services (the user-mode processes) individually and still can't find the source of the leak, you should check the kernel. Dealing with the kernel takes time and experienced hands; I'd recommend consulting a kernel specialist.
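
If you do want to poke at the kernel yourself first, kmemleak is the in-tree leak detector (a sketch; it requires a kernel built with CONFIG_DEBUG_KMEMLEAK, which the stock Ubuntu kernel is not, so you'd boot a debug kernel in the test environment):

# kmemleak lives in debugfs
mount -t debugfs none /sys/kernel/debug 2>/dev/null

# Trigger an immediate scan, then list suspected leaks with stack traces
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak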

Another possibility is the presence of malware. Dealing with malware is an entirely different opera.

Sometimes there are no shortcuts :\

Stephen C