Suddenly all the available disk space on / has disappeared.

If I make room on the disk (by deleting ~50GB of stuff, for example), after a few minutes I am back to 0 available disk space (according to df).

Clearly, some process is eating up disk space at a rapid rate, but I can't figure out what it is.

One thing is certain, though: whatever it is, it must be creating many small files, because there are no files bigger than 10GB on the disk, and all the ones bigger than 1GB are much older than today.

How can I find what's eating up disk space?


FWIW, only df sees the problem, not du.

For example, below I show several "snapshots" from du and df taken 60 seconds apart. (I did this after I had made some room on the disk.) Notice how du's output remains steady (at 495G), but df shows a steadily shrinking amount of available space. (I've followed the recommendation given here. IOW, /mnt/root is pointing to /.)

# while true; do du -sh /mnt/root && df -h /mnt/root; sleep 60; done
495G    /mnt/root
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       880G  824G   12G  99% /mnt/root
495G    /mnt/root
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       880G  825G   11G  99% /mnt/root
495G    /mnt/root
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       880G  827G  8.9G  99% /mnt/root
495G    /mnt/root
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       880G  827G  8.1G 100% /mnt/root
495G    /mnt/root
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       880G  828G  7.5G 100% /mnt/root
kjo
  • What happens if you restart syslog? – Rui F Ribeiro May 22 '17 at 19:15
  • @RuiFRibeiro: thanks for the idea... I've never restarted syslog, need to look into that... – kjo May 22 '17 at 19:16
  • @RuiFRibeiro: FWIW, /var is tiny (3.1G), and doesn't seem to be growing particularly fast. – kjo May 22 '17 at 19:19
  • You are dealing with a deleted file; that is why du does not register it. If syslog is the culprit, the size of /var won't register it either. – Rui F Ribeiro May 22 '17 at 19:20
  • @RuiFRibeiro: sorry, I don't understand your point. – kjo May 22 '17 at 19:21
  • @RuiFRibeiro: a deleted file that is growing? – kjo May 22 '17 at 19:21
  • @RuiFRibeiro: OK, I restarted syslog, but I see no difference in du's output, nor in the size of /var. – kjo May 22 '17 at 19:27
  • Deleted files only disappear after the process holding them is stopped; until then they remain in use. What does this show? sudo lsof -nP | grep '(deleted)' – Rui F Ribeiro May 22 '17 at 19:29
  • @RuiFRibeiro: bingo, that command output ~800 lines. The entry in the COMMAND column in most of the lines is server. A server process also shows up in iotop. I'll investigate further... Thanks! – kjo May 22 '17 at 19:37
  • @RuiFRibeiro: Thanks again. Your idea led me to the culprit process. I killed it, and now both df and du show 487G (out of 880G). If you post your comment as an answer, I'll accept it. – kjo May 22 '17 at 19:47

2 Answers

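You are dealing with deleted files that a process still holds open. That is why du, which walks the directory tree, does not count their space, but df, which asks the filesystem itself, does.

The space used by a deleted file is only released once the process holding it open closes it or exits; until then the file remains in use and can keep growing.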
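
The effect is easy to reproduce on a small scale. The following is only a sketch, assuming Linux (it reads /proc) and a writable temporary directory; the file name big.log and the 1 MiB size are made up for the demonstration:

```shell
#!/bin/sh
# Sketch: a deleted file keeps its data while a process still has it open.
# Assumes Linux (/proc); big.log and the 1 MiB size are arbitrary choices.
tmpdir=$(mktemp -d)

exec 3>"$tmpdir/big.log"                  # open the file on descriptor 3
dd if=/dev/zero bs=1024 count=1024 >&3 2>/dev/null   # write 1 MiB into it
rm "$tmpdir/big.log"                      # unlink it: du can no longer find it

visible=$(ls -A "$tmpdir" | wc -l)        # directory entries left behind
held=$(wc -c < "/proc/$$/fd/3")           # bytes still reachable via the open fd

echo "visible entries: $visible, bytes still held: $held"

exec 3>&-                                 # closing the descriptor releases the space
rmdir "$tmpdir"
```

The directory is empty after the rm, yet the data stays reachable through the open descriptor until it is closed, which is the gap between du and df shown in the question.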

To find the culprit process, run:

sudo lsof -nP | grep '(deleted)'
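
If lsof is not installed, roughly the same information can be read from /proc. This is a minimal sketch, assuming Linux, GNU stat, and root privileges to see other users' processes; find_deleted_open is a hypothetical helper name, not a standard tool:

```shell
#!/bin/sh
# Sketch: list deleted-but-open files by walking /proc (Linux only).
# find_deleted_open is a hypothetical helper, not part of lsof or coreutils.
# Output columns: PID <tab> size in bytes <tab> link target.
find_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Unreadable or vanished descriptors are skipped silently.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                pid=${fd#/proc/}          # strip the leading /proc/
                pid=${pid%%/*}            # keep only the PID component
                size=$(stat -L -c %s "$fd" 2>/dev/null) || size='?'
                printf '%s\t%s\t%s\n' "$pid" "$size" "$target"
                ;;
        esac
    done
}

# Show the 20 largest entries first.
find_deleted_open | sort -k2,2nr | head -n 20
```

A PID that shows up with very large or fast-growing sizes is the likely culprit.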

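Then kill the process. The PID is in the second column of the lsof output. Note that the command below kills every process that holds a deleted file open, so review the list first:

sudo kill -9 $(sudo lsof -nP | awk '/\(deleted\)/ {print $2}' | sort -u)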
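
Since kill -9 gives the process no chance to clean up, it may be worth trying a plain SIGTERM first. The following is only a sketch; stop_process is a hypothetical helper name, and the 5-second grace period is an arbitrary assumption:

```shell
#!/bin/sh
# Sketch: send SIGTERM first and escalate to SIGKILL only if the process
# is still alive after a grace period. stop_process is a hypothetical
# helper; the default grace period of 5 seconds is an arbitrary choice.
stop_process() {
    pid=$1
    grace=${2:-5}
    # Give up if the process cannot be signalled (gone, or not permitted).
    kill -TERM "$pid" 2>/dev/null || return
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # it has exited
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null                # last resort
}
```

It would be called as stop_process PID, with PID taken from the lsof output, from a shell that has permission to signal that process.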
Rui F Ribeiro

You could use iotop to see which processes are performing the most disk write operations.

Example:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
    2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
    3 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
    6 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
    7 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/0]
    8 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/1]
mehlj