3

I have a weird problem on one of our servers. Almost half of my disk space is missing.

df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       271G  122G  149G  46% /
devtmpfs        3.8G     0  3.8G   0% /dev
tmpfs           3.8G  8.0K  3.8G   1% /dev/shm
tmpfs           3.8G  8.6M  3.8G   1% /run
tmpfs           3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/sda1       497M  120M  378M  24% /boot
tmpfs           778M     0  778M   0% /run/user/600

On the other hand, du shows only 6GB used:

du -hs /
6.0G    /

This is a server where logs often fill the disk to 100%, so my first response was to restart the rsyslog daemon, but that had no effect. I also rebooted the server, so the space can't be held by files that were deleted but are still open in some process. I looked at https://serverfault.com/questions/299839/linux-disk-space-missing, where someone suggested running fsck on reboot, but that didn't help. On the same page, someone suggested looking for files hidden under additional mount points, but there are none. I am looking for more suggestions.

The output of fdisk:

fdisk -l /dev/sda

Disk /dev/sda: 299.5 GB, 299506860032 bytes, 584974336 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disk label type: dos
Disk identifier: 0x000f1d8a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1026047      512000   83  Linux
/dev/sda2         1026048    17018879     7996416   82  Linux swap / Solaris
/dev/sda3        17018880   584974335   283977728   83  Linux

lsblk output:

lsblk -f /dev/sda
NAME   FSTYPE LABEL UUID                                 MOUNTPOINT
sda                                                      
├─sda1 xfs          78e8f824-1a2a-4c60-ab7b-6126a192932d /boot
├─sda2 swap         bdbe969d-c59d-4956-ae69-71e2825f93dc [SWAP]
└─sda3 xfs          a9c9da10-5e99-4a14-a207-490e3f676617 /

`xfs_quota -x -c 'free -h -b'` output:
Filesystem     Size   Used  Avail Use% Pathname
/dev/sda3    270.7G 127.1G 143.5G  47% /
/dev/sda1    496.5M 119.0M 377.5M  24% /boot
Filesystem     Size   Used  Avail Use% Pathname
/dev/sda3    270.7G 127.1G 143.5G  47% /
/dev/sda1    496.5M 119.0M 377.5M  24% /boot

xfs_quota -x -c 'quota -h' doesn't return anything. Nobody has set any quotas. It's one server out of several hundred with the same configuration and partition layout deployed at our branch offices, but only this one has the problem. For reasons specific to this site, it's also the only one whose disk regularly fills to 100%. We delete some logs manually every one or two weeks.

Jeff Schaller
user1403360

4 Answers

3

It seems likely there are a significant number of files hidden under one of the boot, dev, run, sys mountpoint directories that are usually inaccessible due to other filesystems being mounted there. Try this to access them from your running system:

mkdir /mnt/root
mount --bind / /mnt/root
du -hs /mnt/root/

If du returns significantly more than your reported 6 GB used, then this is almost certainly the issue. Use this to identify where the missing files are hiding:

du -hs /mnt/root/{boot,dev,run,sys}

Remember that /mnt/root really is your root / filesystem, so treat deletions or other file manipulations with great care. In any case do not try to delete any directories directly under /mnt/root that might be used as mountpoints.
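Before looking under /mnt/root, it can help to know which directories are currently covered by another mount, since those are the only places where files can be hidden from du in this way. The following is a minimal sketch, assuming a Linux system with /proc/self/mounts; covered_dirs is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: list the mount points that sit on top of directories.
# covered_dirs is a hypothetical helper. It reads a mounts table in
# /proc/self/mounts format, where field 2 is the mount point.
covered_dirs() {
    mounts_file=${1:-/proc/self/mounts}
    # Print every mount point except / itself, de-duplicated.
    # Nested mounts (for example /dev/shm) are listed as well.
    awk '$2 != "/" { print $2 }' "$mounts_file" | sort -u
}

covered_dirs
```

Each path printed is a candidate to inspect under the bind mount, for example with du -xsh /mnt/root/boot.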

Chris Davies
  • I tried this and I didn't find anything. – user1403360 Sep 12 '22 at 06:41
  • I didn't play with partitions, so it's highly unlikely that I have something hidden under a mount point. The server has stayed the same since installation. Actually, it's one server out of several hundred with the same configuration and partition layout, but only this one has this problem. – user1403360 Sep 14 '22 at 10:15
  • @user1403360 you haven't answered my question from earlier. Did you run du -hs / as root? – Chris Davies Sep 14 '22 at 12:38
  • @user1403360 I can't help you if you don't answer the questions seeking clarification. Did you try du -hs / as root? If not, that can also explain why it can't see all your files – Chris Davies Sep 15 '22 at 17:38
1

The best tool for figuring out what is taking up space on a filesystem (but NOT deleted-but-still-open files) is ncdu. Try ncdu -qx / to diagnose your root filesystem.

Since your problem persists after a reboot, it seems that you are not a victim of deleted-but-still-open files.

zerodeux
0

On a running system, check disk usage as the root user so that you have access to all files and directories.
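To make the point about root access concrete, here is a small sketch that lists directories the current user cannot read, which are the ones du would skip. It assumes GNU find (the -readable test is a GNU extension), and unreadable_dirs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show directories the current user cannot read.
# unreadable_dirs is a hypothetical helper; -readable needs GNU find.
unreadable_dirs() {
    # -xdev keeps find on one filesystem, like du -x.
    # Errors from trying to descend into those directories are hidden.
    find "$1" -xdev -type d ! -readable 2>/dev/null
}

if [ "$(id -u)" -ne 0 ]; then
    echo "not root: du may skip directories it cannot read" >&2
fi
unreadable_dirs "${1:-.}"
```

When run as root the list is normally empty, which is the situation in which the du total can be trusted.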

Maybe your root fs has quota enabled. Check xfs_quota -x -c 'free -hi' /

Boot from a live CD (GParted, SystemRescue), run mount /dev/sda3 /mnt, and check du -csh /mnt/* and df -H.

Then unmount /dev/sda3 and check the unmounted XFS filesystem with xfs_repair -n /dev/sda3.
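The rescue-environment steps above can be collected into one script. This is only a sketch under stated assumptions: DEV and MNT default to the device and mount point named in this answer, run and offline_check are hypothetical helper names, and by default the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch: offline check of the root filesystem from a live CD.
# DEV, MNT and DRY_RUN are assumptions; DRY_RUN=1 (the default)
# prints each command instead of running it.
DEV=${DEV:-/dev/sda3}
MNT=${MNT:-/mnt}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

offline_check() {
    run mount "$DEV" "$MNT"
    run du -xsh "$MNT"        # usage as seen through directory entries
    run df -H "$MNT"          # usage as reported by the filesystem
    run umount "$MNT"
    run xfs_repair -n "$DEV"  # -n: check only, modify nothing
}

offline_check
```

Setting DRY_RUN=0 and running as root in the rescue environment executes the same sequence for real.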

Read https://mankier.com/8/xfs_repair to know what checks it performs.

gapsf
0

It's 1 server out of several 100 with the same configuration and partition

Can you estimate what the real filesystem usage should be? Is 6 GB too small?

df gets its information from statvfs(), which reports the number of free blocks recorded in the superblock, so 'used' is a calculated value.

By default, du instead adds up the blocks used by each file and directory it finds.

Because du < df, the filesystem has either

  • blocks that the filesystem does not count as free and that du cannot account for via directory entries, or

  • directory entries that du cannot access for some reason. Check permissions, ACLs, SELinux, AppArmor, and long filenames: https://unix.stackexchange.com/a/619878/153329
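The two ways of counting can be put side by side. The sketch below recomputes the df-style figure from the statvfs() counters and obtains the du-style figure by walking the tree. It assumes GNU coreutils (stat -f with the %b, %f and %S format sequences, and du -B1); fs_used_bytes and du_used_bytes are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: df-style versus du-style "used" figures, in bytes.
# Both function names are hypothetical helpers.

# df's view: (total blocks - free blocks) * fundamental block size.
# GNU stat -f: %b = total blocks, %f = free blocks, %S = block size.
fs_used_bytes() {
    set -- $(stat -f -c '%b %f %S' "$1")
    echo $(( ($1 - $2) * $3 ))
}

# du's view: blocks reachable through directory entries,
# staying on one filesystem (-x).
du_used_bytes() {
    du -sx -B1 "$1" 2>/dev/null | cut -f1
}

# Example (run as root so that du can read everything):
#   fs_used_bytes /
#   du_used_bytes /
```

A large difference between the two numbers for / is the discrepancy described in the question.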

Also take the following into account:

Also because of some specific it's the only one that gets it's disk filled regularly to 100%. We delete some logs manually every one or two weeks.

If you delete an open file, the directory entry with the filename is gone, but the inode remains; the kernel deletes the inode and frees its blocks later, when the last file handle is closed.

A machine with a long uptime may accumulate a large number of such orphaned inodes, so a reboot should help.
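The deleted-but-open behaviour described above can be observed directly. This is a minimal, Linux-only sketch that relies on /proc; demo_deleted_open is a hypothetical helper name and the temporary file stands in for a log file.

```shell
#!/bin/sh
# Sketch: an unlinked file stays alive while a descriptor is open.
# Linux-only (uses /proc); demo_deleted_open is a hypothetical helper.
demo_deleted_open() {
    f=$(mktemp)
    exec 3>"$f"          # hold the file open on descriptor 3
    echo "log data" >&3
    rm -f "$f"           # directory entry is gone; du no longer sees it
    # readlink inherits descriptor 3, so /proc/self/fd/3 points at the
    # same open file; the kernel appends "(deleted)" to its old path.
    readlink /proc/self/fd/3
    exec 3>&-            # closing the descriptor lets the blocks be freed
}

demo_deleted_open
```

On a live system, lsof +L1 lists open files whose link count has dropped to zero, which is the same condition seen from outside the process.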

fsck on reboot and it didn't help.

For some reason (a kernel bug, a power outage, earlier directory corruption) such orphaned inodes may not have been deleted, which would explain why the reboot didn't help.

I recommend checking the unmounted filesystem manually with xfs_repair /dev/sda3 one last time.

If that doesn't help, the filesystem may be corrupted in a way that prevents xfs_repair from updating the free-space map correctly.

Mostly you should trust du.

Also check the following.

Compare du -h / with du -bh / (-b reports apparent sizes instead of allocated disk usage).

Find sparse files:

find / -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}'
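To see what the two du modes and the sparseness column report, here is a small self-contained sketch that creates one sparse file in a temporary directory. It assumes GNU coreutils and findutils; list_sparse is a hypothetical helper, the file name is made up, and plain awk is used in place of gawk.

```shell
#!/bin/sh
# Sketch: allocated size versus apparent size for a sparse file.
# list_sparse is a hypothetical helper; %S (GNU find) is the ratio of
# allocated to apparent size, so values below 1.0 indicate holes.
list_sparse() {
    find "$1" -xdev -type f -printf '%S\t%p\n' | awk '$1 < 1.0'
}

demo=$(mktemp -d)
truncate -s 10M "$demo/sparse.img"        # 10 MiB apparent, no data written

du -k "$demo/sparse.img"                  # blocks actually allocated
du -k --apparent-size "$demo/sparse.img"  # size as shown by ls -l
list_sparse "$demo"

rm -rf "$demo"
```

For a sparse file the apparent size is the larger of the two numbers, so the comparison shows how much of a file's nominal size is not backed by allocated blocks.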

Maybe there is a huge number of small files, each smaller than the filesystem block size.

xfs_info / | grep bsize
find / -type f -size -4096c | wc -l
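Counting the small files says how many there are, but not how much space they occupy. The sketch below also sums their apparent and allocated sizes. It assumes GNU find (the %s and %b format directives); small_file_usage is a hypothetical helper, and the 4096-byte default is an assumption that should be replaced by the bsize value reported by xfs_info.

```shell
#!/bin/sh
# Sketch: how much space do files smaller than one block occupy?
# small_file_usage is a hypothetical helper.
small_file_usage() {
    dir=$1
    limit=${2:-4096}   # assumed block size in bytes; see xfs_info bsize
    # %s = size in bytes, %b = allocated 512-byte blocks (GNU find).
    find "$dir" -xdev -type f -size "-${limit}c" -printf '%s %b\n' |
        awk '{ n++; apparent += $1; allocated += $2 * 512 }
             END { printf "%d files, %.0f bytes apparent, %.0f bytes allocated\n",
                          n, apparent, allocated }'
}

# Example (as root): small_file_usage / 4096
```

If the allocated total is far larger than the apparent total, block-size rounding on many tiny files accounts for part of the usage.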
gapsf
  • Here, du usually says so if there's a directory (tree) it can't access because of permissions; if the OP's du does the same, he should have noticed that. So I don't think that's a likely reason. If du can't access a directory because something else is mounted over it, there's no way for du to even know there's something there, and as a consequence it can't say anything. – Henrik supports the community Sep 16 '22 at 09:46
  • @Henriksupportsthecommunity OP already says about that https://unix.stackexchange.com/questions/714798/missing-disk-space-on-server/717443?noredirect=1#comment1359162_716821 – gapsf Sep 16 '22 at 09:59