I have a log file on a CentOS system that is taking up 700MB of space (seen using ls), but when I run the df -h command, it shows that only 200MB of space is being used on the file system (ext4).

What could be causing this discrepancy?

Is it possible for a file to appear larger than the space df reports as used, and if so, how can I tell which files occupy less disk space than their listed size?

Edit: I clicked too fast; the other post doesn't answer my question. Here is the problem in a simplified form:

# ls -lh /mnt
total 29M
-rw-r--r-- 1 apache apache 678M Jan  6 10:01 Somelog.log
-rw-r--r-- 1 apache apache 1.1M Jan  1 03:20 Somelog.log-20230101.gz
-rw-r--r-- 1 apache apache 1.1M Jan  2 03:23 Somelog.log-20230102.gz
....etc....

# du -sh /mnt
29M /mnt

I want some information about the file that's not counted in the total. (What is the term for a file that is still in memory, if that's the case?)

julesl
  • 103

1 Answer

ls -l shows the apparent size of the file, i.e. how much data can be read from the file. du shows the amount of space the file actually occupies on disk.
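To compare the two numbers for a single file, a small Python sketch along these lines may help. The function name `apparent_and_allocated` is made up for illustration, and the multiplication by 512 assumes a Linux-style `st_blocks` counted in 512-byte units.

```python
import os


def apparent_and_allocated(path):
    """Return (apparent_size, allocated_bytes) for a file.

    apparent_size is what ls -l reports (st_size).
    allocated_bytes approximates what du reports; it assumes
    st_blocks is counted in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    import sys

    # Print both sizes for every path given on the command line.
    for name in sys.argv[1:]:
        apparent, allocated = apparent_and_allocated(name)
        print(f"{name}: apparent={apparent} allocated={allocated}")
```

A file whose allocated size is much smaller than its apparent size, as with Somelog.log here, is a likely candidate for being sparse.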

In your case, the log file is sparse: it contains close to 27MiB of actual data, and around 650MiB of holes which read back as all zeroes. The way the file was written results in these holes taking up no room on disk, so they aren’t counted by du. The way this can happen is as follows:

  • a process writes to the log file, with 650MiB of real data;
  • the log file is rotated and cleared;
  • the initial process continues writing to the same log file, at the same offset where it finished writing before the log file was rotated.

The last step causes the file to be extended to the appropriate size, with no data, before the new data is appended.
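The sequence above can be reproduced on a small scale. The following is only a sketch: `simulate_rotation` is a hypothetical helper, the truncation stands in for whatever clears the log during rotation (similar in effect to logrotate's copytruncate option), and the writer is assumed not to open the file in append mode.

```python
import os


def simulate_rotation(path, old_bytes, new_bytes):
    """Write, truncate in place, keep writing; return (apparent, allocated)."""
    # The writer opens the file WITHOUT append mode, so its file
    # offset is not moved back when the file is truncated.
    with open(path, "wb") as writer:
        writer.write(b"x" * old_bytes)
        writer.flush()

        # "Rotation": the file is emptied while the writer keeps it open.
        os.truncate(path, 0)

        # The writer continues at its old offset (old_bytes), so the
        # file is extended with a hole before the new data lands.
        writer.write(b"y" * new_bytes)
        writer.flush()
        os.fsync(writer.fileno())

    st = os.stat(path)
    # st_blocks is assumed to be in 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo.log")
        apparent, allocated = simulate_rotation(target, 10_000_000, 1_000)
        print("apparent size:", apparent)
        print("allocated on disk:", allocated)
```

On a file system that supports sparse files, the allocated figure should come out far below the apparent size, and reading the start of the file returns zero bytes rather than the old data.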

The fix for this is to force the writing process to close and re-open the log file after it’s rotated, either by restarting the daemon, or by signalling it to re-open its log files (if it supports such a mechanism).
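For a daemon you control, the "signal it to re-open its log files" approach might look roughly like this. It is a minimal sketch: the class `ReopenableLog` and the function `install_sighup_reopen` are hypothetical names, and the use of SIGHUP is a common convention rather than something every daemon follows.

```python
import signal


class ReopenableLog:
    """A log writer that can close and re-open its file on request."""

    def __init__(self, path):
        self.path = path
        # Append mode: every write goes to the current end of the file.
        self._fh = open(path, "a")

    def write(self, line):
        self._fh.write(line + "\n")
        self._fh.flush()

    def reopen(self):
        # Drop the old descriptor (and its stale offset), then open
        # whatever file now exists at the same path.
        self._fh.close()
        self._fh = open(self.path, "a")

    def close(self):
        self._fh.close()


def install_sighup_reopen(log):
    """Re-open the log whenever the process receives SIGHUP (Unix only)."""
    signal.signal(signal.SIGHUP, lambda signum, frame: log.reopen())
```

The rotation tool would then send SIGHUP to the process after rotating, so that later writes start at the beginning of the fresh file instead of at the old offset.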

LustreOne
  • 1,774
Stephen Kitt
  • 434,908