The accepted answer is a one-liner:
# find / -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}'
There are several parts to this command. Let's break it down:
find / -type f
This part searches every regular file on the machine (-type f restricts the match to regular files, excluding directories, symlinks, and so on).
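Searching from / walks every mounted filesystem and can take a long time. If you only care about one tree, point find there instead; the /home path below is just an illustration:

find /home -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}'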
-printf "%S\t%p\n"
This part prints the "sparseness" of the file and the complete filename, separated by a tab.
So the output at this point will look like a list of entries in the following format:
1.23456 /tmp/a/file
If the first number is less than 1.0 then the file is considered "sparse".
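GNU find computes %S as (512 * st_blocks) / st_size, i.e. the ratio of the disk space actually allocated to the file's apparent size. As a rough worked example, a 1 MiB file (st_size = 1048576) that allocates only eight 512-byte blocks scores (8 * 512) / 1048576 ≈ 0.0039, well below 1.0. You can see this for yourself with GNU coreutils' truncate (testfile is a throwaway name here, and the exact value printed depends on the filesystem):

truncate -s 1M testfile
find testfile -printf "%S\t%p\n"

A fully sparse file like this allocates no data blocks at all, so it should print a sparseness of 0 (or very close to it).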
So then we can filter this through awk:
gawk '$1 < 1.0 {print}'
This limits the output to the sparse files, printing only the lines where the first field is less than 1.0.
The result is a list of all the files that are sparse, along with their "sparseness".
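If you want the most heavily sparse files listed first, you can sort the output numerically on the first field (a small extension, not part of the accepted answer):

find / -type f -printf "%S\t%p\n" | gawk '$1 < 1.0 {print}' | sort -n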
That's a lot of work for a simple command!
If you just want to test whether a specific file is sparse, you can use a variation of this, e.g.
find file_to_test -printf "%S"
will print a single number, which can then be tested against 1.0.
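One way to turn that into a script-friendly test (this gawk exit-status construct is my own sketch, not part of the original answer):

if find file_to_test -printf "%S\n" | gawk '{exit !($1 < 1.0)}'; then
    echo "file_to_test is sparse"
fi

Here gawk exits with status 0 (success) only when the first field is below 1.0, so the if branch runs only for sparse files.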