Minimal block granularity example
Let's play a bit to see what is going on.
mount
tells me I'm on an ext4 partition mounted at /
.
I find its block size with:
stat -fc %s .
which gives:
4096
Now let's create some files with sizes 1 4095 4096 4097
:
#!/usr/bin/env bash
for size in 1 4095 4096 4097; do
dd if=/dev/zero of=f bs=1 count="${size}" status=none
echo "size ${size}"
echo "real $(du --block-size=1 f)"
echo "apparent $(du --block-size=1 --apparent-size f)"
echo
done
and the results are:
size 1
real 4096 f
apparent 1 f
size 4095
real 4096 f
apparent 4095 f
size 4096
real 4096 f
apparent 4096 f
size 4097
real 8192 f
apparent 4097 f
So we see that anything below or equal to 4096
takes up 4096
bytes in fact.
Then, as soon as we cross 4097
, it goes up to 8192
which is 2 * 4096
.
It is clear then that the disk always stores data at a block boundary of 4096
bytes.
What happens to sparse files?
I haven't investigated what is the exact representation is, but it is clear that --apparent
does take it into consideration.
This can lead to apparent sizes being larger than actual disk usage.
For example:
dd seek=1G if=/dev/zero of=f bs=1 count=1 status=none
du --block-size=1 f
du --block-size=1 --apparent f
gives:
8192 f
1073741825 f
Related: https://stackoverflow.com/questions/38718864/how-to-test-if-sparse-file-is-supported
What to do if I want to store a bunch of small files?
Some possibilities are:
Bibliography:
Tested in Ubuntu 16.04.