I'm curious to understand how du counts blocks used in a file.
Scenario
dd bs=1 seek=2GiB if=/dev/null of=big
0+0 records in
0+0 records out
0 bytes (0 B) copied, 2.3324e-05 s, 0.0 kB/s
ls -lh big
-rw-r--r-- 1 roaima roaima 2.0G May 19 15:55 big
du -h big
0       big
I've always accepted that it will give me different answers to ls, and that's fine because they're measuring different things.
Now I have a cloud-based filesystem where I get charged not only for storage but also each time I download data, so I need to minimise the amount of data accessed by general housekeeping activities such as "how much disk space is used in this tree?"
I'm not aware of a library/system call to tell me the number of used blocks, although there could easily be one. I don't believe du reads its way through every file it's considering, because reading the contents could not differentiate between a file filled with zeros and one that's truly sparse.
So, how does du count blocks used?
... stat(1) regularly but had not realised it could give me blocks used. (I guess I wasn't looking for it so just didn't see it...) – Chris Davies May 19 '16 at 15:09
... du's output using `ls` by doing `ls -lsh big` which will print the blocks in the first column. – forquare May 19 '16 at 15:20
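The stat(1) and `ls -s` suggestions in the comments both read the allocated block count from file metadata, which is also where du gets it (via the stat() family of system calls), so no file contents need to be read. Here is a minimal sketch of that, assuming GNU coreutils `stat` with its `-c` format option (`%b` for allocated blocks, `%B` for the size of each of those blocks, `%s` for apparent size). The helper names `allocated_bytes` and `apparent_bytes` are made up for illustration, and the file size is reduced to 1 MiB to keep the demonstration small.

```shell
#!/bin/sh
# Hypothetical helpers; assume GNU stat (coreutils), not BSD stat.

# Print the bytes actually allocated to FILE, using metadata only.
# %b = number of allocated blocks, %B = size in bytes of each such block.
allocated_bytes() {
    set -- $(stat -c '%b %B' -- "$1")
    echo $(( $1 * $2 ))
}

# Print the apparent size of FILE in bytes (what ls -l shows).
apparent_bytes() {
    stat -c '%s' -- "$1"
}

tmp=$(mktemp -d)

# Sparse file: seek past 1 MiB and write nothing, as in the question.
dd bs=1 seek=1M if=/dev/null of="$tmp/sparse" 2>/dev/null
# Zero-filled file: 1 MiB of zeros actually written.
dd bs=1M count=1 if=/dev/zero of="$tmp/zeros" 2>/dev/null

for f in "$tmp/sparse" "$tmp/zeros"; do
    echo "$(basename "$f"): apparent=$(apparent_bytes "$f") allocated=$(allocated_bytes "$f")"
done

rm -rf "$tmp"
```

On a typical local filesystem the two files should report the same apparent size but different allocated sizes. Whether a cloud-backed filesystem returns a meaningful block count without fetching data depends on that filesystem's implementation, so it is worth checking on a single file before relying on it for a whole tree.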