Since disk space is allocated in blocks, is it more accurate to report the space a directory actually consumes in blocks rather than in bytes?
If a 1,025-byte file resides on a file system that allocates space in 1,024-byte blocks, that file consumes two whole blocks. Saying so seems more accurate than saying the file consumes 1,025 bytes of space.
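To make the arithmetic concrete, here is a minimal sketch. The helper name `blocks_needed` is hypothetical, and it assumes every file is simply rounded up to a whole number of blocks:

```python
def blocks_needed(size_bytes, block_size=1024):
    """Round a byte count up to whole blocks using integer arithmetic."""
    return (size_bytes + block_size - 1) // block_size

size = 1025
blocks = blocks_needed(size)
print(blocks, blocks * 1024)  # 2 blocks, i.e. 2048 bytes set aside for 1025 bytes of data
```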
Edit: File system in question is ext4, no dedupe, no compression, fwiw.
This is my attempt:
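import math
import os

def getDirUsage(filepath, block_size=1024):  # block_size as reported by os.statvfs()
    '''
    Return the number of blocks consumed by a directory.
    '''
    # float() forces true division; on Python 2 a plain "/" truncates and math.ceil does nothing.
    # Debatable whether the directory's own size should be included.
    total_size = int(math.ceil(os.path.getsize(filepath) / float(block_size)))
    for f in os.listdir(filepath):
        p = os.path.join(filepath, f)
        if os.path.isdir(p):
            # Recurse into subdirectories (note: os.path.isdir follows symlinks).
            total_size += getDirUsage(p, block_size)
        else:
            # Round each file up to a whole number of blocks.
            total_size += int(math.ceil(os.stat(p).st_size / float(block_size)))
    return total_size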
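For comparison, here is a sketch of an alternative that asks the file system how much it actually allocated instead of rounding `st_size` up. It relies on `st_blocks`, which Python documents as the number of 512-byte blocks allocated and which is only available on Unix-like systems. The function name `dir_usage_bytes` is hypothetical, and the sketch assumes hard-linked files may be counted more than once:

```python
import os

def dir_usage_bytes(path):
    """Sum allocated bytes for everything under path, without following symlinks."""
    # st_blocks counts 512-byte units regardless of the file system's block size.
    total = os.lstat(path).st_blocks * 512  # the top-level directory itself
    for root, dirs, files in os.walk(path):  # os.walk does not follow symlinks by default
        for name in dirs + files:
            total += os.lstat(os.path.join(root, name)).st_blocks * 512
    return total
```

Unlike rounding `st_size` up, this also reflects sparse files and any extra blocks the file system has allocated, so the two approaches can legitimately disagree.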