
Can someone explain this folder size? The folder contains only one file, 360 GB in size, yet ls and du both report 440 GB used.

[root@liz DECSC]# ls -lha
total 440G
drwxrwxr-x  2 geo geo  4.0K Dec  6 19:56 .
drwxrwxr-x 14 geo geo  20K  Dec  6 19:39 ..
-rwxrwxrwx  1 geo geo  360G Apr  8 2018 vor_gainzp2.dat
[root@liz DECSC]# du -hs
440G    .
[root@liz DECSC]# 

1 Answer


It sounds like this is on a filesystem that has direct block allocation and not extents, such as ext3.

This means that each data block has an entry in a table (in the inode). The first 12 entries are direct blocks, i.e. they point directly to data blocks. The 13th entry is an indirect block, which points to a block that in turn contains block numbers. The 14th entry is a double indirect block, and the 15th is a triple indirect block.
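To get a feel for how many addressing blocks such a map needs, here is a rough Python sketch. It assumes 12 direct entries followed by single, double, and triple indirect entries, a 4 KiB block size, and 4-byte block numbers; the function name `block_map_usage` is made up for illustration, and a real filesystem may use different parameters.

```python
def block_map_usage(file_size, block_size=4096, ptr_size=4):
    """Estimate (data_blocks, metadata_blocks) for a classic block map:
    12 direct entries plus single, double and triple indirect blocks.
    block_size and ptr_size are assumptions, not values from the question."""
    ptrs = block_size // ptr_size              # block numbers per indirect block
    data_blocks = -(-file_size // block_size)  # ceiling division
    remaining = max(0, data_blocks - 12)       # the first 12 blocks need no extra metadata
    meta_blocks = 0

    # Single indirect: one block holding up to `ptrs` block numbers.
    covered = min(remaining, ptrs)
    if covered:
        meta_blocks += 1
    remaining -= covered

    # Double indirect: one top block plus one leaf block per `ptrs` data blocks.
    covered = min(remaining, ptrs ** 2)
    if covered:
        meta_blocks += 1 + -(-covered // ptrs)
    remaining -= covered

    # Triple indirect: one top block, then middle blocks, then leaf blocks.
    covered = min(remaining, ptrs ** 3)
    if covered:
        leaves = -(-covered // ptrs)
        middles = -(-leaves // ptrs)
        meta_blocks += 1 + middles + leaves
    remaining -= covered

    if remaining:
        raise ValueError("file too large for this block map")
    return data_blocks, meta_blocks


data, meta = block_map_usage(360 * 2**30)   # a 360 GiB file
print(data, meta, meta / data)
```

With these assumed parameters the map works out to roughly one metadata block per 1024 data blocks; the real figure depends on the filesystem's actual block size.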

All this means that for large files such as your 360GB file, an enormous number of blocks is involved in addressing all the data blocks. This is probably where the difference comes from; du takes into account all blocks, not just data blocks. ls shows the file size on the file's own line, but its "total" line, like du, counts all space used, not just the data blocks.
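The two numbers can be compared directly for any file. The sketch below is a minimal illustration, assuming Linux, where `st_blocks` is counted in 512-byte units; the helper name `size_report` is hypothetical.

```python
import os

def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a file.
    Assumes st_blocks is in 512-byte units, as on Linux."""
    st = os.stat(path)
    apparent = st.st_size            # the size column of ls -l
    allocated = st.st_blocks * 512   # the space du counts for the file
    return apparent, allocated

if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        apparent, allocated = size_report(name)
        print(f"{name}: apparent={apparent} allocated={allocated}")
```

Running it on vor_gainzp2.dat would show whether the storage reports more allocated space than the file's length.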

For large files, I do not recommend using ext3 and certainly not ext2. Use a modern extent-based filesystem such as ext4. With an extent-based filesystem, the blocks are indexed as "the first block is at 3874 and this extends for 342 blocks", and more extents are added as necessary. In this way many blocks can be found using just two numbers. This is not only much more space-efficient, it's also a lot faster, as all those extra addressing blocks don't need to be loaded.
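The idea of an extent can be illustrated with a toy Python sketch that collapses a sorted list of block numbers into (start, length) pairs. This only illustrates the concept and is not how ext4 stores extents on disk; `to_extents` is a made-up name.

```python
def to_extents(blocks):
    """Collapse an ascending sequence of block numbers into
    (start, length) pairs, one pair per contiguous run."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            # b directly follows the last run, so extend that run by one block
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # b starts a new run
            extents.append((b, 1))
    return extents

# 342 contiguous blocks starting at block 3874 become a single pair.
print(to_extents(range(3874, 3874 + 342)))   # [(3874, 342)]
```

A fragmented file simply produces more pairs, one for each contiguous run.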

wurtel
  • There is no indication that the user is using Linux, although this is a reasonable assumption most of the time. – Kusalananda Dec 07 '18 at 08:06
  • The user is at least using GNU coreutils, as most versions of ls don't support -h to show "human" numbers. The filesystem differences also apply to other Unix systems, although the filesystems have different names there. – wurtel Dec 07 '18 at 08:10
  • BSD ls also has -h, but the size of . and the spacing between columns in the output may be giveaways. – Kusalananda Dec 07 '18 at 08:13
  • The server OS is CentOS. Storage is a NAS (Panasas ActiveStor), and I don't know what filesystem is on the backend. So does this 440 GB figure show the real space used on the storage? – user3630995 Dec 07 '18 at 10:29