I was wondering why an empty directory occupies 4096 bytes of space, and I found this question. It states that space is allocated in blocks and that this is why the size of a new directory is 4096 bytes.
However, I am pretty sure that allocation for "normal" files is done in blocks as well. At least that is the case in Windows filesystems, and I am guessing that it must be at least similar in ext*.
Now, as far as I understand, the sizes listed for other types of files, such as regular files, symbolic links etc., are their real sizes. When I create an empty file, I see 0 as the size. When I type a few characters, I see <number of characters> bytes as the size, and so on.
So my question is: although allocation for other files is done in blocks too, why does the policy for reporting the size of a directory differ from the one for a file?
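The observation above can be reproduced with a small script. This is only a sketch: the file and directory names are made up for illustration, and the size reported for the empty directory depends on the filesystem (4096 is what I see on ext*, other filesystems may report something else).

```python
import os
import tempfile


def reported_sizes():
    """Return st_size for an empty file, a 5-byte file and an empty directory."""
    with tempfile.TemporaryDirectory() as tmp:
        empty_file = os.path.join(tmp, "empty")   # hypothetical names
        small_file = os.path.join(tmp, "small")
        empty_dir = os.path.join(tmp, "mydir")

        open(empty_file, "w").close()             # no content at all
        with open(small_file, "w") as f:
            f.write("hello")                      # five ASCII characters
        os.mkdir(empty_dir)

        return {
            "empty file": os.stat(empty_file).st_size,
            "small file": os.stat(small_file).st_size,
            "empty dir": os.stat(empty_dir).st_size,  # filesystem dependent
        }


if __name__ == "__main__":
    for name, size in reported_sizes().items():
        print(f"{name}: {size} bytes")
```

The two regular files report exactly the number of bytes written, while the directory reports whatever the filesystem chooses to store in its size field.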
Clarification
I thought the question was clear enough, but apparently it wasn't. I will try to clarify it here.
1) What I think a directory is:
I will try to explain what I think a directory is by the following example. After reading, if it is wrong, please notify me.
Let's say that we have a directory named mydir, and that it contains 3 files: f0, f1 and f2. Let's assume that each file is 1 byte long.
Now, what is mydir? It is a pointer to an inode which contains the following: the string "f0" and the inode number which f0 points to, the string "f1" and the inode number which f1 points to, and the string "f2" and the inode number which f2 points to. (At least this is what I think a directory is. Please correct me if I am wrong.)
Now there may be two methods for calculating the size of a directory:
1) Calculating the size of the inode which mydir points to.
2) Summing the sizes of the inodes which the contents of mydir point to.
Although 1 is more counterintuitive, let's assume that it is the method being used. (For this question, it does not matter which method is actually used.) Then the size of mydir is calculated as follows:
2 + 2 + 2 + 3 * <space_required_to_store_an_inode_number>
The 2's are because each filename is 2 bytes long.
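To put a number on the formula above, here is the same arithmetic as a tiny script. The 4-byte inode number is purely an assumption for illustration (the formula leaves that value open), and the function name is made up.

```python
# Assumption: an inode number takes 4 bytes. The question leaves this open.
INODE_NUMBER_BYTES = 4


def naive_directory_size(names, inode_number_bytes=INODE_NUMBER_BYTES):
    """Method 1 from the question: name lengths plus one inode number per entry."""
    name_bytes = sum(len(name.encode()) for name in names)
    return name_bytes + len(names) * inode_number_bytes


print(naive_directory_size(["f0", "f1", "f2"]))  # 2 + 2 + 2 + 3 * 4 = 18
```

Under that assumption the directory would need 18 bytes, nowhere near 4096.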
2) The question:
Now the question: assuming that my picture of a directory is correct, the reported size for mydir should be much, much less than 4096, no matter whether method 1 or method 2 is used to calculate its size.
Now, you will say that the reported size is 4096 bytes because allocation is done in blocks, and that this is why the reported size is that big.
But then I will say: allocation is done in blocks for regular files as well (see thrig's answer for reference). Nevertheless, their sizes are reported as real sizes (1 byte if they contain 1 character, 2 bytes if they contain 2 characters, etc.).
So my question is: why is the policy for reporting the sizes of directories so different from the one for regular files?
More clarification:
We know that the initial number of blocks allocated for a non-empty file and for an empty directory is 8 in both cases (see thrig's answer). So even though the same number of blocks is allocated for both regular files and directories, why is the reported size of a directory so much bigger?
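The comparison I have in mind can be sketched like this. It assumes a Linux-like system where st_blocks counts 512-byte units (the field does not exist on Windows), and the names foofile and mydir are only examples. The actual numbers depend on the filesystem, so none are given here.

```python
import os
import tempfile


def size_and_allocation(path):
    """Return (reported size, allocated bytes) for path.

    Assumes st_blocks is expressed in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        foofile = os.path.join(tmp, "foofile")
        mydir = os.path.join(tmp, "mydir")

        with open(foofile, "w") as f:
            f.write("x")              # a single byte of content
            f.flush()
            os.fsync(f.fileno())      # make sure the data reaches the filesystem
        os.mkdir(mydir)

        for path in (foofile, mydir):
            size, allocated = size_and_allocation(path)
            print(f"{os.path.basename(path)}: size={size} allocated={allocated}")
```

For the regular file the two numbers differ (size 1, allocation a whole number of blocks), which is exactly the asymmetry I am asking about.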
mydir. And let's say it contains some files such as: f0, f1 and f2. Now, what is mydir? It is a pointer to an inode which contains the following: String "f0" and inode number which it points to. String "f1" and inode number which it points to. String "f2" and inode number which it points to. (At least this is the picture in my mind. It might be wrong) So far so good. – Utku Oct 06 '15 at 15:24
mydir points to. Not adding the sizes of the inodes which contents of the directory points to. The other way might be defining as the sum of the sizes of inodes which are pointed by the directory's contents. For simplicity, if we assume that it is calculated w.r.t the former definition, the size of mydir should be: 2 + 2 + 2 + 3*
mydir is two characters long. In other words, there is no guarantee in the general case that the file "owns" the whole block. The only size "owned" by the file is the actual content (not the inode). – madumlao Oct 06 '15 at 16:42
foofile is 8, as soon as something is written in foofile. This is the same number of allocated blocks for a directory. Now according to what you say, a directory owns each and every byte of these 8 blocks and hence, its size is 4096 (or 512 in thrig's case) bytes. But this is not the case for foofile. Then why is foofile assigned 8 blocks, even if it doesn't own every byte of it? – Utku Oct 06 '15 at 17:05
info ls, I see the following for -s option: "Display the number of file system blocks actually used by each file, in units of 512 bytes, where partial units are rounded up to the next integer value. If the output is to a terminal, a total sum for all the file sizes is output on a line before the listing. The environment variable BLOCKSIZE overrides the unit size of 512 bytes." – Utku Oct 06 '15 at 17:43
-s must be working differently but I don't understand how it works. – Utku Oct 06 '15 at 17:46
info ls documentation is much clearer as it only talks about "disk allocation", without file size calculations. I suspect that it's just the documentation making up for Unix madness. – madumlao Oct 06 '15 at 18:02
ls -s should be showing the same info as stat --format=%b does, which has a field for the number of data blocks. Also check stat and you'll see there is a general case difference between allocated blocks and file size.