I'm using EXT4, but I think my question concerns all Unix/Linux file systems.
This 2022 answer to "Difference between fragment and extent in ext4" states that:
"External fragmentation occurs when you have related files all in one directory that are scattered all over the disk. If you are trying to read every file in the directory, this can case as much performance issues as internal fragmentation does. The ext4fs algorithms also attempt to minimize external fragmentation by attempting to allocate blocks for files in the same cylinder group as other files in the same directory."
I wonder if that's true, because:
I couldn't find any other source that corroborates it. A web search for "external fragmentation" mostly returns results about RAM fragmentation, while searching for "ext4 external fragmentation" does bring up some answers, but mostly old ones (before 2010).
An Arch Linux forum post from 2009 gives a radically different definition of "external fragmentation" in a FS context:
"There are two types of fragmentation, i.e. internal and external. Internal fragmentation refers to the fact that a file system uses specific sizes for a block, say 4KB, so if you have a file which is only 1KB in size, it will be stored in one 4KB block, therefore wasting 3KB of the block. This can't really be avoided.
"External fragmentation is when the files are not layed out continuously, i.e. spread over different blocks which can be far apart from each others. Thus it takes the disk head more time to collect all pieces together and reconstruct the file."
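As a rough illustration of the two definitions in this second quote, here is a minimal Python sketch. The helper names `slack_bytes` and `count_fragments` and the block numbers are made up for the example; nothing here queries a real file system.

```python
from typing import Sequence


def slack_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes wasted in the file's last block (the quote's "internal" fragmentation)."""
    remainder = file_size % block_size
    # An empty file or an exact multiple of the block size wastes nothing.
    return 0 if remainder == 0 else block_size - remainder


def count_fragments(physical_blocks: Sequence[int]) -> int:
    """Number of contiguous runs in a file's block list (the quote's "external" fragmentation)."""
    if not physical_blocks:
        return 0
    runs = 1
    for prev, cur in zip(physical_blocks, physical_blocks[1:]):
        if cur != prev + 1:  # a jump in block numbers starts a new fragment
            runs += 1
    return runs


# A 1 KB file in a 4 KB block wastes 3 KB, as in the quote.
print(slack_bytes(1024))  # 3072

# Hypothetical block list: three contiguous runs, hence three fragments.
print(count_fragments([100, 101, 102, 500, 501, 9000]))  # 3
```

In this sense, a file with a fragment count of 1 is fully contiguous, however much slack its last block has.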
My opinion so far is that:
The previously quoted Stack Exchange answer from 2022 is completely wrong.
The definition in the second quote is the right one:
"External fragmentation is when the files are not layed out continuously, i.e. spread over different blocks which can be far apart from each others."
And there is no such thing as "attempting to allocate blocks for files in the same cylinder group as other files in the same directory" (excerpt from the first quote). If a FS (or an OS) tried to group the files of one directory together on the disk, that would conflict with the fact that a FS (at least EXT4) usually tries to leave a lot of free space around each file, to prevent fragmentation if the file grows later.
Could someone please confirm that my conclusions are correct (and thus that the quoted Stack Exchange answer is wrong)?
[EDIT]
After some more research, I came to the conclusion that the terms "external" and "internal" fragmentation have never been formally defined in the context of file systems. A few sources use them in the sense of this Arch Linux post from 2009 or this kernel.org wiki entry, while even fewer sources use them in the sense of this Stack Exchange post from 2022.
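For comparison, the directory-level sense used in the 2022 answer could be measured roughly as follows. This is only a sketch under stated assumptions: it assumes 4 KiB blocks and the corresponding ext4 default of 32768 blocks per block group, the helper names are hypothetical, and the starting block numbers are invented (on a real system they could be read from something like `filefrag -v`).

```python
from typing import Dict

# Assumption: ext4 default for 4 KiB blocks (one bitmap block covers 8 * 4096 blocks).
BLOCKS_PER_GROUP = 32768


def block_groups_used(first_blocks: Dict[str, int],
                      blocks_per_group: int = BLOCKS_PER_GROUP) -> Dict[str, int]:
    """Map each file name to the block group that holds its first physical block."""
    return {name: blk // blocks_per_group for name, blk in first_blocks.items()}


def directory_spread(first_blocks: Dict[str, int]) -> int:
    """Number of distinct block groups touched by the files of one directory."""
    return len(set(block_groups_used(first_blocks).values()))


# Hypothetical first physical block of each file in one directory.
files = {"a.txt": 34000, "b.txt": 40000, "c.txt": 1_000_000}

print(block_groups_used(files))  # {'a.txt': 1, 'b.txt': 1, 'c.txt': 30}
print(directory_spread(files))   # 2
```

Under this reading, every file can be perfectly contiguous on its own while the directory as a whole is still spread over many block groups, which is why the two senses of "external fragmentation" are not interchangeable.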