
Pardon if this question has an exact duplicate somewhere else, but so far all of the answers I have found on SE or other sites in general do not answer this question specifically. I am taking an operating systems course in my college and hence I am pretty new to file systems in general.

I understand that in most file systems, there is a root directory which contains directory entries. These entries map a filename to an inode number, and are variable in length.

According to this answer, I guess these entries are stored in a linear fashion, one after another.

I can fully understand what inodes are and how they map to a file's data block numbers on the physical disk, using their Table of Contents (TOC) entries.


However, my question is: How and where are subdirectory file directory entries stored?

My first guess was that they are stored in the same location as the root directory, at some offset. However, I cannot envision how this offset could be retrieved from the inode.

Hence, I have a feeling that the directory entries of subdirectories are actually stored in the data region of the disk, instead of with the root directory's entries.

Hence, if this is the case, traversing from one directory to another requires the disk to read from seemingly arbitrary locations, which seems a little inefficient to me.

Nevertheless, I would like to simply clear up my misconceptions on the location of the file directory entries of a subdirectory.

Much help is appreciated.

Irvin Lim
  • I think this may differ between specific file system implementations. Which file system are you interested in? – Kusalananda Apr 23 '17 at 09:00
  • @Kusalananda I am actually taking an operating systems course in college. I'm interested to find out how it is most commonly implemented, if there is a common trend. Otherwise, perhaps I would be interested in the more popular file systems, e.g. EXT, FAT, and NTFS. – Irvin Lim Apr 23 '17 at 09:02

2 Answers


Directories are usually implemented as files. They have an inode and a data area, but are usually accessed (or at least written to) through special system calls. Some systems allow reading directories with the usual read(2) system call (Linux doesn't; FreeBSD did when I last checked). The data area of the directory file then contains the directory entries. On ext4, the root directory also has an inode; it is fixed at inode number 2 (try ls -lid /).
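As a rough illustration of the point above, here is a small Python sketch that asks the OS for a path's inode number and type, and then lists the name-to-inode pairs a directory holds. The helper names describe and list_entries are made up for this example; os.stat, stat.S_ISDIR and os.scandir are standard library calls. What numbers you see depends on your file system.

```python
import os
import stat


def describe(path):
    """Return (inode number, is_directory) for a path."""
    st = os.stat(path)
    return st.st_ino, stat.S_ISDIR(st.st_mode)


def list_entries(path):
    """Return the (name, inode number) pairs stored in a directory, sorted by name."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


if __name__ == "__main__":
    # Roughly what `ls -lid /` followed by `ls -i /` would show.
    print(describe("/"))
    for name, inode in list_entries("/"):
        print(inode, name)
```

A directory and a regular file both come back with an inode number; only the type bits in the mode differ.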

Having the directory act like a file makes it easy to allocate space for the directory entries, since the functions that allocate blocks for files must be there anyway. Also, since directories draw on the same pool of data blocks as files, as needed, there's no need to divide the space between file data and directory listings beforehand.

The internals of how directory entries are stored vary between file systems, and have for example evolved between ext2 and ext4. Modern systems use trees instead of linear lists for faster lookups. See here. Even the venerable FAT filesystem stores directories as files, but at least in older FATs, the root directory is special. (The structure of the directory entries in FAT is of course different from that of Unix filesystems.)
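To make the linear-list variant concrete, here is a simplified sketch loosely modeled on the classic ext2 layout: each record holds an inode number, a record length, a name length, a file type, and then the name, padded to a multiple of 4 bytes. The function names, the 1024-byte block size, and the sample inode numbers are assumptions for illustration; real file systems add checksums, hash trees, and other details that are left out here.

```python
import struct

# Record header: inode (u32), rec_len (u16), name_len (u8), file_type (u8).
HEADER = struct.Struct("<IHBB")
FT_REG, FT_DIR = 1, 2  # file type codes used in this sketch


def pack_entries(entries, block_size=1024):
    """Pack (inode, file_type, name) tuples into one directory data block."""
    block = bytearray()
    for i, (inode, ftype, name) in enumerate(entries):
        raw = name.encode()
        rec_len = (HEADER.size + len(raw) + 3) & ~3  # round up to 4 bytes
        if i == len(entries) - 1:
            # The last record is stretched to cover the rest of the block.
            rec_len = max(rec_len, block_size - len(block))
        if len(block) + rec_len > block_size:
            raise ValueError("entries do not fit in one block")
        block += HEADER.pack(inode, rec_len, len(raw), ftype)
        block += raw.ljust(rec_len - HEADER.size, b"\0")
    return bytes(block)


def parse_entries(block):
    """Walk the block record by record, yielding (inode, file_type, name)."""
    offset = 0
    while offset < len(block):
        inode, rec_len, name_len, ftype = HEADER.unpack_from(block, offset)
        if rec_len < HEADER.size:
            raise ValueError("corrupt record length")
        if inode != 0:  # inode 0 marks an unused slot
            start = offset + HEADER.size
            yield inode, ftype, block[start:start + name_len].decode()
        offset += rec_len


def lookup(block, wanted):
    """Linear search: return the inode number for a name, or None."""
    for inode, _ftype, name in parse_entries(block):
        if name == wanted:
            return inode
    return None
```

The lookup is a plain scan, which is why large directories get slow with this layout and why tree-based indexes were added later.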

"Hence, if this is the case, traversing from one directory to another requires the disk to read from seemingly arbitrary locations, which seems a little inefficient to me."

Yep. But often-accessed directory entries (or the underlying data blocks) are likely to be cached in modern operating systems.

Saving the contents of all directories centrally would require pre-allocating a large area, and would still require disk seeks within the directory data area.

ilkkachu
  • Thanks for this, it really helped me out. So am I right to say that, the inode for a directory (e.g. /home/user) points to more directory entries (the filenames + inode numbers for all files under /home/user)? – Irvin Lim Apr 23 '17 at 10:22
  • On Unixy systems with separate inodes (and hard links), the directory entry points to an inode, which says that it's a directory instead of a regular file. The inode then points to the data blocks, which contain more directory entries, which point to inodes... – ilkkachu Apr 23 '17 at 10:31

The common solution is that some of the entries in the root directory point to inodes which are themselves directories. In many respects, they are just like files, but the file type tells the filesystem to interpret them as directories.

(In really old tutorials, like for the original Unix, you will even be told that you can cat a directory, too. This is generally no longer true.)

In other words, every directory is conceptually a simple list of names paired with inode pointers. Some of them point to leaf nodes in the directory tree (files), others point to internal nodes (other directories). The only things which are special about the root directory are that it is its own parent, and that there is something external to the tree which tells the system to start traversing the tree from here.
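Here is a toy in-memory model of that traversal, just to show the shape of it. The inode table, the inode numbers and the resolve function are all invented for this sketch; a real kernel reads the entries from disk blocks and caches them.

```python
ROOT_INO = 2  # the "something external" that says where to start (assumed value)

# Hypothetical inode table: directories hold name -> inode number entries.
INODES = {
    2: {"type": "dir", "entries": {".": 2, "..": 2, "home": 3}},
    3: {"type": "dir", "entries": {".": 3, "..": 2, "user": 4}},
    4: {"type": "dir", "entries": {".": 4, "..": 3, "notes.txt": 5}},
    5: {"type": "file", "data": b"hello"},
}


def resolve(path, inodes=INODES, root=ROOT_INO):
    """Resolve an absolute path to an inode number, one component at a time."""
    current = root
    for part in path.split("/"):
        if not part:  # skip empty components from leading or doubled slashes
            continue
        node = inodes[current]
        if node["type"] != "dir":
            raise NotADirectoryError(path)
        if part not in node["entries"]:
            raise FileNotFoundError(path)
        current = node["entries"][part]
    return current
```

Each step is the same operation: look up a name in the current directory's entries and follow the inode number. The root only differs in that its ".." entry points back to itself.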

tripleee