5

I've been trying to better understand how the Linux filesystem works, looking at journaling, inodes, and access control lists. While looking into this, I came across filesystems that don't seem to act how I would expect a filesystem to act, such as glusterfs and mergerfs. Instead of being written to a hard drive the way mkfs.ext3 or mkfs.xfs would do it, they run on top of other filesystems. So both ext3 and mergerfs (or glusterfs) could be used with the same drive, which seems strange since, as far as I know, two filesystems can't be defined on the same partition.

Is my understanding of filesystems wrong, or is there something special about mergerfs/glusterfs that distinguishes them from ext3 or xfs?

James
  • 205

3 Answers

17

The word "filesystem" is somewhat overloaded, and I think that might be confusing you. In one sense a "filesystem" is the format in which files are written to some medium (e.g., a partition on a disk). In another sense, a "filesystem" (or more specifically, a "Virtual Filesystem") is an abstraction provided by the OS that presents a set of files (regular files, directories, etc). An OS can read the on-disk filesystem and present a filesystem abstraction.

The files presented in the filesystem abstraction can be stored on disk (e.g., ext4), on some other host across the network (e.g., cifs, nfs), or elsewhere. Something like mergerfs takes multiple sources of files and presents them as if they were a single source. From their website: "mergerfs logically merges multiple paths together. Think a union of sets."

Take a look at the mergerfs website; it has a nice description of what mergerfs does.
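To make the "union of sets" idea concrete, here is a minimal Python sketch of merging several directory trees into one logical listing. It is only an illustration under stated assumptions: `union_listing` and `demo` are hypothetical names, and the "first branch wins" rule for duplicate paths is an assumption made for the example, not a description of how mergerfs actually chooses a branch.

```python
import os
import tempfile


def union_listing(branches):
    """Map each relative file path to the first branch that contains it.

    Assumption for this sketch: when the same relative path exists in
    several branches, the earliest branch in the list wins.
    """
    merged = {}
    for branch in branches:
        for root, _dirs, files in os.walk(branch):
            for name in files:
                rel = os.path.relpath(os.path.join(root, name), branch)
                # setdefault keeps the entry from an earlier branch, if any
                merged.setdefault(rel, branch)
    return merged


def demo():
    """Build two temporary branches and return the merged view."""
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        # one.txt exists in both branches, two.txt only in the second
        for branch, name in ((a, "one.txt"), (b, "two.txt"), (b, "one.txt")):
            with open(os.path.join(branch, name), "w") as fh:
                fh.write(branch)
        merged = union_listing([a, b])
        return sorted(merged), merged["one.txt"] == a
```

Neither branch is modified or reformatted here; the merged view exists only in the program that computes it, which is roughly why such a layer can sit on top of ext4 or xfs without being a second on-disk format.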

Andy Dalton
  • 13,993
  • 3
    Awesome, that distinction between an on-disk filesystem (ext3, ext4) and a virtual filesystem is exactly what I was looking for. Thank you. – James Jan 03 '18 at 00:46
  • Except that a file system is ALSO an abstraction for blocks of data. The VFS is much like what the internet is to a set of networks, it combines them into a common interface. – jdwolf Jan 03 '18 at 00:59
  • 1
    There are some dictionaries at https://unix.stackexchange.com/a/249964/5132 . (-: – JdeBP Jan 03 '18 at 07:17
2

Distributed network file systems are more like a database than a file system, and they often live inside regular file systems to avoid extra complexity and duplicated code.

user1133275
  • 5,574
2

With network-attached storage file systems, it would be more accurate to say that the file system uses an underlying file system to implement itself. More precisely, it uses the OS's virtual file system (VFS), which may store the data on disk using many different on-disk file systems.

This is just as true of Samba or NFS when you mount them via the OS. However, glusterfs also has the feature of combining many network-attached storage servers on its backend. I would call this a networked/union FS, but the term they use is scale-out.

Unionfs, mergerfs, aufs and overlayfs are implementations of union mounting. They exist as file systems in the Linux VFS but not on disk. Their function is to combine files from several underlying file systems and present them as another file system in the VFS. They can be set up via mount or fstab, but they won't appear in the superblock of any on-disk file system.
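One place this split shows up on Linux is `/proc/filesystems`, which lists the filesystem types the kernel currently knows about and prefixes with `nodev` those that do not need a block device. The sketch below parses text in that format. The function name `parse_filesystems` and the `SAMPLE` text are made up for illustration; the sample is not output captured from a real machine.

```python
def parse_filesystems(text):
    """Return {fs_name: needs_block_device} for /proc/filesystems-style text.

    Each line is either "nodev<TAB>name" (no block device required)
    or "<TAB>name" (expects a block device such as a disk partition).
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        is_nodev = len(fields) == 2 and fields[0] == "nodev"
        result[fields[-1]] = not is_nodev
    return result


# Hypothetical sample in the same layout as /proc/filesystems
SAMPLE = "nodev\tsysfs\nnodev\ttmpfs\n\text4\n\txfs\nnodev\toverlay\nnodev\tfuse\n"

table = parse_filesystems(SAMPLE)
on_disk = sorted(name for name, needs_dev in table.items() if needs_dev)
virtual = sorted(name for name, needs_dev in table.items() if not needs_dev)
```

On a real system you could pass the contents of `/proc/filesystems` instead of `SAMPLE`. Types such as `overlay` and `fuse` would be expected on the `nodev` side, while ext4 and xfs expect a block device.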

So at no point do multiple on-disk file systems exist on the same partition, as you feared. Ironically, though, something close to that does exist: an ext3 file system IS an ext4 file system until an ext4-specific feature is used.

jdwolf
  • 5,017
  • On that last part, do you mean that ext4 is identical to ext3 until an ext4 feature is used? (Just to confirm, I don't understand those filesystems fully yet) – James Jan 03 '18 at 00:47
  • @Hidden14 Semantically, yes. You can, for example, mount an ext3 file system as ext4. – jdwolf Jan 03 '18 at 00:56