Can a file exist without filesystem?

Question

I was reading about filesystems and a few questions came to mind.

Q. If files are an integral part of unix/linux (e.g. to represent processes in /proc or devices in /dev), as in the famous saying 'everything is a file', do they exist outside the context of a filesystem? I feel like some files, such as network socket files or block device files, are filesystem independent and more like part of the OS itself.

Follow-up Q. Can unix/linux function without filesystems? For example, can a Linux system work by accessing the secondary storage manually?

Linux accesses the disk "manually". It puts files on that disk. To remember where those files are, it uses a structure on the disk that is normally called "filesystem". A network socket does not live in a filesystem, therefore it is not a file. A UNIX socket, however, does and is. Same for a device file: It has a directory entry in a filesystem. Of course, the device that the device file stands for is not a file. — berndbausch, Feb 14 '21 at 14:39
This might be helpful: https://unix.stackexchange.com/q/507837/20140 — Philip Couling, Feb 14 '21 at 22:07

ilkkachu · Accepted Answer · 2021-02-15T11:44:43.010

Yes. And no. Maybe.

Not everything is a file; obviously a hard drive can't contain a partition that then contains a filesystem, that then contains the hard drive itself. It's just that a number of things are accessible through names visible through the filesystem tree.

As far as the filesystem is concerned (the logical filesystem, or a concrete one in the data structure sense, like ext4), it's just that some files are marked as "device nodes" for some particular numbered device. But their functionality is implemented by separate drivers. When a process accesses them, the OS just diverts the access to the appropriate driver, not to the filesystem. Think of them as references, or pointers, or such.
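To make the "reference" idea concrete, here is a minimal Python sketch that reads the device numbers stored in a device node. The helper name describe_node is made up for illustration, and the sketch assumes a Unix-like system where /dev/null exists.

```python
import os
import stat


def describe_node(path):
    """Return (kind, major, minor) for a device node, else ("not a device", None, None)."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "char device"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block device"
    else:
        return ("not a device", None, None)
    # st_rdev is the only "content" of the node: it names the driver (major)
    # and the device instance (minor) that the kernel diverts accesses to.
    return (kind, os.major(st.st_rdev), os.minor(st.st_rdev))


if __name__ == "__main__":
    print(describe_node("/dev/null"))
    print(describe_node("."))
```

The filesystem stores only the node type and the two numbers; everything else is up to the driver.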

With that in mind, it's easy to understand that e.g. /proc is in no way mandatory. The system can function and it can run processes even without it. You just won't have that method of viewing those processes. But stuff like fork(), kill() and wait*() will still work, since they refer to processes by PID, not by some filesystem name.

Network sockets also don't appear as named files, in general. Unix domain sockets can do that, but IIRC they don't have to. And TCP or UDP sockets etc. just don't. But network sockets do appear as file descriptors to processes, and the read() and write() system calls work with them the same as with pipes or "real" files. So in some sense, network sockets also walk, talk and quack like files, even though the network protocols don't have much to do with storing bits on a disk.
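As a rough illustration of sockets behaving like files at the descriptor level, the Python sketch below creates a connected socket pair that has no name anywhere in the filesystem and then moves data through it with plain write() and read() calls on the raw descriptors. The function name socket_roundtrip is hypothetical, and a Unix-like system is assumed.

```python
import os
import socket
import stat


def socket_roundtrip(payload):
    """Send payload through an unnamed socket pair using generic fd calls."""
    a, b = socket.socketpair()  # two connected sockets, no filesystem name
    try:
        # fstat() on the descriptor reports the "file type" as socket.
        is_sock = stat.S_ISSOCK(os.fstat(a.fileno()).st_mode)
        os.write(a.fileno(), payload)                 # same write(2) used for files
        received = os.read(b.fileno(), len(payload))  # same read(2) used for files
        return is_sock, received
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(socket_roundtrip(b"hello"))
```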

But, as far as I can think of, you don't really have a way to refer to an arbitrary hard drive without those named device nodes. Your hardware still exists, the system still has the necessary SATA/USB/whatever drivers required to work with them, but you have no way of telling it to do so. Though you could mount a filesystem, and then remove the device node pointing to the device the filesystem was on. There's no problem here, since the device node is just a way for userspace to access the device.

You asked, "Can unix/linux function without filesystems?". Linux doesn't run without a filesystem, for one, because it starts userspace by looking for an executable file to run (eventually running what stays around as init). That filesystem doesn't need to be one on a regular disk drive, though; it can be the special rootfs filesystem the kernel sets up from data included with the kernel image. (Incidentally, you can't get rid of rootfs. It's always there, even if empty, so that the kernel doesn't need to deal with the idea of not having any mounted filesystems.) See ramfs-rootfs-initramfs.txt in the kernel documentation for the details on rootfs.

I suppose we could assume some hypothetical OS that could function without the filesystem, but e.g. the execve() system call takes a file name, so whatever was running couldn't launch other programs (the one running would need to be loaded some other way), and without the named device nodes, accessing storage would also be hard. It wouldn't look a lot like other Unixen anyway.

On Linux, it might be possible to engineer an oddball single-purpose system that would launch a single userspace program from rootfs at bootup, then clear up rootfs and never mount any other filesystems. That would get as close to having no filesystem as possible, and the program could still run and e.g. access the network. I doubt it would have any practical use, though, and as usual, any open files would still exist until closed, so removing their names might not be very useful.
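The point about open files outliving their names can be tried out with a short Python sketch. It assumes a Unix-like system and a local temporary directory; the function name read_after_unlink is made up for the example.

```python
import os
import tempfile


def read_after_unlink(data):
    """Write data, remove the file's only name, then read it back via the open fd."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, data)
        os.unlink(path)                        # the name is gone from the directory
        name_gone = not os.path.exists(path)
        links = os.fstat(fd).st_nlink          # typically 0 on local filesystems
        os.lseek(fd, 0, os.SEEK_SET)
        content = os.read(fd, len(data))       # the data is still reachable
        return name_gone, links, content
    finally:
        os.close(fd)                           # only now can the storage be freed


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

The file has no path at that point, but the filesystem still has to keep its inode and data around until the descriptor is closed.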

See also Does the Linux kernel need a file system to run?, parts of the answers to which I echoed above. For a longer discussion on that "everything is a file" mantra, see this answer on A layman's explanation for "Everything is a file" — what differs from Windows?.

@PhilipCouling don't ever mention fexecve(2) -- that may give them ideas. — , Feb 15 '21 at 03:31
@PhilipCouling, well, I edited anyway to make clear that I meant a hypothetical system there. And to add a link to that Q, thanks. — ilkkachu, Feb 15 '21 at 11:45
@user414777, how about just building the programs in memory with mmap() and manually copying the code in place... Hold on, Linux has memfd_create(), can you execve() a file created with it? (You're right, you did give me ideas.) — ilkkachu, Feb 15 '21 at 11:47
@user414777 hehe yes. Although where would you get the fd from? — Philip Couling, Feb 15 '21 at 15:17

score 3 · Answer 2 · answered Feb 14 '21 at 17:59

The computer concept of “file” actually predates file systems. It was a collection of data (such as on punch cards).

Modern systems now really only consider a file as part of a file system, as other people have answered. There is metadata associated with the files, such as ownership and permissions, that isn't part of the data in the file. But as the Wikipedia page shows, it hasn't always been the case.

Also, block device files and socket files are filesystem implementations that represent OS objects; they are by definition something that requires a file system. The OS objects they refer to don't need them; the files are just interfaces to the objects. Not all file systems support block devices and socket files, either.
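A small Python sketch of that distinction for Unix domain sockets, assuming a Unix-like system and a temporary directory on a filesystem that supports socket files. The function names and the demo.sock file name are invented for illustration.

```python
import os
import socket
import stat
import tempfile


def named_socket_is_file():
    """Bind a Unix socket to a path and check that a socket-type entry appears."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.sock")
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            srv.bind(path)  # bind() creates the directory entry
            return stat.S_ISSOCK(os.stat(path).st_mode)
        finally:
            srv.close()


def unnamed_socket_name():
    """Return the address of an unnamed Unix socket (expected to be empty)."""
    a, b = socket.socketpair()
    try:
        return a.getsockname()
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(named_socket_is_file())
    print(repr(unnamed_socket_name()))
```

The first socket needs a filesystem that can hold the entry; the second is the same kind of kernel object with no name at all.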

jsbillings

I've been pondering your last paragraph with regards to sockets and pipes. Devices obviously exist apart from the filesystem and the major/minor numbers of device files just map onto OS drivers. So you can duplicate a device file on different file systems, they will refer to the same device. I don't believe you can duplicate sockets and pipes on other file systems, they will always end up referring to a different socket. So on some level sockets are bound to their file system. I'm unclear on unlinking in-use socket files. I have no idea if they leave a file-system artifact. – Philip Couling Feb 15 '21 at 11:57
@PhilipCouling, on Linux, it looks like having a device node or socket open prevents the fs from being unmounted, even if you unlink the node/socket first. – ilkkachu Feb 15 '21 at 13:15
@ilkkachu yes, this phrase doesn't feel correct for sockets: "The OS objects they refer to don’t need them, the files are just interfaces to the objects". Your point about unmounting being blocked speaks to that. Though it might be otherwise explained by FDs requiring an inode provided by the filesystem (I think here) – Philip Couling Feb 15 '21 at 15:15
the actual inode for the socket obviously needs a filesystem, but unix sockets don't have to use a file, the code to handle it lives in the kernel and you can use an unnamed or abstract pathname. – jsbillings Feb 15 '21 at 21:20
@jsbillings it's true that you can create unnamed sockets. Those are objects that are never associated with a file system. The issue I'm choking on is whether or not a named socket could continue to exist in the absence of its file system. You've worded it as "The OS objects they refer to don’t need them" but in a very real way named sockets would cease to exist if the filesystem was removed: all related FDs would be forced to close and with no FDs left, the socket would disappear. Would they not? – Philip Couling Feb 15 '21 at 23:17

score 0 · Answer 3 · answered Feb 14 '21 at 17:29

This question is of a rather philosophical nature. Can anything exist without an environment?

From a less philosophical point of view: it depends on your definition of „file“. Devices obviously exist without a file-system. Processes exist without a file-system. Network streams exist without a file-system.

For these things to do anything useful, they must have some kind of representation. Whether you have a device-id or an integer number to enumerate data streams does not matter.
Would you call the information „there is some data in hard-disk block 23865“ or „network stream 3874“ a „file“? Then you could say files can exist without a file-system.

You probably would want to have some means to store, access and manage this information. You would have a list of network streams or currently active processes. You store your „file“ in a larger data structure. So there it is, your file-system. While these data structures are not written onto any form of persistent storage, they still are systems for managing files.

In conclusion: No, a file cannot exist without a file-system.
