A file descriptor is an integer that identifies one file among all the files opened by a given process. Kernels usually implement this by treating the file descriptor as an index into a per-process table.
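As a small user-space illustration of the "index in a table" idea, here is a sketch that relies only on the POSIX rule that open() returns the lowest-numbered descriptor not currently open. The helper name fd_reuse_demo is made up for this example, and it assumes /dev/null exists and that no other thread opens or closes files at the same time.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens /dev/null twice, closes the first descriptor,
 * then opens it a third time. The three descriptor numbers are stored in
 * out[0..2]. Returns 0 on success, -1 if any open() fails. */
int fd_reuse_demo(int out[3])
{
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;

    int b = open("/dev/null", O_RDONLY);
    if (b < 0) {
        close(a);
        return -1;
    }

    close(a);                            /* slot a in the table is free again */

    int c = open("/dev/null", O_RDONLY); /* expected to reuse slot a */
    if (c < 0) {
        close(b);
        return -1;
    }

    out[0] = a;
    out[1] = b;
    out[2] = c;
    close(b);
    close(c);
    return 0;
}
```

In a single-threaded program, the third descriptor should carry the same number as the first one: the number names a slot in the process's table, not the file itself.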
The rest of my answer applies to Linux.
In Linux, each valid file descriptor is associated with a struct file. This structure contains a pointer to the inode (the file's data and metadata), the current position of the process in the file, a list of operations, which are actually pointers to functions implemented by the file system the file lives in, etc.
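To make the "list of operations" more concrete, here is a heavily simplified user-space model, not the kernel's actual definitions. All names (model_file, model_file_ops, mem_ops, empty_ops, model_read) are invented for this sketch, and the real structures have many more fields and operations.

```c
#include <stddef.h>
#include <string.h>

struct model_file;

/* Operations provided by a "file system"; the real kernel has many more. */
struct model_file_ops {
    long (*read)(struct model_file *f, char *buf, size_t count);
};

struct model_file {
    const struct model_file_ops *ops; /* chosen by the file system at open time */
    long pos;                         /* current position in the file */
    const char *data;                 /* stands in for the inode's contents */
    size_t size;                      /* length of data in bytes */
};

/* One possible implementation: read from an in-memory buffer. */
static long mem_read(struct model_file *f, char *buf, size_t count)
{
    size_t left = f->size - (size_t)f->pos;

    if (count > left)
        count = left;
    memcpy(buf, f->data + f->pos, count);
    f->pos += (long)count;
    return (long)count;
}

/* Another one: a file that is always at end-of-file. */
static long empty_read(struct model_file *f, char *buf, size_t count)
{
    (void)f;
    (void)buf;
    (void)count;
    return 0;
}

const struct model_file_ops mem_ops = { .read = mem_read };
const struct model_file_ops empty_ops = { .read = empty_read };

/* Generic layer: it does not know which implementation it ends up calling. */
long model_read(struct model_file *f, char *buf, size_t count)
{
    return f->ops->read(f, buf, count);
}
```

The point of the indirection is that the generic code path stays the same whatever file system backs the file; only the table of function pointers changes.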
To fetch the file structure from the file descriptor, the Linux kernel proceeds as follows. I take the read system call as an example.
SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct fd f = fdget_pos(fd);
    ssize_t ret = -EBADF;

    if (f.file) {
        loff_t pos = file_pos_read(f.file);
        ret = vfs_read(f.file, buf, count, &pos);
        if (ret >= 0)
            file_pos_write(f.file, pos);
        fdput_pos(f);
    }
    return ret;
}
The first operation is fdget_pos. It takes as parameter the file descriptor passed by the caller in userspace and fetches the corresponding struct file. It returns a struct fd, defined as follows:
struct fd {
    struct file *file;
    unsigned int flags;
};
This is basically a pointer to a struct file, with a couple of flags to remember which operations will be necessary when putting the structure back.
Now, how does fdget_pos work? It is actually fairly intricate, but it boils down to two basic operations (with more checks that I don't show here for simplicity).
The first one consists in fetching the process's files table. This table is available from a pointer in the caller process's structure (accessible through current):
struct files_struct *files = current->files;
The next operation consists in verifying the validity of the file descriptor:
if (fd < files->fdt->max_fds) // first of all, if the file descriptor is too big, then it cannot be valid
    return files->fdt->fd[fd]; // otherwise, we return the pointer stored in the table of file descriptors (may be NULL)
return NULL;
The pointer may be invalidated before the function returns (for instance, if one thread of the process does a read and another does a close on the same file descriptor at the same time). The kernel takes care of this.
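Leaving the race handling aside, the two-step lookup described above can be modelled in user space as follows. This is only a sketch: toy_file, toy_fdtable and toy_lookup are made-up names, and the real kernel code adds locking, reference counting and further checks.

```c
#include <stddef.h>

struct toy_file {
    int dummy;                      /* placeholder for the real contents */
};

struct toy_fdtable {
    unsigned int max_fds;           /* number of slots in fd[] */
    struct toy_file **fd;           /* fd[i] is NULL when descriptor i is not open */
};

/* Returns the file registered under descriptor fd, or NULL if fd is out of
 * range or the slot is empty. Both cases mean "invalid descriptor". */
struct toy_file *toy_lookup(const struct toy_fdtable *t, unsigned int fd)
{
    if (fd < t->max_fds)            /* bounds check first */
        return t->fd[fd];           /* may still be NULL */
    return NULL;
}
```

Note that the descriptor is unsigned here, as in the read system call above, so a negative value passed from user space shows up as a very large index and fails the bounds check.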
If the struct file pointer returned by fdget_pos is NULL, then the file descriptor passed to the system call is invalid. In this case, the system call returns the error code EBADF ("bad file descriptor").
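This is easy to observe from user space. The sketch below reads from a descriptor that has just been closed; the helper name errno_of_read_after_close is invented for the example, and it assumes /dev/null exists and that no other thread opens a file in between. The C library turns the kernel's negative return value into a -1 return with errno set.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno set by read() on a descriptor that
 * has just been closed, 0 if read() unexpectedly succeeds, or -1 if the
 * initial open() fails. */
int errno_of_read_after_close(void)
{
    char byte;
    int fd = open("/dev/null", O_RDONLY);

    if (fd < 0)
        return -1;
    close(fd);                      /* the table slot for fd is now empty */

    errno = 0;
    if (read(fd, &byte, 1) == -1)
        return errno;               /* expected: EBADF */
    return 0;
}
```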
To sum up, file descriptors are just indexes into a per-process table of file descriptors. However, it's not sufficient to just use them to index the table, since the descriptor may be out of range and the entry in the files table may be NULL. Furthermore, the kernel must do additional checks to handle race conditions on the file descriptor.
/proc/<pid>/fd
I assumed that fds are not global, but instead each process has its own set (table) of file descriptors. I am aware that the kernel creates the fds, after all this is done with a syscall. – Iulian Paun Mar 29 '17 at 07:31
Where else would the kernel look other than the current process’s list of file descriptors?
Well, I can't say that I know; just wanted to be sure. However, thanks to @lgeorget some things are clearer now. – Iulian Paun Mar 31 '17 at 11:55