How are I/O channels implemented in the Linux kernel?

Question

stdin, stdout and stderr are integers that index into a data structure which 'knows' which I/O channels are to be used for the process. I understand this data structure is unique to every process. Are I/O channels nothing but some array data structures with dynamic memory allocation?

by I/O channels do you mean streams or pipes? also, this probably varies by kernel. you'll probably need to ask about a specific kernel. — strugee, Oct 19 '13 at 00:06
@strugee I am talking about Linux kernel. I meant streams by I/O channels. How these streams are implemented in linux? some array or something? — KawaiKx, Oct 19 '13 at 00:23

score 15 · Accepted Answer · edited May 23 '17 at 12:40

In Unix-like operating systems, the standard input, output and error streams are identified by the file descriptors 0, 1 and 2. On Linux, these are visible under the proc filesystem in /proc/[pid]/fd/{0,1,2}. For a process running in a terminal emulator or an SSH session, these files are typically symbolic links to a pseudoterminal device under the /dev/pts directory.
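
The symbolic links mentioned above can be inspected from a program with readlink(2). The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names fd_target and print_std_fds are made up for this illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Resolve what file descriptor `fd` of the calling process refers to by
 * reading the symbolic link /proc/self/fd/<fd>.
 * Returns 0 on success and -1 on failure. */
int fd_target(int fd, char *buf, size_t buflen)
{
    char link[64];
    ssize_t n;

    if (buflen == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    n = readlink(link, buf, buflen - 1);  /* readlink() does not NUL-terminate */
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* Print the targets of the three standard descriptors. */
void print_std_fds(void)
{
    char target[4096];

    for (int fd = 0; fd <= 2; fd++) {
        if (fd_target(fd, target, sizeof target) == 0)
            printf("fd %d -> %s\n", fd, target);
        else
            perror("readlink");
    }
}
```

Calling print_std_fds() from a program started in an interactive shell should show a /dev/pts/N path for each descriptor. If the streams are redirected, the links point at a file or a pipe instead.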

A pseudoterminal (PTY) is a pair of virtual devices, a pseudoterminal master (PTM) and a pseudoterminal slave (PTS), collectively referred to as a pseudoterminal pair. The pair provides an IPC channel, somewhat like a bidirectional pipe, between a program which expects to be connected to a terminal device and a driver program that uses the pseudoterminal to send input to, and receive output from, the former program.

A key point is that the pseudoterminal slave appears just like a regular terminal. For example, it can be toggled between noncanonical and canonical mode (the default), in which it interprets certain input characters: it generates a SIGINT signal when an interrupt character (normally produced by pressing Ctrl+C on the keyboard) is written to the pseudoterminal master, and it causes the next read() to return 0 when an end-of-file character (normally produced by Ctrl+D) is encountered. Other operations supported by terminals include turning echoing on or off, setting the foreground process group, etc.

Pseudoterminals have a number of uses:

They allow programs like ssh to operate terminal-oriented programs on another host connected via a network. A terminal-oriented program may be any program that would normally be run in an interactive terminal session. The standard input, output and error of such a program cannot be connected directly to a socket, as sockets do not support the aforementioned terminal-related functionality.
They allow programs like expect to drive an interactive terminal-oriented program from a script.
They are used by terminal emulators such as xterm to provide terminal-related functionality.
They are used by programs such as screen to multiplex a single physical terminal between multiple processes.
They are used by programs like script to record all input and output occurring during a shell session.

Unix98-style PTYs, used in Linux, are set up as follows:

The driver program opens the pseudoterminal master multiplexer at /dev/ptmx, upon which it receives a file descriptor for a PTM, and a PTS device is created in the /dev/pts directory. Each file descriptor obtained by opening /dev/ptmx is an independent PTM with its own associated PTS.
The driver program calls fork() to create a child process, which in turn performs the following steps:
- The child calls setsid() to start a new session, of which the child is session leader. This also causes the child to lose its controlling terminal.
- The child proceeds to open the PTS device that corresponds to the PTM created by the driver program. Since the child is a session leader but has no controlling terminal, the PTS becomes the child's controlling terminal.
- The child uses dup2() to duplicate the file descriptor for the slave device onto its standard input, output, and error.
- Lastly, the child calls exec() to start the terminal-oriented program that is to be connected to the pseudoterminal device.

At this point, anything the driver program writes to the PTM appears as input to the terminal-oriented program on the PTS, and vice versa.
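
The setup steps above can be sketched in C as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names pty_open_master and pty_spawn are hypothetical, and the slave is unlocked and located with the Linux-specific ioctls TIOCSPTLCK and TIOCGPTN, whereas portable programs would normally use posix_openpt(3), grantpt(3), unlockpt(3) and ptsname(3).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a new Unix98 pseudoterminal pair. Returns the master descriptor and
 * stores the path of the matching slave device in `slave_path`.
 * Returns -1 on failure. */
int pty_open_master(char *slave_path, size_t len)
{
    int unlock = 0;
    unsigned int n;
    int master = open("/dev/ptmx", O_RDWR | O_NOCTTY);

    if (master < 0)
        return -1;
    /* The slave starts out locked: unlock it, then ask for its number. */
    if (ioctl(master, TIOCSPTLCK, &unlock) < 0 ||
        ioctl(master, TIOCGPTN, &n) < 0) {
        close(master);
        return -1;
    }
    snprintf(slave_path, len, "/dev/pts/%u", n);
    return master;
}

/* Run argv[0] with the slave as its controlling terminal and as its
 * standard input, output and error. Returns the master descriptor to the
 * caller (the driver program), or -1 on failure. */
int pty_spawn(char *const argv[], pid_t *child)
{
    char slave_path[64];
    int master = pty_open_master(slave_path, sizeof slave_path);
    pid_t pid;

    if (master < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(master);
        return -1;
    }
    if (pid == 0) {                        /* child */
        int slave;

        close(master);                     /* the child only needs the slave */
        if (setsid() < 0)                  /* new session, no controlling terminal */
            _exit(127);
        slave = open(slave_path, O_RDWR);  /* becomes the controlling terminal */
        if (slave < 0)
            _exit(127);
        dup2(slave, STDIN_FILENO);
        dup2(slave, STDOUT_FILENO);
        dup2(slave, STDERR_FILENO);
        if (slave > STDERR_FILENO)
            close(slave);
        execvp(argv[0], argv);
        _exit(127);                        /* only reached if exec failed */
    }
    *child = pid;
    return master;
}
```

The driver can then read() the program's output from the returned master descriptor and write() its input to it. On Linux, read() on the master fails with EIO once the child has exited and the slave side is closed, so a read loop should treat that as end of output.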

When operating in canonical mode, the input to the PTS is buffered line by line. In other words, just as with regular terminals, the program reading from a PTS receives a line of input only when a newline character is written to the PTM. When the buffering capacity is exhausted, further write() calls block until some of the input has been consumed.
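
The line-by-line behaviour can be observed without a second process. The sketch below opens both ends of a pseudoterminal in one process and uses poll(2) to ask whether the slave has input ready. The names open_pty_pair and input_ready are hypothetical, and the Linux-specific ioctls TIOCSPTLCK and TIOCGPTN are used to unlock and locate the slave.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open both ends of a new pseudoterminal in the calling process.
 * Returns 0 and fills in both descriptors, or -1 on failure. */
int open_pty_pair(int *master, int *slave)
{
    char path[64];
    int unlock = 0;
    unsigned int n;

    *master = open("/dev/ptmx", O_RDWR | O_NOCTTY);
    if (*master < 0)
        return -1;
    if (ioctl(*master, TIOCSPTLCK, &unlock) < 0 ||
        ioctl(*master, TIOCGPTN, &n) < 0)
        goto fail;
    snprintf(path, sizeof path, "/dev/pts/%u", n);
    *slave = open(path, O_RDWR | O_NOCTTY);
    if (*slave < 0)
        goto fail;
    return 0;
fail:
    close(*master);
    return -1;
}

/* Return 1 if `fd` has input ready to read within `timeout_ms`
 * milliseconds, 0 on timeout, and -1 on error. */
int input_ready(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);

    if (r < 0)
        return -1;
    return (r > 0 && (p.revents & POLLIN)) ? 1 : 0;
}
```

With the default (canonical) settings, writing "abc" to the master leaves input_ready() on the slave at 0. After a further write of "\n" it reports 1, and a read() on the slave then returns the whole line "abc\n".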

In the Linux kernel, the file-related system calls open(), read(), write(), stat() etc. are implemented in the Virtual Filesystem (VFS) layer, which provides a uniform file system interface for userspace programs. The VFS allows different file system implementations to coexist within the kernel. When userspace programs call the aforementioned system calls, the VFS redirects the call to the appropriate filesystem implementation.
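
The redirection works through tables of function pointers: in the kernel, each open file refers to a struct file_operations supplied by the implementation that owns it. The following userspace toy model only illustrates the dispatch idea. All toy_* names and the two example "filesystems" are invented for this sketch and are not kernel code.

```c
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Toy stand-in for the kernel's per-file table of operations. */
struct toy_file_ops {
    long (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_file {
    const struct toy_file_ops *ops;   /* chosen by the implementation at open time */
};

/* Two hypothetical implementations with different read behaviour. */
static long zero_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    memset(buf, 0, len);              /* behaves like /dev/zero */
    return (long)len;
}

static long null_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    (void)buf;
    (void)len;
    return 0;                         /* behaves like /dev/null: end-of-file */
}

const struct toy_file_ops zero_ops = { .read = zero_read };
const struct toy_file_ops null_ops = { .read = null_read };

/* The "VFS" entry point: it knows nothing about the implementation and
 * simply forwards the call through the operations table. */
long toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    if (f->ops == NULL || f->ops->read == NULL)
        return -1;
    return f->ops->read(f, buf, len);
}
```

A caller uses toy_vfs_read() the same way for every toy_file, just as a userspace program uses read() the same way for a regular file, a pipe or a PTS device.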

The PTS devices under /dev/pts are managed by the devpts file system implementation defined in fs/devpts/inode.c, while the TTY driver providing the Unix98-style ptmx device is defined in drivers/tty/pty.c.

Buffering between TTY devices, such as pseudoterminals, and TTY line disciplines is provided by a buffer structure maintained for each tty device, defined in include/linux/tty.h.

Prior to kernel version 3.7, the buffer was a flip buffer:

#define TTY_FLIPBUF_SIZE 512

struct tty_flip_buffer {
        struct tq_struct tqueue;
        struct semaphore pty_sem;
        char             *char_buf_ptr;
        unsigned char    *flag_buf_ptr;
        int              count;
        int              buf_num;
        unsigned char    char_buf[2*TTY_FLIPBUF_SIZE];
        char             flag_buf[2*TTY_FLIPBUF_SIZE];
        unsigned char    slop[4];
};

The structure contained storage divided into two equal-sized buffers. The buffers were numbered 0 (first half of char_buf/flag_buf) and 1 (second half). The driver stored data in the buffer identified by buf_num, while the other buffer could be flushed to the line discipline.

The buffer was 'flipped' by toggling buf_num between 0 and 1. When buf_num changed, char_buf_ptr and flag_buf_ptr were set to the beginning of the buffer identified by buf_num, and count was set to 0.
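
The flip can be modelled in userspace as follows. This is a simplified sketch and not the kernel code: the flag buffer, task queue and semaphore are left out, and the names flip_model, flip_init, flip_put and flip_flip are invented for the illustration.

```c
#include <stddef.h>

#define FLIPBUF_SIZE 512   /* same value as TTY_FLIPBUF_SIZE */

/* Userspace model holding only the fields needed to show the flip. */
struct flip_model {
    unsigned char char_buf[2 * FLIPBUF_SIZE];
    unsigned char *char_buf_ptr;   /* next free slot in the active half */
    int count;                     /* bytes stored in the active half */
    int buf_num;                   /* half the driver writes to: 0 or 1 */
};

void flip_init(struct flip_model *fb)
{
    fb->buf_num = 0;
    fb->count = 0;
    fb->char_buf_ptr = fb->char_buf;
}

/* Driver side: store one byte in the active half.
 * Returns 0 on success and -1 if the active half is full. */
int flip_put(struct flip_model *fb, unsigned char c)
{
    if (fb->count >= FLIPBUF_SIZE)
        return -1;
    *fb->char_buf_ptr++ = c;
    fb->count++;
    return 0;
}

/* Flip: hand the filled half to the consumer and make the other half
 * active. Returns a pointer to the filled data and stores its length
 * in *len. */
const unsigned char *flip_flip(struct flip_model *fb, int *len)
{
    const unsigned char *filled = fb->char_buf + fb->buf_num * FLIPBUF_SIZE;

    *len = fb->count;
    fb->buf_num ^= 1;                                   /* toggle 0 <-> 1 */
    fb->char_buf_ptr = fb->char_buf + fb->buf_num * FLIPBUF_SIZE;
    fb->count = 0;
    return filled;
}
```

While the consumer processes the half returned by flip_flip(), the producer keeps filling the other half, so neither has to wait for the other unless both halves are in use.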

Since kernel version 3.7, the TTY flip buffers have been replaced with objects allocated via kmalloc() and organized in rings. In the normal situation for an IRQ-driven serial port at typical speeds, their behaviour is much the same as with the old flip buffer: two buffers end up allocated and the kernel cycles between them as before. However, when there are delays or the speed increases, the new buffer implementation performs better because the buffer pool can grow a bit.

Answers as knowledgeable as this are extremely hard to come by. — étale-cohomology, May 06 '18 at 23:55

score -1 · Answer 2 · answered Oct 19 '13 at 05:14

The man page for any of the three (stdin, stdout, stderr) explains the answer:

   Under  normal circumstances every UNIX program has three streams opened
   for it when it starts up, one for input, one for output,  and  one  for
   printing diagnostic or error messages.  These are typically attached to
   the user's terminal but might instead  refer  to  files  or
   other  devices,  depending  on what the parent process chose to set up.

   The input stream is referred to as "standard input"; the output  stream
   is  referred  to as "standard output"; and the error stream is referred
   to as "standard error".  These terms are abbreviated to form  the  sym-
   bols used to refer to these files, namely stdin, stdout, and stderr.

   Each  of these symbols is a stdio(3) macro of type pointer to FILE, and
   can be used with functions like fprintf(3) or fread(3).

   Since FILEs are a buffering wrapper around UNIX file  descriptors,  the
   same  underlying  files  may  also  be accessed using the raw UNIX file
   interface, that is, the functions like read(2) and lseek(2).

   On program startup, the integer file descriptors  associated  with  the
   streams  stdin,  stdout, and stderr are 0, 1, and 2, respectively.  The
   preprocessor symbols STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO are
   defined  with  these values in <unistd.h>.
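
The relationship between the FILE streams and the raw descriptors can be checked with fileno(3). This is a small sketch; the function names std_streams_match and write_both_ways are made up for the illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Return 1 if the stdio streams wrap descriptors 0, 1 and 2, else 0. */
int std_streams_match(void)
{
    return fileno(stdin) == STDIN_FILENO &&
           fileno(stdout) == STDOUT_FILENO &&
           fileno(stderr) == STDERR_FILENO;
}

/* Write one line through each interface.
 * Returns 0 on success and -1 on failure. */
int write_both_ways(void)
{
    static const char raw[] = "via write(2) on descriptor 1\n";

    /* Buffered stdio stream: a FILE * wrapping descriptor 1. */
    if (fprintf(stdout, "via fprintf(3) on stdout\n") < 0)
        return -1;
    /* Push stdio's buffer to the kernel so the two lines keep their order. */
    if (fflush(stdout) != 0)
        return -1;
    /* Raw system call on the descriptor itself. */
    if (write(STDOUT_FILENO, raw, sizeof raw - 1) < 0)
        return -1;
    return 0;
}
```

Both lines end up on the same underlying file. The fflush() call matters because stdio may hold the first line in its own userspace buffer, while write(2) hands its data to the kernel immediately.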

This answer describes the implementation of stdin, stdout and stderr from the point of view of the C library, but the question is explicitly about the kernel implementation. I tried to account for the kernel point of view in my answer. — Thomas Nyman, Nov 27 '13 at 14:12
