Reading a named pipe: tail or cat?

Question

I made a named pipe (fifo) using

mkfifo fifo

As soon as something is written to this pipe, I want to reuse it immediately. Should I use

tail -f fifo

or

while true; do cat fifo; done

?

They seem to do the same thing and I could not measure a difference in performance. However, when a system does not support inotify (Busybox, for example), the former needs to be

tail -f -s 0 fifo

But this eats up the CPU with 100% usage (test it out: mkfifo fifo && busybox tail -f -s 0 fifo & echo hi>fifo / cancel with fg 1 and Ctrl+C). So is the while-true-cat the more reliable solution?

Pipes are not seekable, and tail -f needs seekable input. The behavior of tail -f fifo is undefined. If tail -f -s 0 fifo works on some systems those systems are relying on nonstandard behavior. — Satō Katsura, Sep 17 '17 at 07:58
@SatōKatsura, that's not true. tail works fine on fifos and is required to. -s is not standard though. — Stéphane Chazelas, Sep 17 '17 at 12:56
@StéphaneChazelas tail alone works fine, but tail -f needs seekable input. — Satō Katsura, Sep 17 '17 at 14:47
@SatōKatsura no. It doesn't. There's no reason why it would want to seek. After it has output the last 10 lines, it just sits there waiting for more input. — Stéphane Chazelas, Sep 17 '17 at 17:13
@StéphaneChazelas It needs seekable input to recover from input errors. Among other things, tail -f fifo is supposed to be still reading after echo foo >fifo. — Satō Katsura, Sep 18 '17 at 09:22
@SatōKatsura, I suggest you read the POSIX specification and the source code of one or two implementations. You can't have I/O error on a pipe. You don't want tail to recover (whatever that means) from an I/O error but bail out with an error message. tail doesn't need to seek to continue reading after echo foo > fifo (after which tail would see eof, not an I/O error), it just needs to sleep(1) and do a read() again afterwards (in a loop) (on Linux, it can avoid the polling with inotify), like for reg files. — Stéphane Chazelas, Sep 18 '17 at 09:34

Stéphane Chazelas · Accepted Answer · 2017-09-18T09:44:08.410

When you do:

cat fifo

Assuming no other process has opened the fifo for writing yet, cat will block on the open() system call. When another process opens the file for writing, a pipe will be instantiated and open() will return. cat will call read() in a loop and read() will block until some other process writes data to the pipe.

cat will see end-of-file (eof) when all the other writing processes have closed their file descriptor to the fifo. At that point, cat terminates and the pipe is destroyed¹.

You'd need to run cat again to read what will be written after that to the fifo (but via a different pipe instance).
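A minimal sketch of that behaviour, assuming a POSIX-like shell and a system with mktemp -d. The names dir, out1 and out2 are only illustrative and not part of the question:

```shell
# Scratch directory and fifo (hypothetical names, for illustration only).
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# First writer: opens the fifo, writes one line, closes its end.
(echo first > "$dir/fifo") &
out1=$(cat "$dir/fifo")   # cat returns once the only writer has closed
wait

# A second writer needs a second cat, which reads from a new pipe instance.
(echo second > "$dir/fifo") &
out2=$(cat "$dir/fifo")
wait

printf '%s\n' "$out1" "$out2"
rm -rf "$dir"
```

Each cat exits after a single writer, which is why the question's while loop has to restart it.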

In:

tail -f file

Like cat, tail will wait for a process to open the file for writing. But here, since you didn't specify a -n +1 to copy from the beginning, tail will need to wait until eof to find out what the last 10 lines were, so you won't see anything until the writing end is closed.

After that, tail will not close its fd to the pipe, which means the pipe instance won't be destroyed, and it will still attempt to read from the pipe every second (on Linux, that polling can be avoided via the use of inotify and some versions of GNU tail do that there). Until some other process opens the file again for writing, that read() will return with eof straight away. That is why you see 100% CPU with -s 0, which with GNU tail means to not wait between read()s instead of waiting for one second.

Here instead, you may want to use cat, but make sure the pipe instance always stays around after it has been instantiated. For that, on most systems, you could do:

cat 0<> fifo # the 0 is needed for recent versions of ksh93 where the
             # default fd changed from 0 to 1 for the <> operator

cat's stdin will be open for both reading and writing which means cat will never see eof on it (it also instantiates the pipe straight away even if there's no other process opening the fifo for writing).
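As a rough sketch of that, assuming a system such as Linux where opening a fifo in read+write mode works: one cat survives two separate writers. The names dir, reader and result_rw are illustrative, and the sleep is only a crude way to let cat drain the pipe before the demo stops it:

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# Reader: stdin is open read+write, so the open() does not block
# and cat never sees eof on it.
cat 0<> "$dir/fifo" > "$dir/out" &
reader=$!

# Two separate writers, each opening and closing the fifo.
echo first > "$dir/fifo"
echo second > "$dir/fifo"

sleep 1                              # crude: give cat time to read everything
kill "$reader"                       # cat would otherwise run forever
wait "$reader" 2>/dev/null || true

result_rw=$(cat "$dir/out")
printf '%s\n' "$result_rw"
rm -rf "$dir"
```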

On systems where that doesn't work, you can do instead:

cat < fifo 3> fifo

That way, as soon as some other process opens the fifo for writing, the first read-only open() will return, at which point the shell will do the write-only open() before starting cat, which will prevent the pipe from ever being destroyed again.
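The same kind of check can be sketched for this variant. Again, dir, reader and result_two are illustrative names, and the sleep is only a crude wait before stopping cat:

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# The shell opens the fifo read-only first (blocking until a writer
# shows up), then write-only on fd 3, and only then starts cat.
cat < "$dir/fifo" 3> "$dir/fifo" > "$dir/out" &
reader=$!

echo first > "$dir/fifo"    # unblocks the read-only open above
echo second > "$dir/fifo"   # same pipe instance: fd 3 keeps it alive

sleep 1                     # crude: give cat time to read everything
kill "$reader"
wait "$reader" 2>/dev/null || true

result_two=$(cat "$dir/out")
printf '%s\n' "$result_two"
rm -rf "$dir"
```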

So, to sum up:

compared to cat file: it would not stop after the first round.
compared to tail -n +1 -f file: it would not do a useless read() every second after the first round, there would never be eof on the one instance of the pipe, there would not be that up to one second delay when a second process opens the pipe for writing after the first one has closed it.
compared to tail -f file: in addition to the above, it would not have to wait for the first round to finish before outputting something (only the last 10 lines).
compared to cat file in a loop: there would be only one pipe instance. The race window mentioned in ¹ would be avoided.

¹ at this point, in between the last read() that indicates eof and cat terminating and closing the reading end of the pipe, there is actually a small window during which a process could open the fifo for writing again (and not be blocked as there's still a reading end). Then, if it writes something after cat has exited and before another process opens the fifo for reading, it would get killed with a SIGPIPE.
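The race itself is timing dependent and hard to reproduce on demand, but its consequence can be sketched deterministically: a writer that still holds the fifo open after the only reader has gone. This is an illustration under that assumption, not the exact race. The names dir and status are illustrative, and head -n 1 stands in for a reader that exits early:

```shell
dir=$(mktemp -d)
mkfifo "$dir/fifo"

# Reader that consumes one line and exits, closing the read end.
head -n 1 "$dir/fifo" > /dev/null &

# Writer in a subshell so that a SIGPIPE does not kill the main script.
(
    exec 3> "$dir/fifo"   # blocks until the reader has opened the fifo
    echo first >&3
    sleep 1               # crude: let the reader exit
    echo second >&3       # no reader left: SIGPIPE (or EPIPE if SIGPIPE is ignored)
) 2>/dev/null && status=0 || status=$?
wait

echo "writer exit status: $status"
rm -rf "$dir"
```

The writer ends with a non-zero status: typically 128 plus the SIGPIPE signal number when the signal is delivered, or a plain write error if SIGPIPE is being ignored.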

awesome. you are good at explaining things. about cat <>fifo: cat will never see eof on it. I understand the instance stays open. But if another process writes an EOF to it, it should also arrive at cat, right? So it just prefers to ignore it since there is bidirectional piping going on? Thanks! — phil294, Sep 17 '17 at 23:42
@Blauhirn, there's no such thing as "writing an EOF to a pipe". You see EOF on a pipe when all the writers are gone, period. Maybe you're thinking of the ^D termios setting. But that only applies to I/O to terminal devices, that is serial/pty character devices that have a tty line discipline attached to them (and only in some mode of operation), not pipes, not socketpairs, not regular files, not directories, not block devices, not other character devices. — Stéphane Chazelas, Sep 18 '17 at 06:37
@Blauhirn (continued), by opening the fifo in read+write mode, we have a fd to both ends of the pipe (not bidirectional; even on systems where regular pipes are bidirectional, named pipes have only one direction), so as there's always a fd at the other end (the same fd as the one we're reading from), we'll never see eof, as there's still a writer at the other end: us. — Stéphane Chazelas, Sep 18 '17 at 06:39

jimmij · Answer 2 · 2017-09-17T01:25:33.697

Let me propose another solution. The pipe stays available for reading as long as some process holds the writing end open. So you can start a dummy cat in the background (or in another terminal), for example:

mkfifo fifo
cat >fifo &
cat fifo

Now you can write to the fifo for as long as you want. When finished, kill the current cat with C-c, then use fg to bring the first cat back from the background, and finally press C-d to stop it.

edited Sep 17 '17 at 01:25

answered Sep 17 '17 at 01:20

In POSIX shells and when not interactive, when you start a command in background, its stdin becomes /dev/null, so cat will exit just after starting. In shells that don't do that, and/or if run from an interactive shell in a terminal, cat will likely be suspended as it attempts to read from the terminal while not being in the foreground. – Stéphane Chazelas Sep 17 '17 at 13:04
