
I struggle to understand the effects of the following command:

yes | tee hello | head

On my laptop, the number of lines in 'hello' is of the order of 36000, much higher than the 10 lines displayed on standard output.

My questions are:

  • When does yes, and, more generally, a command in a pipe, stop?

  • Why is there a mismatch between the two numbers above? Is it because tee does not pass the lines one by one to the next command in the pipe?

2 Answers

:> yes | strace tee output | head
[...]
read(0, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
write(3, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
read(0, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=5202, si_uid=1000} ---
+++ killed by SIGPIPE +++

From man 2 write:

EPIPE
fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal.

So the processes die right to left. head exits on its own, tee gets killed when it tries to write to the pipeline the first time after head has exited. The same happens with yes after tee has died.
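
The right-to-left order can be observed from the shell. What follows is only a minimal sketch, assuming a POSIX sh and a yes that does not ignore SIGPIPE; the scratch directory and the file names yes_status and out are made up for the illustration. A shell reports a process killed by a signal with an exit status of 128 plus the signal number, so 141 would point to SIGPIPE (signal 13 on Linux).

```shell
#!/bin/sh
# Hypothetical scratch directory for this demonstration.
tmp=$(mktemp -d)

# Run yes inside a subshell so that its exit status can be recorded
# once head has exited and the read end of the pipe is closed.
( yes; echo "$?" > "$tmp/yes_status" ) | head -n 3 > "$tmp/out"

# Some shells wait only for the last command of a pipeline;
# give the subshell a moment to record the status.
sleep 1

shown=$(wc -l < "$tmp/out" | tr -d ' ')
yes_status=$(cat "$tmp/yes_status")

echo "lines printed by head: $shown"
# A status of 141 (128 + 13) would mean yes was killed by SIGPIPE.
echo "exit status of yes:    $yes_status"
```

If SIGPIPE happens to be ignored in the calling environment, yes should instead see the EPIPE error from write and exit with an ordinary non-zero status.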

tee can write to the pipe only until the pipe buffer is full, but it can write as much as it likes to a file. It seems that my version of tee writes each block to both stdout and the file, so the file receives whole blocks rather than just the 10 lines that head prints.

head has 8K in its (i.e. the kernel's) read buffer. It reads all of it but prints only the first 10 lines because that's its job.
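
The mismatch from the question can be reproduced and measured with a short script. This is a sketch under assumptions: a POSIX sh, and a scratch directory with the file names hello and stdout chosen only for the illustration. The line count of the file depends on scheduling and buffer sizes, so it will differ between runs and systems.

```shell
#!/bin/sh
# Hypothetical scratch directory for this demonstration.
tmp=$(mktemp -d)

# head stops after 10 lines; tee keeps copying whole blocks to the file
# until one of its writes to the closed pipe fails.
yes | tee "$tmp/hello" | head > "$tmp/stdout"

# Some shells wait only for the last command of a pipeline;
# give tee a moment to finish.
sleep 1

printed=$(wc -l < "$tmp/stdout" | tr -d ' ')
saved=$(wc -l < "$tmp/hello" | tr -d ' ')
bytes=$(wc -c < "$tmp/hello" | tr -d ' ')

echo "lines on standard output: $printed"
echo "lines in the file:        $saved"
echo "bytes in the file:        $bytes"
```

Each line produced by yes is 2 bytes ("y" plus a newline). If tee copies 8192-byte blocks as in the strace output above, one would expect the byte count to be a multiple of 8192; for example, nine such blocks are 73728 bytes, or 36864 lines, which would fit the figure reported in the question.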

Hauke Laging
  • 1
    I think the 8 k here is the internal buffer used by tee, not the kernel's buffer. If we do something like yes | strace tee output | (sleep 1; head), we'll see that tee writes more than that to the pipe before blocking on the write, 64 k on my system (that seems to be the pipe buffer size according to the man page). In the non-sleep case, it's just that head gets to run immediately, and closes the pipe. – ilkkachu Jan 14 '18 at 15:54
  • 1
    The stdout of tee is unbuffered. There are two sets of buffers downstream of it, a kernel buffer that comprises the pipe and the stdin buffering in the standard library of the head process. ilkkachu's command line, which I was just about to suggest, demonstrates that the pipe buffer itself can take more than 8KiB. The 8KiB gulp that head takes is the GNU C run-time library filling up the internal buffer in the stdin stream. (Run this on a BSD, and you'll see the BSD C RTL using different stdin buffer sizes.) – JdeBP Jan 14 '18 at 16:09

A program that writes to a pipe receives a SIGPIPE signal on its next write after the pipe's reader has terminated, and tee(1) will not terminate as long as its standard input stays open.

head(1) outputs 10 lines by default.

  • 1
    "tee(1) will not terminate as long as its standard input stays open" – That is not correct; see the strace output in my answer. Different versions of tee may behave differently, though. – Hauke Laging Jan 14 '18 at 15:33