16

If I have a pipe:

process1 | process2

And process1 generates gigabytes and gigabytes of data very quickly, but process2 needs to send that data across a network and is therefore much slower. Does this:

  • Slow down process1's execution? Or
  • Buffer the data somewhere until process2 can read it?

If the data is being buffered, is it by the kernel? Is it in memory, or on disk? How big is this buffer? What happens when the buffer overflows?

John
  • 467

1 Answer

25

The slower process limits the speed of the faster process. A pipe is a fixed-size buffer held by the kernel in memory, not on disk (512 bytes to 64 KiB in size, depending on the kernel and any adjustments by the process using fcntl(2) operations). It sits between a process that's putting bytes into the buffer with write() calls and another process that's pulling bytes out of the buffer with read() calls.
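To see what size the buffer actually is on a given machine, a process can ask the kernel. The following is a minimal sketch that assumes Linux: F_GETPIPE_SZ and F_SETPIPE_SZ are Linux-specific fcntl(2) operations, Python only exposes them by name from 3.10 on, and the helper names pipe_capacity and resize_pipe are made up for this illustration.

```python
import fcntl
import os

# Linux-only operations. Python 3.10+ defines them in the fcntl module;
# the numeric fallbacks are the values from Linux's <fcntl.h> (assumption:
# this runs on Linux, other kernels do not support these operations).
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def pipe_capacity(fd):
    """Return the current capacity, in bytes, of the pipe behind fd."""
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


def resize_pipe(fd, size):
    """Ask the kernel for a capacity of at least `size` bytes.

    The kernel may round the request up, so the capacity it actually
    chose is returned.
    """
    return fcntl.fcntl(fd, F_SETPIPE_SZ, size)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    print("default capacity:", pipe_capacity(write_fd))
    print("after resize:", resize_pipe(write_fd, 4096))
    os.close(read_fd)
    os.close(write_fd)
```

The same calls work on a pipe inherited from the shell, for example on file descriptor 1 of process1, as long as that descriptor really is a pipe.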

Both the write() and read() calls pass control to the kernel. If the reading process calls read() on an empty pipe buffer, the kernel won't return from the read() until the other process puts bytes in, or closes its end of the pipe (its stdout file descriptor), in which case read() returns 0 to signal end-of-file. At the other end, if the writing process calls write() on a full pipe buffer, the kernel won't add the new bytes to the buffer and return from the write() until the other process pulls bytes out, or closes its end (its stdin file descriptor), in which case the writer is sent SIGPIPE, or write() fails with EPIPE if that signal is ignored.
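The two "closed" cases can be tried out in a single process. This is only a sketch: the function names are invented for the example, and it relies on the fact that Python ignores SIGPIPE by default, so a write to a pipe with no reader surfaces as BrokenPipeError (EPIPE) instead of killing the process.

```python
import os


def read_after_writer_closes():
    """Show what read() returns once the write end has been closed."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"last bytes")
    os.close(write_fd)
    first = os.read(read_fd, 1024)   # data still in the buffer is delivered
    second = os.read(read_fd, 1024)  # then an empty result means end-of-file
    os.close(read_fd)
    return first, second


def write_after_reader_closes():
    """Show what write() does once the read end has been closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)
    try:
        os.write(write_fd, b"nobody is listening")
        return "written"
    except BrokenPipeError:
        # EPIPE: the kernel refuses the write instead of blocking forever.
        return "EPIPE"
    finally:
        os.close(write_fd)


if __name__ == "__main__":
    print(read_after_writer_closes())
    print(write_after_reader_closes())
```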

So the effect is that the faster process's performance is constrained by the performance of the slower process. The kernel doesn't allow the pipe buffer to overflow, nor to underflow.
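The throttling is easy to observe with a small experiment. The sketch below uses two threads in place of process1 and process2; run_pipeline, the 1 MiB payload, and the 10 ms pause are arbitrary choices for illustration, not anything taken from a real workload.

```python
import os
import threading
import time


def run_pipeline(total_bytes=1 << 20, reader_delay=0.01, chunk=64 * 1024):
    """Return the seconds the writer needed to push total_bytes into a pipe."""
    read_fd, write_fd = os.pipe()

    def slow_reader():
        # Stand-in for process2: drain the pipe, pausing after every read.
        while os.read(read_fd, chunk):
            time.sleep(reader_delay)
        os.close(read_fd)

    reader = threading.Thread(target=slow_reader)
    reader.start()

    # Stand-in for process1: write as fast as the kernel allows.
    data = b"x" * chunk
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        sent += os.write(write_fd, data)  # blocks whenever the pipe is full
    elapsed = time.monotonic() - start

    os.close(write_fd)  # the reader sees end-of-file and exits
    reader.join()
    return elapsed


if __name__ == "__main__":
    print("slow reader: %.3f s" % run_pipeline(reader_delay=0.01))
    print("fast reader: %.3f s" % run_pipeline(reader_delay=0.0))
```

The writer does no work besides calling write(), yet with the slow reader it should take noticeably longer, because most of its time is spent blocked inside the kernel waiting for room in the buffer.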

Sotto Voce
  • 4,131
  • Note that the pipe buffer size varies wildly from kernel to kernel. "Around 1k" is right, but the real range is 512b - 64k, looking at various versions of BSD and Linux. I recall some older unixes use the page size for pipe buffer size, but this actually seems uncommon today. – user10489 Oct 14 '22 at 04:12
  • 2
    On modern-ish Linux, the default size is 64 KiB on most hardware but can be modified; see man 7 pipe: "Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with a page size of 4096 bytes). Since Linux 2.6.35, the default pipe capacity is 16 pages, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations." – marcelm Oct 14 '22 at 09:40
  • marcelm and user10489 thank you, I'll edit my answer – Sotto Voce Oct 14 '22 at 11:26
  • 1
    If this causes you a problem, there is a tool called pv that will let you set an in-memory buffer of any size to allow the first process to run at full speed. – David Schwartz Oct 16 '22 at 07:27
  • 1
    Apparently you can change the buffer size of a pipe with an fcntl call, up to the limit specified in /proc/sys/fs/pipe-max-size, which, in turn, can be changed with sufficient privileges; these days it appears to be 1 MB by default in Linux. – Peter - Reinstate Monica Oct 16 '22 at 17:25