
Running on Red Hat EL7, we get very long lines in our log files, so I run

tail -f Log | cut -c1-$COLUMNS

This works great on some systems, but on other, apparently identical, systems the pipe holds the data until the buffer is full. As I was typing this, SE suggested this answer, which, when used:

tail -f Log | stdbuf -oL cut -c1-$COLUMNS

does what I need, but I'd like to know what is different. I'd like the systems to behave the same, good or bad.

Is there a default buffering that has been set? How was it set and where?

Update: I opened two windows into a system where the problem occurs and tried:

while date; do usleep 500000 ; done | cut -c1-100

and got no output (until the buffer is full). In the other window, I ran strace on the cut process and got an endless series of:

read(0, "Wed Oct 26 13:04:12 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:12 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:13 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:13 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:14 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:14 CDT 2022\n", 4096) = 29
read(0, "Wed Oct 26 13:04:15 CDT 2022\n", 4096) = 29

The reads arrive promptly, so the data is clearly reaching cut; since nothing shows up on the terminal, I think that's pretty conclusive evidence that cut itself is doing the buffering. But how does it decide to do so?
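
To close the loop, tracing the write side of the same cut process shows the other half of the picture (a sketch, assuming the newest cut found by pgrep is the one in this pipeline): no write(1, ...) appears until the stdio buffer fills, so the data really is sitting inside cut rather than in the pipe.

# attach to the already-running cut and show only its write() calls
strace -e trace=write -p "$(pgrep -n cut)"
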

  • Which one do you mean, the output of tail or the output of cut? Because in what you're showing, tail -f is the only one writing to a pipe, and AFAIK it shouldn't do buffering. – ilkkachu Oct 26 '22 at 17:43
  • Any difference between your systems regarding /proc/sys/fs/pipe-max-size ? – MC68020 Oct 26 '22 at 17:47
  • @MC68020, I never heard of it but the values are the same everywhere: 1048576 – user1683793 Oct 26 '22 at 17:51
  • @ilkkachu All I can do is speculate about who is doing the buffering. I assumed it was the pipe, but I believe pipes don't work that way. The stdbuf command makes the data come out immediately on the systems where the problem occurs. Since the stdbuf precedes the cut, I assume the cut is doing the buffering. – user1683793 Oct 26 '22 at 18:00

1 Answer

The usual behaviour is that output to a terminal is line-buffered, and anything else is block-buffered. See e.g. the GNU glibc manual:

Newly opened streams are normally fully buffered, with one exception: a stream connected to an interactive device such as a terminal is initially line buffered. [...]
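
A quick way to see that rule in action (a sketch, assuming GNU coreutils so that sleep accepts fractional seconds): keep cut's stdout on the terminal in one run, and turn it into a pipe in the other by appending | cat.

# cut's stdout is the terminal: line-buffered, a date appears twice a second
while date; do sleep 0.5; done | cut -c1-100

# cut's stdout is a pipe (because of the trailing "| cat"): fully buffered,
# nothing shows up until a few kilobytes have accumulated
while date; do sleep 0.5; done | cut -c1-100 | cat
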

So something like

grep ... | cut ...

would have grep buffer its output, but not cut. You'd fix that by running stdbuf -o0 grep ... (or grep --line-buffered) instead, or by using one of the many other workarounds; see: Turn off buffering in pipe
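
Applied to a pipeline shaped like the one in the question, either form keeps the data flowing line by line (a sketch; "pattern" is just a placeholder):

# grep's own option flushes its output after every line
tail -f Log | grep --line-buffered pattern | cut -c1-$COLUMNS

# stdbuf disables grep's stdio output buffering entirely
tail -f Log | stdbuf -o0 grep pattern | cut -c1-$COLUMNS
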

On the other hand, tail -f shouldn't buffer its output.

Of course, that doesn't fit with what you said about using stdbuf -oL on cut fixing it; it should be line-buffering its output already. If it's going to a terminal, that is. If you had | cut ... > somefile that would be different.
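
For instance (a sketch; somefile is just a scratch file): redirecting cut into a regular file puts its stdout back to full buffering, so the file grows in block-sized bursts rather than line by line.

# in one terminal: cut's stdout is now a regular file, so it is fully buffered
while date; do sleep 1; done | cut -c1-100 > somefile

# in another terminal: the file grows in bursts, not one line at a time
tail -f somefile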

ilkkachu