5

I'm executing a program with an stdin redirection:

 $ prog < f

In that case, stdin is fully buffered.

Is there a trick to make it unbuffered or line buffered?

EDIT.

Without modifying the program source code (i.e. using setvbuf())

  • prog could read from stdin character by character, or by line, depending on what code prog uses, so I don't see how the input is fully buffered – thrig Dec 16 '17 at 15:13
  • Related: Turn off buffering in pipe. Slightly different question, same answer. – Mark Plotnick Dec 16 '17 at 15:15
  • @thrig, there's no way a program can read line by line as it can't know in advance where the newline characters will be in its input. It cannot read character by character either as it can't know in advance how many bytes each character will be. All it can adjust is how many bytes it reads at a time. – Stéphane Chazelas Dec 16 '17 at 23:00
  • @Hedi, what do you mean by unbuffering stdin? Input reading is always into a buffer. Are you wanting to adjust the size of that buffer, how many bytes prog reads at a time? What do you want to achieve with that? – Stéphane Chazelas Dec 16 '17 at 23:04

2 Answers

8

Unbuffering generally makes more sense for output. Output buffering is where an application holds on to its output until it has accumulated enough, so as to minimize the number of I/O operations.

On input, all an application can do is adjust how many bytes it reads from its input at a time (or rather, how many it requests, as it is not guaranteed to receive that many: fewer bytes may be available at the time, as with pipes or tty devices).
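
As a rough illustration of a read returning fewer bytes than requested, the sketch below asks dd for a single read of up to 4 bytes from a pipe whose writer pauses after 2 bytes. The helper name short_read_demo is made up for this example, and the result assumes dd gets to run within the one-second pause:

```shell
# Hypothetical helper: request up to 4 bytes in a single read, while the
# writer has only produced 2 so far, so dd should pass on just those 2.
short_read_demo() {
    { printf 'ab'; sleep 1; printf 'cd'; } | dd bs=4 count=1 2>/dev/null
}

short_read_demo
```

With the same four bytes in a regular file, that single read would be expected to return all four.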

With stdio, unbuffering an input stream means setting the size of that buffer to one byte.

Reading one byte at a time is inefficient and that's often not needed.

The case where it may be needed is when reading from a non-seekable input (like a pipe, so not your f if it's a regular file) and prog needs to stop reading at a given point in the input, so that another process can resume reading from that point.

For instance, in:

seq 10 | { grep -q 5; cat; }

you might want cat to output lines 6 to 10, that is, the lines after the point where grep stopped reading its input (here a pipe, so not seekable).

The command above outputs nothing, though, because grep has read all of seq's output in one gulp.

Note that if you had written:

{ seq 5; sleep 1; seq 6 10; } | { grep -q 5; cat; }

that would have worked. grep still requests a large read, but since only the first 5 lines were available at the time, it processed them and exited at the 5th. In other words, grep does not wait until its buffer is full (or EOF is reached) before it starts processing its input (the only command I know of that does something like that is mawk).
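
For contrast, a reader that really does take one byte at a time from a pipe is the shell's own read builtin, which must not consume anything past the newline. A minimal sketch, where skip_through is a hypothetical helper name:

```shell
# Hypothetical helper: consume stdin up to and including the first line
# equal to $1, leaving the rest of the input unread for the next command.
skip_through() {
    while IFS= read -r line; do
        [ "$line" = "$1" ] && return 0
    done
    return 1   # no such line: all of the input was consumed
}

seq 10 | { skip_through 5; cat; }
```

Here cat should print lines 6 to 10, at the cost of one read() per byte inside the loop.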

With some commands, on GNU and FreeBSD systems, you can adjust the input buffering with stdbuf -i. Using stdbuf -i0 (unbuffering) will be the same as stdbuf -i1 (read into a buffer of size 1) and cause the input to be read one byte at a time.

It doesn't work with GNU grep (which does not read its input through stdio), but it does with GNU sed:

$ seq 10 | { sed -n /5/q; cat; }
$ seq 10 | { stdbuf -i0 sed -n /5/q; cat; }
6
7
8
9
10
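
To find out whether a particular command responds to stdbuf -i0, the same experiment can be wrapped in a small probe. This is only a sketch: leftover is a hypothetical helper, and the result for any given command depends on how that command does its reading:

```shell
# Hypothetical helper: feed the numbers 1 to 10 to the given command, discard
# its output, and print whatever input it left unread in the pipe.
leftover() {
    seq 10 | { "$@" >/dev/null; cat; }
}

leftover cat                      # cat reads to EOF, so nothing is left
leftover stdbuf -i0 sed -n /5/q   # with GNU sed as above: 6 to 10
```

An empty result means the command swallowed the rest of the pipe before exiting.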

With strace, you can see the size of the read()s being adjusted:

$ seq 5 | { strace -e read sed -n /2/q; cat; }
[...]
read(0, "1\n2\n3\n4\n5\n", 4096)        = 10
+++ exited with 0 +++
$ seq 5 | { strace -e read stdbuf -i0 sed -n /2/q; cat; }
[...]
read(0, "1", 1)                         = 1
read(0, "\n", 1)                        = 1
read(0, "2", 1)                         = 1
read(0, "\n", 1)                        = 1
+++ exited with 0 +++
3
4
5
$ seq 5 | { strace -e read stdbuf -i1 sed -n /2/q; cat; }
[...]
read(0, "1", 1)                         = 1
read(0, "\n", 1)                        = 1
read(0, "2", 1)                         = 1
read(0, "\n", 1)                        = 1
+++ exited with 0 +++
3
4
5
1

No, there is not.

Your alteration of the question explicitly precludes the way to do it. One edits the program to do what one wants the program to do, or one uses a tool that hooks into the internals of the dynamic loader and C runtime library to arrange to call setvbuf at program startup.

If using the setvbuf function is not allowed, there is simply no way to do this. Calling setvbuf is what one needs to do.
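
stdbuf is one such tool. As far as I can tell, the GNU coreutils implementation works by setting LD_PRELOAD to a helper library (libstdbuf.so) and passing the requested modes in the _STDBUF_I, _STDBUF_O and _STDBUF_E environment variables; the helper then calls setvbuf before the program's main runs. Treat those names as an implementation detail that may differ on other systems. stdbuf_env below is a hypothetical helper that just prints what the child process would see:

```shell
# Hypothetical helper: run env under stdbuf with the given options and keep
# only the variables that stdbuf appears to add (assumes GNU coreutils).
stdbuf_env() {
    stdbuf "$@" env | grep -E '^(LD_PRELOAD|_STDBUF_[IOE])='
}

stdbuf_env -i0
```

This also suggests why it cannot help with statically linked programs or with programs that bypass stdio for their input.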

JdeBP
  • The poster specified that they didn't want to edit the program source code to use setvbuf, but they didn't specify that setvbuf couldn't be called in some other way that doesn't involve editing the code, e.g. using stdbuf(1). – Colin Watson Dec 17 '17 at 01:06
  • That's not how "i.e. using setvbuf" reads to me. Using is using, whether via stdbuf or otherwise. – JdeBP Dec 17 '17 at 02:53
  • Given that that was a parenthesis clarifying "Without modifying the program source code", I think the poster wrote "i.e." when they meant "e.g." - a very common error. – Colin Watson Dec 18 '17 at 10:03