
I have a bash script containing a group of commands in curly braces { ... }. The group starts with a few echo commands and then runs one loop. Each iteration of the loop executes various slow commands (basically curl plus some extra parsing). Each iteration is slow (because of the network interaction) but prints one line (of Python code); as far as I can see, there should be no buffering issue coming from the commands themselves, because they finish their job and exit.

The whole group of commands is piped to python -u (I also tried with tail -f in order to check), and evidently the whole loop is executed before anything is read by python -u or tail -f.

I know how to unbuffer (when possible) a single command with various tools like stdbuf, but I don't think that can help here, because the issue seems to come from the command grouping rather than from any particular command.
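For reference, the per-command approach mentioned here looks like this (a minimal sketch; tr merely stands in for the slow curl/parsing commands, which are not reproduced from the question):

```shell
# stdbuf (GNU coreutils) re-runs a dynamically linked command with its
# stdio buffering modes changed; -o0 makes its stdout fully unbuffered.
# tr is only a stand-in for the real slow filter.
printf 'line one\nline two\n' | stdbuf -o0 tr 'a-z' 'A-Z'
```

Note that stdbuf wraps exactly one binary; it cannot be applied to a { ... } group as a whole, which is the crux of the question.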

Any hint?

2 Answers


(Note to future readers: the tone of exasperation here is not for the question, but for the mistakes I made trying to answer it and the multiple edits they entailed.)

Oh, for pity's sake. The problem is in tail -f. This works just fine:

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
        sleep 0.5
        /bin/echo $i
    done;
} | cat
printf 'bye\n'

It's not the pipe, it's not the group. It's tail. As in, chasing our own tails!

So, tail -f failed because it doesn't forward its output right away for some reason. I'm not sure why python -u is failing, but I don't think it's anything in the script. Maybe try unbuffer with it. At least try your script with cat, and verify that the output is unbuffered in that case.
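One way to check for yourself (a quick sketch, not from the original answer): have the reader stamp each line as it arrives. With an unbuffered pipeline the stamps are about half a second apart; if the group's output were buffered, all the stamps would be identical.

```shell
# The while-read loop plays the role of python -u / tail -f and
# prefixes each line with the time the reader received it.
{
    for i in 1 2 3; do
        sleep 0.5
        echo "line $i"
    done
} | while IFS= read -r line; do
    printf '%s %s\n' "$(date +%H:%M:%S)" "$line"
done
```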


Earlier failed attempt intentionally left here so future readers can make sense of the comments.

This script exhibits the same kind of buffering problem you're getting:

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
        sleep 0.5
        printf '%s\n' $i
    done;
} | tail -f
printf 'bye\n'

This one does not: output inside the group is redirected to stderr, then stderr from the whole group is piped to the command. Since it's stderr, it's unbuffered.

#!/bin/bash
printf 'hi\n'
{
    for i in 1 2 3 4; do
        sleep 0.5
        printf '%s\n' $i 1>&2
    done;
} |& tail -f
printf 'bye\n'

Adapted from Wang HongQin's answer in this question. The difficulty was in finding a way to unbuffer the pipe with braces rather than an explicit command. Had to fiddle around a while to get the redirection working properly.
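(As an aside: in bash, |& is shorthand for 2>&1 |, so both the stdout and the stderr of the group go down the pipe. A minimal illustration, not part of the original answer — sort just makes the interleaved order deterministic:)

```shell
# `cmd |& next` is bash shorthand for `cmd 2>&1 | next`:
# both streams of the group reach sort.
{ echo out; echo err 1>&2; } |& sort
# prints "err" then "out"
```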

Tom Zych
  • Hummm... You actually see very well the issue. I see the idea (and I also read the original post by Wang HongQin), but after having tried your very own example by using printf 'print %s\n' $i and by replacing tail -f by python -u, it looks like the print statements are getting displayed without being read by python. I don't think the output is really piped, though it looks like it is (since tail -f is supposed to print something on the screen which happens to be the case here). I am pretty sure that tail in your example doesn't get anything. – Thomas Baruchel Nov 28 '15 at 16:47
  • Yes, sorry, stuff was going out via stderr. I realized it after I posted and had to delete while I worked it out. Revision should work. – Tom Zych Nov 28 '15 at 16:48
  • Don't be sorry; I learn much here ;-) But I still don't think that the new version actually works :-( Could you try with python -u and by replacing your four numbers by a valid python print statement? – Thomas Baruchel Nov 28 '15 at 16:50
  • Sigh, no, you're right. I had turned off the sleep for testing and forgot to make sure that output was unbuffered. I have RL stuff now and can't work on this. I'll leave it up, maybe someone else can start from here and work out the bugs. – Tom Zych Nov 28 '15 at 16:56
  • these examples are not at all pertinent to the question, unless your printf is /bin/printf (though i would doubt it even in that case). the shell doesn't do output buffering in the same way most programs do, and sleep doesn't buffer writes at all because it doesn't do any. – mikeserv Nov 28 '15 at 20:33
  • @mikeserv: The sleep was just to delay things so I could tell whether it was buffering or not. The printf issue occurred to me too, so I changed it to /bin/echo for future testing. Haven't solved it yet, though. I don't see a way to pipe stderr directly and I suspect this whole approach is unworkable. Trying something else now. Using stdbuf -o0 /bin/echo failed too. – Tom Zych Nov 28 '15 at 23:40
  • the problem is not in tail -f exactly - it does what you'd expect for a program that has to read the same file over time even though it's already reached end of file over and over - it loops over it. obviously it's not going to check the file for new information all of the time - that would be terribly wasteful - the typical tail -f implementation only checks periodically (GNU tail sleeps about a second between checks). – mikeserv Nov 29 '15 at 02:47
  • You are perfectly right with your cat proof; thus my issue obviously comes from the behaviour of python -u which doesn't seem to parse its stdin as unbuffered. – Thomas Baruchel Nov 29 '15 at 10:39
  • I am not sure that the code with cat "works" for the reason you think. This code { echo foo >> mylog; echo bar; echo foo >> mylog; } | cat >> mylog creates mylog with lines foo, foo, bar, indicating that what was piped to cat was buffered. On the other hand, { echo foo >> mylog; echo bar; sleep 0.5; echo foo >> mylog; } | cat >> mylog produces mylog with lines foo, bar, foo. This suggests that somehow sleep flushes the buffer. – user102008 Nov 22 '22 at 21:45

you just have to do:

{   stdbuf -o0 curl ...
    stdbuf -o0 whatever ...
}|  tail -f

...which will work for dynamically linked applications, though i'm pretty sure curl includes its own unbuffer switch of some kind.
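A runnable illustration of this stdbuf-per-command approach (awk stands in for the slow commands, since curl would need network access; for curl specifically, the switch alluded to above is -N / --no-buffer):

```shell
# stdbuf -o0 disables stdout buffering for a dynamically linked program,
# so each line leaves awk as soon as it is produced instead of when the
# stdio buffer fills or the program exits.
printf 'a\nb\nc\n' | stdbuf -o0 awk '{ print toupper($0) }'
```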

mikeserv