
The issue of jq needing an explicit filter when the output is redirected is discussed all over the web. But I'm unable to redirect output if jq is part of a pipe chain, even when an explicit filter is in use.

Consider:

touch in.txt
tail -f in.txt | jq '.f1'
# in a different terminal:
echo '{"f1":1,"f2":2}' >> in.txt
echo '{"f1":3,"f2":2}' >> in.txt

As expected, the output in the original terminal from the jq command is:

1
3

But if I add any sort of redirection or piping to the end of the jq command, the output goes silent:

rm in.txt
touch in.txt
tail -f in.txt | jq '.f1' | tee out.txt
# in a different terminal:
echo '{"f1":1,"f2":2}' >> in.txt
echo '{"f1":3,"f2":2}' >> in.txt

No output appears in the first terminal and out.txt is empty.

I've tried hundreds of variations, but the issue remains elusive. The only workaround I've found, discovered through mosquitto_sub and The Things Network (which is also where I first ran into the issue), is to wrap the tail and jq commands in a shell script:

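#!/bin/bash
# Feed each new line of the file named by $1 to its own jq process;
# each jq exits right after its line, which flushes its output.
tail -f "$1" | while IFS='' read -r line; do
  printf '%s\n' "$line" | jq '.f1'
done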

Then:

./tail_and_jq.sh in.txt | tee out.txt
# in a different terminal:
echo '{"f1":1,"f2":2}' >> in.txt
echo '{"f1":3,"f2":2}' >> in.txt

And sure enough, the output appears:

1
3

This is with the latest jq installed via Homebrew:

$ echo $SHELL
/bin/bash
$ jq --version
jq-1.5
$ brew install jq
Warning: jq 1.5_3 is already installed and up-to-date

Is this a (largely undocumented) bug in jq, or a gap in my understanding of pipe chains?

  • FWIW you have a fairly (well, slightly) strange setup here, using tail -f to provide continuous input to a program and tee to process the output. If you were still in need of an answer, I would have suggested simplifying the chain to <in.json jq '.f1' >out.json so that you could narrow down what's causing it. – David Z Apr 04 '18 at 07:23
  • See also BashFAQ #9 - What is buffering? Or, why does my command line produce no output: tail -f logfile | grep 'foo bar' | awk ... – Charles Duffy Apr 04 '18 at 16:48
  • All great advice for future efforts, thank you. FWIW, the tail bit came about from efforts to break the pipe down (run the first command, tee and redirect to file, tail that, pipe to next command, redirect to file, etc) and run it continuously in sections. The < is a good tool to keep in mind though. – Heath Raftery Apr 05 '18 at 06:59

2 Answers


The output from jq is buffered when its standard output is not a terminal.

To make jq flush its output buffer after every object, use its --unbuffered option, e.g.

tail -f in.txt | jq --unbuffered '.f1' | tee out.txt

From the jq manual:

--unbuffered

Flush the output after each JSON object is printed (useful if you're piping a slow data source into jq and piping jq's output elsewhere).
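As a self-contained way to try this without tail -f and a second terminal, here is a minimal sketch. The slow_source function is a hypothetical stand-in for any slow data source, and the guard simply skips the pipeline when jq is not installed:

```shell
#!/bin/bash
# Hypothetical stand-in for a slow data source: two JSON objects,
# one per line, with a pause between them.
slow_source() {
  printf '%s\n' '{"f1":1,"f2":2}'
  sleep 1
  printf '%s\n' '{"f1":3,"f2":2}'
}

# Run the pipeline only if jq is available on this machine.
if command -v jq >/dev/null 2>&1; then
  # --unbuffered makes jq flush after each object, so tee receives
  # each value as it is printed rather than when jq exits.
  slow_source | jq --unbuffered '.f1' | tee out.txt
fi
```

Dropping --unbuffered from this sketch should still produce the same final out.txt, because the source ends and jq flushes when it exits; the difference only shows in when each value appears.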

Kusalananda
  • Further, the way I would debug this, in order to figure out that output buffering was the issue, assuming I wouldn't simply guess that, would be to run the 'jq' portion under 'ltrace' and/or 'strace'. It would be obvious that it's calling C stdio output functions, but not calling the write(2) syscall. – AnotherSmellyGeek Apr 04 '18 at 08:17
  • @AnotherSmellyGeek Possibly, or the equivalent tracing utility on our Unices (note that the OP is using Homebrew, which means they're on macOS, and I'm on OpenBSD, neither of which has these Linux tools). Another possibility is to just know that output buffering may happen under certain circumstances :-) – Kusalananda Apr 04 '18 at 08:23
  • Brilliant. And really appreciate all the advice on debugging this in the future. Buffering was one of my first doubts, but the different behaviour for piping was flummoxing my debugging efforts. – Heath Raftery Apr 05 '18 at 06:55

What you're seeing here is C stdio buffering in action. Output is held in a buffer until it reaches a certain size (which might be 512 bytes, 4 KB, or larger) and is then written out all at once.

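When stdout is connected to a terminal, stdio typically switches to line buffering, so each line appears as soon as it is written. When stdout is connected to a pipe (as in your case) or a file, output is fully buffered and appears only once the buffer fills or the program exits.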
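The terminal-versus-pipe distinction can be observed from the shell itself. The sketch below uses the standard `[ -t 1 ]` test, which is true when file descriptor 1 (stdout) is a terminal; it is similar in spirit to the check stdio makes when picking a buffering mode, though not the same code. The function name stdout_kind is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not part of jq or any library):
# report what this function's stdout is attached to.
stdout_kind() {
  if [ -t 1 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# Called directly, the answer depends on where the script's own
# stdout goes (an interactive shell gives "terminal").
stdout_kind

# Called at the head of a pipeline, stdout is a pipe.
stdout_kind | cat
```

A program in the middle of a pipe chain, such as jq in tail -f in.txt | jq '.f1' | tee out.txt, is in the second situation.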

The usual way to disable/control buffering is using the setvbuf() function (see this answer for more details), but that would need to be done in the source code of jq itself, so maybe not something practical for you...

There's a workaround... (A hack, one might say.) There's a program called "unbuffer", that's distributed with "expect" that can create a pseudo-terminal and connect that to a program. So, even though jq will still be writing to a pipe, it will think it's writing to a terminal, and the buffering effect will be disabled.

Install the "expect" package, which should come with "unbuffer", if you don't already have it... For instance, on Debian (or Ubuntu):

$ sudo apt-get install expect

Then you can use this command:

$ tail -f in.txt | unbuffer -p jq '.f1' | tee out.txt

See also this answer for some more details on "unbuffer", and you can find a man page here too.

filbranden