dash: Pipe STDIN to multiple commands and their output to STDOUT in defined order

Question

At first I thought this answer was the solution, but now I think I need a temporary file as a buffer.

This works unreliably:

#!/bin/sh
echo 'OK' |
{
    {
        tee /dev/fd/3 | head --bytes=1 >&4
    } 3>&1 | tail --bytes=+2 >&4
} 4>&1

When I run this in a terminal, sometimes I get:

OK

and sometimes I get:

K
O

The order seems totally random. As a workaround, I'm writing the output of tail to a file and reading it back to stdout after the pipeline has finished.

#!/bin/sh
echo 'OK' |
{
    {
        tee /dev/fd/3 | head --bytes=1 >&4
    } 3>&1 | tail --bytes=+2 >file
} 4>&1
cat file

Can this be done in dash without temporary files? Shell variables as a buffer aren't an option either, as the output might contain NUL bytes.

Possible duplicate of How can I send stdout to multiple commands? – Sergiy Kolodyazhnyy, Feb 05 '17 at 18:21
There are multiple other solutions in the post, including named pipes and process substitution with bash. Try them first – Sergiy Kolodyazhnyy, Feb 05 '17 at 18:27
Thanks @Serg, as I'm using dash I don't have access to process substitution. (I added dash to the title.) With named pipes I have the exact same problem (arbitrary order of execution). – Kontrollfreak, Feb 05 '17 at 18:32
OK, very good. That's going to be relevant info. I'll see what I can do about dash then – Sergiy Kolodyazhnyy, Feb 05 '17 at 18:35
Any time you're running multiple processes in parallel the ordering of output is going to be unspecified. – Michael Homer, Feb 05 '17 at 19:28
That's what I feared. So I'm going to need a buffer. Guess I'll fix something with mktemp -d on /dev/shm. Any better ideas? – Kontrollfreak, Feb 05 '17 at 20:09
So, I've posted an answer on the linked question. http://unix.stackexchange.com/a/342717/85039 See if that works for you. Of course, it's for relatively small output and doesn't work if you have a huge amount of data that you need to give to multiple commands, but in your specific case it should work. I tested that with dash on my system as well – Sergiy Kolodyazhnyy, Feb 05 '17 at 20:23

Kusalananda · Answer 1 · 2017-09-17T14:43:36.970

The best solution is to use temporary files. This makes the code readable and easy to understand when process substitution is not an option.

tmpfile=$(mktemp)

producer | tee "$tmpfile" | consumer1
consumer2 <"$tmpfile"

rm -f "$tmpfile"

or even

tmpfile=$(mktemp)

producer >"$tmpfile"

consumer1 <"$tmpfile"
consumer2 <"$tmpfile"

rm -f "$tmpfile"
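As a concrete sketch of the second variant, applied to the example from the question: the function name run_in_order is made up for illustration, its stdin stands in for producer, and head -c is assumed to be available (it is a common extension, provided by GNU, BSD and busybox).

```sh
#!/bin/sh
# Hypothetical helper: buffer stdin in a temporary file, then run the
# two consumers one after the other so their output order is fixed.
run_in_order() {
    tmpfile=$(mktemp) || return
    cat >"$tmpfile"            # "producer": whatever arrives on stdin
    head -c 1 <"$tmpfile"      # consumer1: first byte
    tail -c +2 <"$tmpfile"     # consumer2: everything from byte 2 on
    rm -f -- "$tmpfile"
}

printf 'OK\n' | run_in_order
```

Because the consumers run strictly one after the other, the bytes come out in their original order on every run.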

edited Sep 17 '17 at 14:43

answered Sep 17 '17 at 13:07


bash doesn't use temp files for process substitution. Only temporary named pipes on systems that don't support /dev/fd/n. Only zsh (with the =(...) syntax) or fish (with psub -f) can use temp files for process substitutions. It's true though that the solution here is to use temp files (not fifos) as we need to run the commands one after the other, not in parallel. Other option would be to run the commands in parallel but with their output to different files which are concatenated afterwards (the first one could be left unredirected). That's what GNU parallel does. – Stéphane Chazelas Sep 17 '17 at 13:24

Stéphane Chazelas · Accepted Answer · 2017-09-17T14:26:36.540

If you wanted to run the consumers and producer in parallel, but serialize the output of the consumers, you'd need to delay the output of the second consumer. For that, you'd need to store its output somehow and the best way is with a temporary file.

With zsh:

{cat =(producer > >(consumer1 >&3) | consumer2)} 3>&1

bash has an issue in that it doesn't wait for the process substitution commands, so you'd have to use nasty workarounds there.

Here, we're using the =(...) form of process substitution to store the output of consumer2 in a temporary file and cat it afterwards. We can't do that for more than 2 consumers. For that, we'd need to create the temp files by hand.

When not using =(...), we'd have to handle the cleanup of the temp files by hand. We can handle that by creating and deleting them up front so as not to have to worry about the cases where the script is killed. Still with zsh:

tmp1=$(mktemp) && tmp2=$(mktemp) || exit
{
  rm -f -- $tmp1 $tmp2
  producer > >(consumer1) > >(consumer2 >&3) > >(consumer3 >&5)
  cat <&4 <&6
} 3> $tmp1 4< $tmp1 5> $tmp2 6< $tmp2
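The "create and delete up front" trick can be shown in isolation. This is a minimal sketch in plain sh rather than zsh, with a made-up function name (unlinked_buffer_demo): once fds 3 and 4 are open on the file, removing its name does not discard the data, and since fd 4 was opened separately it reads from the start of the file.

```sh
#!/bin/sh
# Hypothetical demo: a temp file that is unlinked immediately but still
# usable as a buffer through file descriptors opened beforehand.
unlinked_buffer_demo() {
    tmp=$(mktemp) || return
    {
        rm -f -- "$tmp"            # the name is gone; fds 3 and 4 keep the data reachable
        printf 'buffered\n' >&3    # write through fd 3
        cat <&4                    # read back through fd 4, starting at offset 0
    } 3>"$tmp" 4<"$tmp"
}

unlinked_buffer_demo
```

Nothing is left on disk to clean up afterwards, even if the script is killed while the braces are still running.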

Edit (I initially missed the fact that a solution for dash was required)

For dash (or any POSIX shell that doesn't set the close-on-exec flag on fds above 2 and uses pipes and not socketpairs for |), and on systems with /dev/fd/x support:

tmp1=$(mktemp) && tmp2=$(mktemp) || exit
{
  rm -f -- "$tmp1" "$tmp2"
  {
    {
      {
        producer | tee /dev/fd/4 /dev/fd/6 | consumer1 >&7
      } 4>&1 | consumer2 >&3
    } 6>&1 | consumer3 >&5
  } 7>&1
  cat - /dev/fd/6 <&4
} 3> "$tmp1" 4< "$tmp1" 5> "$tmp2" 6< "$tmp2"

That would work with dash, bash, zsh, mksh, busybox sh, posh on Linux, but not ksh93. That approach can't go beyond 4 consumers as we're limited to fds 0 to 9.
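For reference, here is a reduced two-consumer sketch of the same approach that can be run as-is. It assumes /dev/fd/x support as above; the function name ordered_fanout and the two tr commands standing in for consumer1 and consumer2 are placeholders chosen for illustration, and the function's stdin plays the role of producer.

```sh
#!/bin/sh
# Hypothetical helper: feed stdin to two consumers running in parallel,
# but emit consumer1's output before consumer2's.
ordered_fanout() {
    tmp=$(mktemp) || return
    {
        rm -f -- "$tmp"
        {
            {
                # tee copies stdin to fd 4 (the pipe to consumer2) and to consumer1
                tee /dev/fd/4 | tr '[:lower:]' '[:upper:]' >&7
            } 4>&1 | tr '[:upper:]' '[:lower:]' >&3
        } 7>&1     # fd 7: the real stdout, used by consumer1
        cat <&4    # replay consumer2's buffered output
    } 3>"$tmp" 4<"$tmp"
}

printf 'Ok\n' | ordered_fanout
```

Both stand-in consumers read their whole input. A consumer that exits early, such as head, can cause tee to be killed by SIGPIPE before it has passed everything on to the other consumer, so larger inputs may need extra care.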
