
Given the pipeline

a | b | c

how might I alter b so that it aborts the pipeline if b generates an error or matches a particular pattern in the input stream?

  • Simply have b terminate.

    a will be killed by a SIGPIPE signal when trying to write to the left pipe, and c will get an EOF when trying to read from the right pipe. In bash (but not in the shell in general), you can get the exit status of b from the PIPESTATUS array.

    –  Dec 09 '19 at 03:00
  • How do I make b terminate? – Derek Mahar Dec 09 '19 at 03:13
  • Call exit from within it. Or let it commit ritual suicide: b(){ sed /die/q && kill "$BASHPID"; }; printf '%s\n' pass die oops | b | cat; echo "${PIPESTATUS[@]}" ;-) –  Dec 09 '19 at 03:30
  • @mosvy, I confirmed that b() aborts the pipeline using the ritual suicide operation kill "$BASHPID" or with exit 1. – Derek Mahar Dec 09 '19 at 04:25
  • @mosvy, do you want to promote your comment to an answer? – Derek Mahar Dec 09 '19 at 04:26
  • @mosvy b terminating would not terminate the pipeline in the (very degenerate and unlikely) case where there is no actual I/O between the processes in the pipeline. – Kusalananda Dec 09 '19 at 07:03
  • @Kusalananda in which case you can turn job control on and kill the process group that all the processes in the pipeline belong to –  Dec 09 '19 at 07:05
  • @mosvy, how can you determine the process group of the pipeline processes? – Derek Mahar Dec 09 '19 at 07:22
  • There's more than one way to do it. cat | cat | pkill -g0 | cat | cat will kill all 4 cats before they are killed by SIGPIPE when trying to write to a pipe with no reader, or exit with status 0 because of EOF. ps -ho pgrp "$BASHPID" will tell you the process group $BASHPID is in. You can also get the same info directly from /proc/<pid>/stat{,us}. –  Dec 09 '19 at 15:41
  • In a script (with no job control by default), you can also use a subshell to group processes for the purpose of killing them -- bash will always use separate processes for (...) subshells, and pkill and pgrep are able to find processes by their parent. –  Dec 09 '19 at 15:42
  • @mosvy, can you think of a way that an intermediate node in the pipeline might buffer the output and send it to final sink node c only if the input stream from a does not contain "die"? I tried using an intermediate "sponge" that https://unix.stackexchange.com/questions/337055/a-program-that-could-buffer-stdin-or-file describes following sed /die/{q 1} || pkill -g0, but the pipeline is subject to race conditions where sometimes it terminates and discards the input stream while other times c receives some input. – Derek Mahar Dec 09 '19 at 22:47
  • @mosvy, printf '%s\n' pass die oops | { file=$(mktemp); trap "rm $file" EXIT; sed '/die/{q 1}' > $file && cat $file || exit 2; } | cat; echo "${PIPESTATUS[@]}"; is an extension of your solution where node b in the pipeline discards the entire input stream if it encounters string "die". – Derek Mahar Dec 10 '19 at 17:11
  • Even shorter version that uses a buffer variable instead of a temporary file: printf '%s\n' pass die oops | { input=$(sed '/die/{q 1}') && echo "$input" || exit 2; } | cat; echo "${PIPESTATUS[@]}"; – Derek Mahar Dec 10 '19 at 17:24

1 Answer

@mosvy's very helpful comment is mostly correct, but its b() always aborts the pipeline, whether or not sed /die/q encounters "die":

Input stream contains "die"

$ b(){ sed /die/q && kill "$BASHPID"; }; printf '%s\n' pass die oops | b | cat; echo "${PIPESTATUS[@]}"
pass
die
0 143 0

Input stream does not contain "die"

$ b(){ sed /die/q && kill "$BASHPID"; }; printf '%s\n' pass oops | b | cat; echo "${PIPESTATUS[@]}"
pass
oops
0 143 0

In @mosvy's version, b() always aborts the pipeline because sed /die/q returns exit code 0 (success) both when it encounters "die" and when it reaches the end of the input stream, so the && always goes on to invoke kill "$BASHPID".
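The claim about sed's exit status can be checked without involving b() at all. The following is only a minimal sketch: the variable names with_die and without_die are made up for illustration, and it relies on the fact that, without pipefail, $? after a pipeline is the status of its last command (here sed).

```shell
# Pattern present: the q command stops sed early, but still with status 0.
printf '%s\n' pass die oops | sed /die/q >/dev/null
with_die=$?

# Pattern absent: sed reaches end of input, also with status 0.
printf '%s\n' pass oops | sed /die/q >/dev/null
without_die=$?

echo "with die: $with_die, without die: $without_die"
```

Because both statuses are 0, the && in b() cannot tell the two cases apart.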

In the following version, I correct @mosvy's suggestion so that b() aborts the pipeline only when it encounters "die" in the input stream:

Input stream contains "die"

b() {
  sed '/die/{q 2}' || kill "$BASHPID"
}

# Send "die" to b.
printf '%s\n' pass die oops | b | cat

echo "${PIPESTATUS[@]}"

Output:

pass
die
0 143 0

Input stream does not contain "die"

b() {
  sed '/die/{q 2}' || kill "$BASHPID"
}

# Do not send "die" to b.
printf '%s\n' pass oops | b | cat

echo "${PIPESTATUS[@]}"

Output:

pass
oops
0 0 0

Note that in this version of b(), if sed encounters "die", it runs the command q 2, which makes sed print the current line and terminate immediately with exit code 2 (failure). The || operator then invokes kill "$BASHPID", which sends SIGTERM to b()'s process in the pipeline and so aborts the pipeline. If sed reaches the end of the input without a match, it exits with code 0 and kill never runs. (This version requires GNU sed, which extends the q command so that it accepts an exit code.)
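Where GNU sed is not available, one possible stand-in for sed '/die/{q 2}' is awk, whose exit statement accepts a status in POSIX. This is an alternative sketch, not something proposed in the comments above; the function name stop_on_die and the rc_/out_ variables are hypothetical.

```shell
# Hypothetical awk stand-in for GNU sed's 'q 2': print each line, then
# stop with status 2 right after printing the first line matching "die".
stop_on_die() {
  awk '{ print } /die/ { exit 2 }'
}

# Capture output and exit status with and without the pattern.
rc_with=0
out_with=$(printf '%s\n' pass die oops | stop_on_die) || rc_with=$?

rc_without=0
out_without=$(printf '%s\n' pass oops | stop_on_die) || rc_without=$?

printf 'with die: status %s\n%s\n' "$rc_with" "$out_with"
printf 'without die: status %s\n%s\n' "$rc_without" "$out_without"
```

With "die" in the input, the status is 2 and the output stops after the "die" line; without it, the status is 0 and all lines pass through, so the same || test as in b() should apply.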

As @mosvy mentions, instead of committing "ritual suicide", b() may simply call exit, which ends its process in the pipeline with the given status:

b() {
  sed '/die/{q 2}' || exit 3
}

# Send "die" to b.
printf '%s\n' pass die oops | b | cat

echo "${PIPESTATUS[@]}"

Output:

pass
die
0 3 0
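In a script, the status of b() can drive what happens after the pipeline. The following is a sketch under assumptions: it needs bash (PIPESTATUS is not available in plain sh) and GNU sed, and the array name status and the message text are made up for illustration. PIPESTATUS is copied immediately because the next command overwrites it.

```shell
#!/usr/bin/env bash
# b exits with status 3 when it sees "die" (GNU sed is needed for 'q 2').
b() {
  sed '/die/{q 2}' || exit 3
}

printf '%s\n' pass die oops | b | cat >/dev/null

# Copy PIPESTATUS right away: any later command replaces its contents.
status=("${PIPESTATUS[@]}")

# status[1] belongs to b, the second command in the pipeline.
if [ "${status[1]}" -ne 0 ]; then
  echo "b aborted the pipeline with status ${status[1]}" >&2
fi
```

Checking status[1] rather than $? matters here: without pipefail, $? only reflects cat, the last command in the pipeline.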