
Consider the following:

command1 | command2 | command3

As I understand pipelines, every command is run regardless of any errors that may occur. When a command writes to stderr, that output is not piped to the next command, but the next one still runs (unless you use |&). I want any error that occurs to terminate the rest of the pipeline. I thought set -o pipefail would accomplish this, but it simply stops anything that comes after the pipeline from running if anything in the pipeline failed, i.e.:

(set -o pipefail; cmd1 | cmd2 && echo "I won't run if any of the previous commands fail")
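
For example, with false standing in for a failing command (just a sketch, not my real commands), the second command still runs and only the part after && is suppressed:

(set -o pipefail; false | echo "this still runs" && echo "this is skipped")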

So, what is the most succinct way to terminate the rest of the pipeline if any of its commands fail? I also need it to exit with the stderr of the command that failed. I'm doing this from a command-line context, not a shell script, hence the need for brevity. Thoughts?

  • The many general questions include https://unix.stackexchange.com/q/268344/5132 , https://unix.stackexchange.com/q/513657/5132 , https://unix.stackexchange.com/q/433345/5132 , and others. – JdeBP May 09 '19 at 12:03
  • 3
    Commands in a pipeline are executed in parallel. When a command in a pipeline terminates (success or fail) it will trigger other commands to terminate by closing it's own side of the pipes. Do you have a reason to terminate these commands forcefully early? – Philip Couling May 09 '19 at 12:27
  • My second command writes files. If it doesn't receive stdout from the prior command in the pipeline then it will write empty files (cannot change that behaviour, it's from an external lib). If the first command fails, I'm only interested in knowing its stderr, and that the other commands in the pipeline do not run. – Audun Olsen May 09 '19 at 12:36
  • Based on your last comment and considering that, as Philip says, all the commands in a pipeline are run in parallel: is it correct to say that you are looking for a way to prevent some commands in a pipeline from being started? (Note the difference between not starting and terminating). E.g., as a proof of concept, are you looking for some way to run cat nonexistentfile | cat - >outfile without outfile being created/truncated, while still getting the error from cat nonexistentfile? – fra-san May 09 '19 at 13:19
  • I see. In that case, I think that the commands may start, but if an error occurs then that is the only output which I care about. Your example is spot on. – Audun Olsen May 09 '19 at 13:29

2 Answers


I believe that it's not possible

I believe that what you're asking for is not directly possible because of the way pipelines are executed. The shell does not know the success or failure (return value) of a command when it starts the "later" commands in the pipeline. It literally runs all of them at the same time and collects the results afterwards.
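
A quick way to see this in bash: the failing command's exit status only becomes visible, via PIPESTATUS, after the whole pipeline has finished.

false | echo "this still runs"
echo "${PIPESTATUS[@]}"    # prints "1 0": false failed, echo succeeded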

There are a couple of workarounds which might help.

Workaround 1

Execute the commands one at a time and cache the results. This is better because later commands absolutely will not run if an earlier command failed.

A very short script example:

cache_file=$(tempfile)
if command1 > "$cache_file" ; then
    command2 < "$cache_file"
fi
rm "$cache_file"
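
Since the question asks for something short enough to type at the command line, the same idea can be collapsed with && for a three-stage pipeline. This is only a sketch, reusing tempfile from the snippet above; command1, command2 and command3 are the placeholders from the question.

f=$(tempfile); g=$(tempfile)
command1 > "$f" && command2 < "$f" > "$g" && command3 < "$g"    # each stage runs only if the previous one succeeded
rm -f "$f" "$g"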

Workaround 2

Execute everything but check the return results. This will still run all commands no matter what, but it lets you go back afterwards and find the cause.

Here each command's STDERR is redirected to a different file with 2>. Then PIPESTATUS is checked to find the return code of each command.

command1 2> command1_err.log | command2 2> command2_err.log
for result in "${PIPESTATUS[@]}" ; do
    if [ "$result" -ne 0 ] ; then
        echo command failed
    fi
done
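
If you also want to know which command failed and to see its stderr, one possible extension (reusing the log file names from the example above) is to copy PIPESTATUS right away, because any later command overwrites it:

command1 2> command1_err.log | command2 2> command2_err.log
status=("${PIPESTATUS[@]}")    # copy immediately; the next command would overwrite PIPESTATUS

i=0
for result in "${status[@]}" ; do
    i=$((i + 1))
    if [ "$result" -ne 0 ] ; then
        echo "command $i failed with status $result; its stderr was:" >&2
        cat "command${i}_err.log" >&2
    fi
done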

A brief overview of running pipelines in a shell

To create the pipeline, the shell follows steps similar to these:

  1. The shell creates each pipe (|) using pipe(). Each pipe comes with a read handle and a write handle. For each redirection (<, >) it opens the respective file, obtaining a handle to it using open().
  2. The shell calls fork() once for each command in the pipe to start a new process.
  3. Each child process swaps its STDIN, STDOUT and STDERR handles for those created in step 1.
  4. Assuming the command is an external binary, each child process then calls exec() to load and run the binary.
  5. The parent process then waits for the child to complete using wait() which also provides the command's return value (success or failure).
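
If you want to watch these steps happen, something along these lines works on Linux with strace installed (the exact system call names, e.g. clone rather than fork and pipe2 rather than pipe, depend on the platform and the shell; /etc/hostname is just a convenient small input file):

strace -f -e trace=pipe,pipe2,clone,dup2,execve,wait4 \
    bash -c 'cat /etc/hostname | tr a-z A-Z'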

One hacky way to stop the pipeline is to kill the shell midway. Now that would mean killing your interactive shell (if you are running it from a terminal emulator), so you would have to launch another shell to continue your work. But this at least stops the commands later in the pipeline.

command1 2> command1_error.log | awk -v status=$? -v pid=$$ '{if (status != 0) { system("kill -9 " pid) } else { print } }' | command2 ..

Note: this would log you out if you are running it from a terminal/tty.

Or, if you only need to prevent the last command from running (e.g. the command that creates a file is the last command in the pipeline), you can just use xargs -r:

command1 | command2 | xargs -r command3

The -r flag prevents xargs from running command3 if the previous command produced no output.
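
As a quick sanity check of -r (--no-run-if-empty in GNU xargs):

seq 3 | xargs -r echo got:                     # echo runs and prints "got: 1 2 3"
grep nomatch /dev/null | xargs -r echo got:    # no output at all: echo is never started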