82

So I like to harden my bash scripts wherever I can (and when not able to delegate to a language like Python/Ruby) to ensure errors do not go uncaught.

In that vein I have a strict.sh, which contains things like:

set -e
set -u
set -o pipefail

I source it in other scripts. However, while pipefail would pick up:

false | echo it kept going | true

It will not pick up:

echo "The output is '`false; echo something else`'"

The output would be

The output is ''

`false` returns a non-zero status and writes nothing to stdout. In a pipe it would have failed, but here the error isn't caught. When this is actually a calculation stored in a variable for later, and the value is set to blank, it may cause problems further down the script.

So: is there a way to get bash to treat a non-zero return code inside a backtick substitution as reason enough to exit?

Danny Staple

7 Answers

62

The exact language used in the Single UNIX specification to describe the meaning of set -e is:

When this option is on, if a simple command fails for any of the reasons listed in Consequences of Shell Errors or returns an exit status value >0, and is not [a conditional or negated command], then the shell shall immediately exit.

There is an ambiguity as to what happens when such a command occurs in a subshell. From a practical point of view, all the subshell can do is exit and return a nonzero status to the parent shell. Whether the parent shell will in turn exit depends on whether this nonzero status translates into a simple command failing in the parent shell.

One such problematic case is the one you encountered: a nonzero return status from a command substitution. Since this status is ignored, it does not cause the parent shell to exit. As you've already discovered, a way to take the exit status into account is to use the command substitution in a simple assignment: then the exit status of the assignment is the exit status of the last command substitution in the assignment(s).

Note that this will perform as intended only if there is a single command substitution, as only the last substitution's status is taken into account. For example, the following command is successful (both according to the standard and in every implementation I've seen):

a=$(false)$(echo foo)
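A quick way to see the "last substitution wins" rule is to swap the order of the two substitutions. The following is a minimal sketch; the variable names are illustrative and not part of the original answer.

```shell
# Only the last command substitution in an assignment decides its status.
first_failed_status=0
a=$(false)$(echo foo) || first_failed_status=$?

last_failed_status=0
b=$(echo foo)$(false) || last_failed_status=$?

echo "a='$a' status=$first_failed_status"   # a='foo' status=0
echo "b='$b' status=$last_failed_status"    # b='foo' status=1
```

In both cases the variable still receives the text `foo`; only the reported status differs.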

Another case to watch for is explicit subshells: (somecommand). According to the interpretation above, the subshell may return a nonzero status, but since this is not a simple command in the parent shell, the parent shell should continue. In fact, all the shells I know of do make the parent exit at this point. While this is useful in many cases such as (cd /some/dir && somecommand) where the parentheses are used to keep an operation such as a current directory change local, it violates the specification if set -e is turned off in the subshell, or if the subshell returns a nonzero status in a way that would not terminate it, such as using ! on a true command. For example, all of ash, bash, pdksh, ksh93 and zsh exit without displaying This should be displayed in the following examples:

set -e; (set +e; false); echo "This should be displayed"
set -e; (! true); echo "This should be displayed"

Yet no simple command has failed while set -e was in effect!

A third problematic case is elements in a nontrivial pipeline. In practice, all shells ignore failures of the elements of the pipeline other than the last one, and exhibit one of two behaviors regarding the last pipeline element:

  • AT&T ksh and zsh, which execute the last element of the pipeline in the parent shell, do business as usual: if a simple command fails in the last element of the pipeline, the shell executing that command, which happens to be the parent shell, exits.
  • Other shells approximate the behavior by exiting if the last element of the pipeline returns a nonzero status.

Like before, turning off set -e or using a negation in the last element of the pipeline causes it to return a nonzero status in a way that should not terminate the shell; shells other than AT&T ksh and zsh will then exit.

Bash's pipefail option makes a pipeline return a nonzero status if any of its elements returns a nonzero status, so under set -e the shell exits once such a pipeline has finished.
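The effect of pipefail on a pipeline's status can be checked directly. This is a small sketch that assumes `bash` is on the PATH; each snippet runs in its own `bash -c` so the options do not leak into the calling shell, and the variable names are made up for the illustration.

```shell
# Status of a pipeline whose first element fails, without and with pipefail.
plain=$(bash -c 'false | true; echo $?')
with_pipefail=$(bash -c 'set -o pipefail; false | true; echo $?')
echo "without pipefail: $plain"           # 0
echo "with pipefail:    $with_pipefail"   # 1

# Under set -e, the failing pipeline ends the script before the final echo.
aborted=$(bash -c 'set -e -o pipefail; false | true; echo "still running"' || true)
echo "set -e + pipefail printed: '$aborted'"   # ''
```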

Note that as a further complication, bash turns off set -e within command substitutions but not regular subshells (i.e. inside of `` or $(), but not ()), unless bash is in POSIX mode (i.e. set -o posix, POSIXLY_CORRECT is set in the environment when bash starts, or bash is invoked as sh), in which case the current setting of e is inherited from the parent shell at the time of invocation (regardless of whether it's a command substitution or a subshell).
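The difference between the default mode and POSIX mode can be observed by running the same one-liner both ways. This is a sketch assuming a `bash` binary on the PATH; the `script` variable and the word `reached` are illustrative only.

```shell
# The same script, run by bash in default mode and in POSIX mode.
script='set -e; x=$(false; echo reached); echo "x=$x"'

default_mode=$(bash -c "$script" || true)
posix_mode=$(bash --posix -c "$script" || true)

echo "default: '$default_mode'"   # 'x=reached'  (-e was cleared inside $())
echo "posix:   '$posix_mode'"     # ''           (-e inherited, script exits)
```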

All of this shows that the POSIX specification unfortunately does a poor job at specifying the -e option. Fortunately, existing shells are mostly consistent in their behavior.

tmillr
  • 5
  • Thanks for this. An experience I had was that some errors caught in a later version of bash, were ignored in earlier versions with set -e. My intent here is to harden scripts to the extent that any non-handled error return/failure condition will cause a script to exit. This is in a legacy system which has been known to produce garbage output files and a "0" happy exit code after a half an hour of chaos with the wrong env - unless you were watching output like a hawk (and not all of the errors are on stderr, some are on stdout, and some parts are /dev/null'd), you just don't know. – Danny Staple Oct 25 '11 at 23:11
  • "For example, all of ash, bash, pdksh, ksh93 and zsh exit without displaying foo on the following examples": BusyBox ash doesn't behave that way, it displays the output as per spec. (Tested with BusyBox 1.19.4.) Nor does it exit with set -e; (cd /nonexisting). – dubiousjim Jan 13 '13 at 21:39
37

(Answering my own question because I've found a solution.) One solution is to always assign the substitution's output to an intermediate variable. This way the return code ($?) is set.

So

ABC=`exit 1`
echo $?

Will output 1 (or instead exit if set -e is present), however:

echo `exit 1`
echo $?

Will output 0 after a blank line. The return code of echo (or of whatever other command consumed the backtick output) replaces the return code of 1 from the substitution.
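Under set -e the same contrast decides whether the script stops. A minimal sketch, assuming `bash` is on the PATH (the variable names are hypothetical):

```shell
# Assignment form: the failed substitution becomes the assignment's status,
# so set -e stops the inner script before its echo runs.
status=0
output=$(bash -c 'set -e; value=$(exit 1); echo "value is: $value"') || status=$?
echo "assignment form: output='$output' status=$status"    # output='' status=1

# Inline form: echo's own status (0) hides the failure and the script goes on.
status_inline=0
output_inline=$(bash -c 'set -e; echo "value is: $(exit 1)"') || status_inline=$?
echo "inline form: output='$output_inline' status=$status_inline"   # output='value is: ' status=0
```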

I am still open to solutions that do not require the intermediate variable, but this gets me some of the way.

vadipp
Danny Staple
32

As the OP pointed out in his own answer, assigning the output of the subcommand to a variable does solve the problem; the $? is left unscathed.

However, one edge case can still puzzle you with false negatives (i.e. the command fails but the error doesn't bubble up): local variable declaration.

local myvar=$(subcommand) will always return 0!

bash(1) points this out:

   local [option] [name[=value] ...]
          ... The return status is 0 unless local is used outside a function,
          an invalid name is supplied, or name is a readonly variable.

Here's a simple test case:

#!/bin/bash

function test1() {
  data1=$(false) # undeclared variable
  echo 'data1=$(false):' "$?"
  local data2=$(false) # declaring and assigning in one go
  echo 'local data2=$(false):' "$?"
  local data3
  data3=$(false) # assigning a declared variable
  echo 'local data3; data3=$(false):' "$?"
}

test1

The output:

data1=$(false): 1
local data2=$(false): 0
local data3; data3=$(false): 1
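The same masking matters under set -e, where it decides whether the function keeps running. A short sketch assuming `bash` on the PATH; the function name `f` and the marker text `continued` are illustrative:

```shell
# Declaring and assigning in one go: local's own status (0) hides the failure.
masked=$(bash -c 'set -e; f() { local v=$(false); echo "continued"; }; f' || true)
echo "local v=\$(false):    '$masked'"    # 'continued'

# Declaring first, assigning second: the failure reaches set -e.
split_status=0
split=$(bash -c 'set -e; f() { local v; v=$(false); echo "continued"; }; f') || split_status=$?
echo "local v; v=\$(false): '$split' status=$split_status"   # '' status=1
```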
Rui F Ribeiro
mogsie
23

Solution

If you are running Bash 4.4 or later, you can use the shopt option inherit_errexit to do just that. You can check compatibility from within Bash using echo $BASH_VERSION.

Here is the shebang you would use if Bash 4.4 or later were installed and came before /bin in your $PATH:

#!/usr/bin/env -S bash -euET -o pipefail -O inherit_errexit

The -S is there to coax Linux’s env into accepting more than one argument for bash, as kindly pointed out by @UVV and explained further on StackOverflow.
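Where changing the shebang is not practical, the option can also be switched on from inside the script with `shopt -s inherit_errexit`. The sketch below compares a script with and without it. It assumes Bash 4.4 or later is the `bash` on the PATH; the `base` and `strict` variables are made up for the comparison.

```shell
# Same body, once with the usual strict options only, once with inherit_errexit.
base='set -euo pipefail; x=$(false; echo reached); echo "x=$x"'
strict='set -euo pipefail; shopt -s inherit_errexit; x=$(false; echo reached); echo "x=$x"'

base_out=$(bash -c "$base" || true)
strict_status=0
strict_out=$(bash -c "$strict" 2>/dev/null) || strict_status=$?

echo "without inherit_errexit: '$base_out'"                       # 'x=reached'
echo "with inherit_errexit:    '$strict_out' status=$strict_status"   # '' status=1
```

On a Bash older than 4.4 the `shopt` call itself fails, so the strict variant also stops, but for a different reason.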


Background

inherit_errexit is an option to shopt, while the rest of the arguments are options to set. In most modern iterations, they can be passed directly to bash when invoking the shell.

Let’s review the options you have already been using:

  • -u/-o nounset, as the name ambiguously hints, disallows dereferencing of variables that have not been set; e.g., $IJUSTMADETHISUP. This mostly helps guard against typos, or oversights when copy-pasting from Stack Overflow.
  • -e/-o errexit does some of what you are requesting: if called directly†, shell commands with nonzero status codes will cause the shell to exit with an error. That is, everything preceding a line of shell script must “go right” in order for that line to execute.
  • -o pipefail is needed to extend this guarantee to commands whose output is redirected with an I/O pipe |.

† i.e., not from within a command substitution $(...), where Bash switches errexit off by default

Now for the options I’ve added:

  • -O inherit_errexit further extends this functionality (exiting on nonzero status code) to commands called from within command substitutions $(...), `...`, where Bash otherwise switches errexit off. This closes an important loophole, since command substitution is found in many shell scripts in the wild.‡ The Bash manual documents the option for command substitution only, so do not rely on it for process substitution <(...), >(...).
  • The -E/-o errtrace and -T/-o functrace options are there for the comparatively rare case that you use trap to perform an action on ERR, DEBUG, or RETURN. These are shell pseudo-signals rather than real signals, and the two options make traps on ERR and on DEBUG/RETURN, respectively, inherited by shell functions, command substitutions, and subshells.

‡ This is just one reason to prefer the $(...) syntax for command substitution, since the parentheses make it explicit that you are entering a subshell. It also happens to be nestable.
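The effect of -E on an ERR trap can be seen with a function whose first command fails but whose overall status is 0. This is a sketch assuming `bash` on the PATH; the function `f` and the message text are illustrative.

```shell
# f fails internally but returns 0, so only an inherited ERR trap can notice.
body='trap "echo ERR-trap-fired" ERR; f() { false; true; }; f'

without_E=$(bash -c "$body")
with_E=$(bash -c "set -E; $body")

echo "without -E: '$without_E'"   # ''
echo "with -E:    '$with_E'"      # 'ERR-trap-fired'
```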



Ron Wolf
  • 1
    Thank you, this is great. The title of the article you link to, "Fail fast bash scripting" perfectly articulates what this is about. Great answer. – Danny Staple Jul 25 '20 at 08:30
  • 2
    Worth mentioning, inherit_errexit only worked for me in cases like VAR=foo-$(some_invalid_command). In cases like tar cvzf foo-$(some_invalid_command).tar.gz bar/, I still get a $? of zero. Haven't yet figured out how to workaround that... – Per Lundberg Oct 20 '20 at 09:28
  • A workaround that seems to do the trick: set a local variable and use && together with that. Like this: foo=$(some_invalid_command) && tar cvzf foo-$foo.tar.gz bar/. This way, the whole command will return non-zero (i.e. failure) in case the subshell fails, with -O inherit_errexit. – Per Lundberg Oct 20 '20 at 18:36
  • @PerLundberg It's expected that $? would be 0, since the command that determines the status code is the last one running, as I recall. If rather than “failing fast” you're looking for Bash to continue running, while making the nonzero status code available, then your solution works. – Ron Wolf Nov 05 '20 at 00:59
  • @RonWolf It turned out I was wrong in my second post - this option (using && to propagate the failure from the first invocation) works regardless of whether -O inherit_errexit or not. But as for fail fast - my approach in that comment does indeed short-circuit (it doesn't run the second command if the first one fails). I'm not sure I get what you mean there? – Per Lundberg Nov 05 '20 at 19:56
  • 1
    @PerLundberg I believe I was responding to your first post and missed the second one for some reason. I agree with what you said there, that assigning to a variable first is a good idea! – Ron Wolf Nov 21 '20 at 00:28
  • 1
    Just want to mention that you have to add -S option to /usr/bin/env otherwise you get an error /usr/bin/env: use -[v]S to pass options in shebang lines – UVV Dec 20 '20 at 09:31
  • @UVV Thanks for the feedback! I’ve edited the answer accordingly. – Ron Wolf Dec 30 '20 at 02:34
  • @PerLundberg Looking back on this, I think a solution that might work for you is piping to the read built-in command, which lets you declare and name a variable to store its stdin. That way, you can store the output of another command without confining its exit status to a subshell. – Ron Wolf Feb 23 '21 at 03:23
  • Would you mind splitting out set -o errtrace from set -o functrace (so that I know exactly what each one does)? I'm trying to figure out if I should use set -o functrace but I'm having a hard time doing so. – felipecrs Jul 28 '22 at 23:44
  • @felipecrs That sounds like a different topic. I recommend checking your shell's documentation, and if that doesn't settle things, posting a new question. – Ron Wolf Jan 21 '23 at 22:07
15

As others have said, local will always return 0. The solution is to declare the variable first:

function testcase()
{
    local MYRESULT

    MYRESULT=$(false)
    if (( $? != 0 )); then
        echo "False returned false!"
        return 1
    fi

    return 0
}

Output:

$ testcase
False returned false!
$ 
Will
5

To exit on a command substitution failure you may explicitly set -e in a subshell, like this:

set -e
x=$(set -e; false; true)
echo "this will never be shown"
errr
1

Interesting point!

I've never stumbled across that, because I'm no friend of set -e (I prefer trap ... ERR instead), but I have now tested it: trap ... ERR also doesn't catch errors within $(...) (or the old-fashioned backticks).

I think the problem is (as so often) that a subshell is called here, and -e explicitly means the current shell.

The only other solution that comes to mind at the moment would be to use read:

 ls -l ghost_under_bed | read name

This throws ERR, and with -e the shell will be terminated. The only problem: this works only for commands with one line of output (unless you pipe through something that joins lines).
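One caveat with piping into read: bash normally runs every pipeline element in a subshell, so the variable set by read is gone once the pipeline ends. The sketch below swaps in a different technique from the one in this answer, namely bash's `lastpipe` option (Bash 4.2+, effective only when job control is off, as in scripts) combined with pipefail. The producer command and the variable names are hypothetical.

```shell
script='
shopt -s lastpipe    # run the last pipeline element in the current shell
set -o pipefail      # let a failing producer decide the pipeline status
status=0
{ echo "one line"; exit 3; } | read -r name || status=$?
echo "name=$name status=$status"
'
out=$(bash -c "$script")
echo "$out"    # name=one line status=3
```

Here the value survives in `name` and the producer's failure is still visible in the pipeline status, so set -e or an explicit check can act on it.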

ktf
  • 1
    Not sure about the read trick; attempting it led to the variable not being bound. I think this may be because the other side of the pipe is effectively a subshell, so name won't be available. – Danny Staple Oct 21 '11 at 15:09