9

I have written a script that connects to remote hosts over ssh, runs commands, saves the output to files, and examines the output. But it always exits silently at the line (( success++ )) while processing the first item in the workers array. If I replace (( success++ )) with echo "process $worker", it works fine and prints all hosts. I cannot figure out what's wrong.

#!/bin/bash

set -x
set -e
workers=('host-1' 'host-2' 'host-3')

output_dir=$(mktemp -d)

for worker in ${workers[@]}; do
  ssh $worker '
    echo abc
    echo OK
  ' > "$output_dir/$worker" &
done

echo "waiting..."
sleep 3
wait

success=0
regexp='OK$'
for worker in ${workers[@]}; do
  output=`cat "$output_dir/$worker"`
  if [[ "$output" =~ $regexp ]]; then
    (( success++ ))
  fi
done

echo "Total ${#workers[@]}; success: $success; failure: $((${#workers[@]} - success))"
gzc
  • Rather than reading the whole file into a variable, why not use if grep -q "$regexp" "$output_dir/$worker"; then? Or even grep -c "$regexp" "$output_dir"/* to get a count of the number of OKs. Also consider success=$(( success + 1 )). – Kusalananda Jun 21 '17 at 07:36
  • @Kusalananda That's good advice. – gzc Jun 21 '17 at 09:41
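The grep-based check suggested in the comment above could look roughly like the sketch below. It is only an illustration: the files are filled with made-up output standing in for the ssh results, and host-2 is assumed to have failed. Note that grep matches line by line, so 'OK$' matches any line ending in OK, whereas the original [[ ... =~ ]] test matched against the whole output.

```shell
#!/bin/bash
set -e

workers=('host-1' 'host-2' 'host-3')
output_dir=$(mktemp -d)

# Fake outputs standing in for the ssh results (assumption: host-2 failed).
printf 'abc\nOK\n' > "$output_dir/host-1"
printf 'abc\n'     > "$output_dir/host-2"
printf 'abc\nOK\n' > "$output_dir/host-3"

success=0
regexp='OK$'
for worker in "${workers[@]}"; do
  # grep -q runs as an if condition, so a non-match does not trigger set -e.
  if grep -q "$regexp" "$output_dir/$worker"; then
    # An assignment always returns 0, whatever value is assigned.
    success=$(( success + 1 ))
  fi
done

echo "Total ${#workers[@]}; success: $success; failure: $(( ${#workers[@]} - success ))"
rm -r "$output_dir"
```

With the fake data above this prints "Total 3; success: 2; failure: 1".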

2 Answers

14

A simple example should explain why:

$ ((success++))
$ echo $?
1

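The reason is that (( expression )) returns an exit status of 1 whenever the expression evaluates to zero, and 0 otherwise. success++ is a post-increment, so the expression evaluates to the old value of success (0) even though the variable ends up as 1. With set -e in effect, that exit status of 1 terminates the script. I don't know what to say - Bash has gotchas enough for the whole world.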
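A small sketch to make the rule concrete; the variable names are only for illustration, and set -e is deliberately left off so that every status can be captured:

```shell
#!/bin/bash
# Capture the exit status of (( ... )) for a few starting values.

n=0
(( n++ )); post_from_zero=$?   # expression value is the old n (0)  -> status 1

n=0
(( ++n )); pre_from_zero=$?    # expression value is the new n (1)  -> status 0

n=5
(( n++ )); post_from_five=$?   # expression value is the old n (5)  -> status 0

n=-1
(( ++n )); pre_from_minus1=$?  # expression value is the new n (0)  -> status 1

echo "n++ from 0:  $post_from_zero"
echo "++n from 0:  $pre_from_zero"
echo "n++ from 5:  $post_from_five"
echo "++n from -1: $pre_from_minus1"
```

The statuses printed are 1, 0, 0 and 1, in that order: what matters is the value of the expression, not whether the increment itself succeeded.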

l0b0
  • Thanks! BTW, This behavior is very confusing, even ... evil. It consumes people's lives. – gzc Jun 21 '17 at 09:53
  • 1
    That rule was created with the expr utility (linked Unix V7 manual): … exit codes: 0 if the expression is neither null nor 0, 1 if the expression is null or 0… So, perhaps, Unix is the one to blame. –  Jun 21 '17 at 10:24
  • 2
    if (( .. )) ; ... wouldn't work if (( .. )) didn't return sensible return values. Of course one might say that it should only fail if there is an explicit comparison (like (( i++ < n ))), but the implicit comparison against zero makes stuff like while (( i-- )) work in the same way as in C and other programming languages. – ilkkachu Jun 21 '17 at 10:55
  • @ilkkachu I understand the rationale, but this behaviour (in any language) is not sane. First, there are already ways to explicitly compare numbers, just use those. Second, the C language syntax is well known for lots of idiosyncrasies, and isn't exactly considered the gold standard. Third, the choice of zero is entirely arbitrary - why not every non-positive number instead? – l0b0 Jun 21 '17 at 11:47
  • @l0b0, telling boolean results apart from plain numbers would require typing, which the predecessors of C didn't really have, if I've understood my history lessons. (Neither has the shell's arithmetic.) From there it's just hysterical raisins. Though apparently, (( ... )) wasn't subject to set -e before Bash 4.1, and that's really what causes the problem here, not the return values of (( ... )) per se. – ilkkachu Jun 21 '17 at 12:05
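To illustrate the point made in the comments about while (( i-- )), here is a minimal sketch (variable names are illustrative). The same zero-means-false rule is what ends the loop, and because the (( ... )) is a loop condition, set -e does not abort the script when it finally returns 1:

```shell
#!/bin/bash
set -e

i=3
seen=""
# Each test uses the old value of i, then decrements it.
# The loop stops when the old value is 0; that "failure" is exempt from set -e.
while (( i-- )); do
  seen+="$i "
done

echo "visited: $seen"
echo "i is now $i"
```

This visits 2, 1 and 0, and leaves i at -1, since the last test still performs its decrement.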
9

It is the consequence of having -e set. Any command that returns a non-zero exit code (here, 1) will trigger an exit, unless it is part of a condition such as if, while, && or ||.

This script works fine:

#!/bin/bash
(( success++))
echo "Still going 1 $success"

This doesn't:

#!/bin/bash
set -e
(( success++))
echo "Still going 1 $success"

Solutions

The simplest is to remove the set -e line.

If that is not an option, use this:

(( ++success ))

Other alternatives:

#!/bin/bash

set -e
success=0
success=$(( success+1 ))
echo "still going 1 $success"

success=0
(( success=success+1 ))
echo "still going 2 $success"

success=0
(( success+=1 ))
echo "still going 3 $success"

success=0
(( ++success ))
echo "still going 4 $success"

success=0
(( success++ ))
echo "still going 5 $success"

Only option number 5 has an exit code of 1, so with set -e the script exits there and "still going 5" is never printed.

Other, more complex solutions that work for any value of the variable a.
The first one uses the colon (:) builtin, which makes it POSIX compatible.

: $(( a+=1 ))        ; echo "6 $a $?"   ## Valid Posix
   (( a++ )) || true ; echo "7 $a $?"
   (( a++ )) || :    ; echo "8 $a $?"
   (( a++ , 1 ))     ; echo "9 $a $?"
   (( a++ | 1 ))     ; echo "10 $a $?"
  • 2
    Those work when you're counting up from zero, but in the general case you'd need to squash the error exit with something like (( something... )) || true – ilkkachu Jun 21 '17 at 10:58
  • A couple of solutions more added @ilkkachu . Even one valid in posix. –  Jun 21 '17 at 20:01
  • 1
    @Arrow, good point on the : $(( .. )) alternative. It doesn't actually even need the || to catch the error, since there the command that runs is : and it always succeeds. – ilkkachu Jun 21 '17 at 23:15
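As a sketch of the general case discussed in these comments: (( ++a )) also fails when the counter starts at -1, because the new value is 0. The forms below keep going under set -e regardless of the starting value (variable names are illustrative):

```shell
#!/bin/bash
set -e

a=-1
(( ++a )) || true   # ++a evaluates to 0 here, so || true is needed

b=-1
: $(( b += 1 ))     # the command that runs is ':', which always succeeds

c=0
(( c++ )) || true   # the original failing case, with the status squashed

echo "a=$a b=$b c=$c"
```

This prints "a=0 b=0 c=1". The : $(( ... )) form needs no || because the arithmetic expansion only supplies an argument to the colon builtin.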