17

I am trying to create a script which will start many background commands. For each background command I need to get the return code.

I have been trying the following script:

 #!/bin/bash
set -x
pid=()
return=()


for i in 1 2
do
 echo start $i
 ssh mysql "/root/test$i.sh" &
 pid[$i]=$!
done

for i in ${#pid[@]}
do
echo ${pid[$i]}
wait ${pid[$i]}
return[$i]=$?

if [ ${return[$i]} -ne 0 ]
then
  echo mail error
fi

done

echo ${return[1]}
echo ${return[2]}

My issue is with the wait loop: if the second PID finishes before the first one, I won't be able to get its return code.

I know that I can run wait pid1 pid2, but with this command I can't get the return code of all commands.

Any ideas?

Hugo

5 Answers

10

The issue is more with your

for i in ${#pid[@]}

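which expands to for i in 2, because ${#pid[@]} is the number of elements in the array, not a list of its indices. The loop therefore only ever waits for pid[2].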

It should rather be:

for i in 1 2

or

for ((i = 1; i <= ${#pid[@]}; i++))

wait "$pid" will return the exit code of the job with bash (and POSIX shells, but not zsh) even if the job had already terminated when wait was started.
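To illustrate that last point, here is a minimal sketch (the subshells and variable names are made up for the demo, they are not from the question). The fast job exits long before the script gets around to waiting for it, yet bash still hands back its exit status:

```shell
#!/bin/bash
# Two background jobs with different durations and exit codes.
( sleep 1; exit 3 ) &
pid_slow=$!
( exit 7 ) &
pid_fast=$!

# Wait for the slow job first; the fast one finished long ago.
wait "$pid_slow"
status_slow=$?

# bash still remembers the status of the already-terminated job.
wait "$pid_fast"
status_fast=$?

echo "slow=$status_slow fast=$status_fast"   # prints: slow=3 fast=7
```

So the order in which the jobs finish should not matter, as long as each PID is waited for exactly once.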

8

You can do this by using a temporary directory.

# Create a temporary directory to store the statuses
dir=$(mktemp -d)

# Execute the backgrounded code. Create a file that contains the exit status.
# The filename is the PID of this group's subshell.
for i in 1 2; do
    { ssh mysql "/root/test$i.sh" ; echo "$?" > "$dir/$BASHPID" ; } &
done

# Wait for all jobs to complete
wait

# Get return information for each pid
for file in "$dir"/*; do
    printf 'PID %d returned %d\n' "${file##*/}" "$(<"$file")"
done

# Remove the temporary directory
rm -r "$dir"
Chris Down
5

A generic implementation without temporary files.

#!/usr/bin/env bash

## associative array for job status
declare -A JOBS

## run command in the background
background() {
  eval "$1" & JOBS[$!]="$1"
}

## check exit status of each job
## preserve exit status in ${JOBS}
## returns the exit status of the last failed job checked (0 if none failed)
reap() {
  local cmd
  local status=0
  for pid in ${!JOBS[@]}; do
    cmd=${JOBS[${pid}]}
    wait ${pid} ; JOBS[${pid}]=$?
    if [[ ${JOBS[${pid}]} -ne 0 ]]; then
      status=${JOBS[${pid}]}
      echo -e "[${pid}] Exited with status: ${status}\n${cmd}"
    fi
  done
  return ${status}
}

background 'sleep 1 ; false'
background 'sleep 3 ; true'
background 'sleep 2 ; exit 5'
background 'sleep 5 ; true'

reap || echo "Ooops! Some jobs failed"
1

Stéphane's answer is good, but I would prefer

for i in ${!pid[@]}
do
    wait "${pid[i]}"
    return_status[i]=$?
    unset "pid[$i]"
done

which will iterate over the keys of the pid array, regardless of which entries still exist, so you can adapt it, break out of the loop, and re-start the whole loop and it'll just work. And you don't need consecutive values of i to begin with.

Of course, if you're dealing with thousands of processes then perhaps Stéphane's approach would be fractionally more efficient when you have a non-sparse list.
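A small sketch of the difference between the element count and the keys, using a deliberately sparse array (the values 111 and 555 are placeholders, not real PIDs):

```shell
#!/bin/bash
# Hypothetical sparse array standing in for the pid array.
pid=()
pid[1]=111
pid[5]=555

count=${#pid[@]}     # number of elements stored in the array
keys="${!pid[*]}"    # the indices actually in use, space-separated

echo "count=$count keys=$keys"   # prints: count=2 keys=1 5

# Looping over the keys visits every stored entry, even with gaps.
visited=""
for i in "${!pid[@]}"; do
    visited+="${pid[i]} "
done
echo "visited=$visited"
```

A counting loop from 1 to ${#pid[@]} would look at indices 1 and 2 here and miss index 5, which is why iterating over the keys is the more robust choice.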

1

Bash 4.3 added -n to the wait builtin, and -p was added in version 5.1.

From https://www.gnu.org/software/bash/manual/html_node/Job-Control-Builtins.html

wait -n

If the -n option is supplied, wait waits for a single job from the list of pids or jobspecs or, if no arguments are supplied, any job, to complete and returns its exit status. [...]

wait -p

If the -p option is supplied, the process or job identifier of the job for which the exit status is returned is assigned to the variable varname named by the option argument. [...]

The combination of the two options means Bash 5.1+ is actually quite decent at basic multiprocessing. The main drawback now is really just tracking/managing stdout/stderr.

_job1 () { sleep "$( shuf -i 1-3 -n 1 )"s ; true ; }
_job2 () { sleep "$( shuf -i 1-3 -n 1 )"s ; return 42 ; }

limit="2" i="0"

set -- _job1 _job2
while [ "$#" -gt "0" ] ;do

    until [ "$i" -eq "$limit" ] ;do
        printf 'starting %s\n' "$1"
        "$1" &
        pids[$!]="$1"
        i="$(( i + 1 ))"
        shift
    done

    if wait -n -p ended_pid ;then
        return_code="$?"
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    else
        return_code="$?"
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    fi
    unset 'pids[ended_pid]'
    i="$(( i - 1 ))"

done

while [ "${#pids[@]}" -gt "0" ] ;do
    if wait -n -p ended_pid ;then
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$?"
    else
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$?"
    fi
    unset 'pids[ended_pid]'
done
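Since wait -p only exists from Bash 5.1, a script that may run on older versions could guard the call. This is only a sketch: have_wait_p is a hypothetical helper name, and the fallback assumes you kept the PID from $! yourself.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the running Bash is 5.1 or newer.
have_wait_p() {
    local major=${BASH_VERSINFO[0]} minor=${BASH_VERSINFO[1]}
    (( major > 5 || (major == 5 && minor >= 1) ))
}

# A single background job that exits with status 4.
( sleep 1; exit 4 ) &
job=$!

if have_wait_p; then
    # Bash 5.1+: let wait report which job finished.
    wait -n -p ended_pid
    status=$?
else
    # Older Bash: wait for the known PID directly.
    wait "$job"
    status=$?
    ended_pid=$job
fi

echo "job $ended_pid exited with $status"
```

With several jobs in flight the fallback branch loses the "whichever finishes first" behaviour, so on older Bash you would wait for each saved PID in turn instead.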

More information (though not on wait -p): http://mywiki.wooledge.org/ProcessManagement