Bash script wait for processes and get return code

Question

I am trying to write a script that starts many background commands. For each background command I need to get the return code.

I have been trying the following script:

#!/bin/bash
set -x
pid=()
return=()


for i in 1 2
do
 echo start $i
 ssh mysql "/root/test$i.sh" &
 pid[$i]=$!
done

for i in ${#pid[@]}
do
echo ${pid[$i]}
wait ${pid[$i]}
return[$i]=$?

if [ ${return[$i]} -ne 0 ]
then
  echo mail error
fi

done

echo ${return[1]}
echo ${return[2]}

My issue is in the wait loop: if the second PID finishes before the first one, I won't be able to get its return code.

I know that I can run wait pid1 pid2, but with this command I can't get the return code of every command.

Any ideas?

score 10 · Answer 1 · answered Feb 21 '13 at 12:50

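The issue is rather with your

for i in ${#pid[@]}

${#pid[@]} expands to the number of elements in the array, so this is just for i in 2 and the loop only ever waits for the second job.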

It should rather be:

for i in 1 2

or

for ((i = 1; i <= ${#pid[@]}; i++))

wait "$pid" returns the exit code of the job in bash (and in POSIX shells, but not zsh) even if the job had already terminated by the time wait was called.
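As a rough illustration of both points, here is a minimal sketch of the corrected loop. The subshells with fixed exit codes are stand-ins for the ssh commands in the question, and the array name status is chosen here only for the example. Job 2 deliberately finishes before job 1, yet wait still reports its exit code.

```bash
#!/usr/bin/env bash
# Sketch: start jobs, remember each PID, then collect each exit status by PID.
# The subshells below are stand-ins (assumptions) for the ssh commands.
pid=()
status=()

# Job 1 finishes last (exit 3); job 2 finishes first (exit 7).
( sleep 1; exit 3 ) & pid[1]=$!
( exit 7 ) &          pid[2]=$!

# Iterate over the array indices (1 and 2), not over the element count.
for i in "${!pid[@]}"; do
    # wait still reports job 2's status even though it ended before job 1.
    wait "${pid[i]}"
    status[i]=$?
done

printf 'job %d exited with status %d\n' 1 "${status[1]}" 2 "${status[2]}"
```

Run under bash, this should print status 3 for job 1 and status 7 for job 2.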

score 8 · Accepted Answer · answered Feb 21 '13 at 12:45

You can do this by using a temporary directory.

# Create a temporary directory to store the statuses
dir=$(mktemp -d)

# Execute the backgrounded code. Create a file that contains the exit status.
# The filename is the PID of this group's subshell.
for i in 1 2; do
    { ssh mysql "/root/test$i.sh" ; echo "$?" > "$dir/$BASHPID" ; } &
done

# Wait for all jobs to complete
wait

# Get return information for each pid
for file in "$dir"/*; do
    printf 'PID %d returned %d\n' "${file##*/}" "$(<"$file")"
done

# Remove the temporary directory
rm -r "$dir"
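A possible variant of the same idea, offered only as a sketch: name each status file after the job index instead of the PID, so the result can be matched back to the command that produced it, and remove the directory from an EXIT trap so it is cleaned up even if the script stops early. The exit codes 11 and 12 and the failed flag are made up for the example; the inner subshell stands in for the ssh call.

```bash
#!/usr/bin/env bash
# Sketch: status files named after the job index rather than the PID.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT   # clean up even if the script exits early

for i in 1 2; do
    # "( exit ... )" is a stand-in for: ssh mysql "/root/test$i.sh"
    { ( exit $((i + 10)) ); echo "$?" > "$dir/$i"; } &
done

wait   # block until every background job has written its status file

failed=0
for i in 1 2; do
    rc=$(<"$dir/$i")
    echo "job $i returned $rc"
    [ "$rc" -eq 0 ] || failed=1
done
```

With the stand-in jobs above, both status files hold a non-zero code, so failed ends up as 1.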

score 5 · Answer 3 · answered Sep 19 '15 at 18:05

A generic implementation without temporary files.

#!/usr/bin/env bash

## associative array for job status
declare -A JOBS

## run command in the background
background() {
  eval "$1" & JOBS[$!]="$1"
}

## check exit status of each job
## preserve exit status in ${JOBS}
## returns non-zero (the status of the last failed job checked) if any job failed
reap() {
  local cmd pid
  local status=0
  for pid in "${!JOBS[@]}"; do
    cmd=${JOBS[${pid}]}
    wait "${pid}" ; JOBS[${pid}]=$?
    if [[ ${JOBS[${pid}]} -ne 0 ]]; then
      status=${JOBS[${pid}]}
      echo -e "[${pid}] Exited with status: ${status}\n${cmd}"
    fi
  done
  return "${status}"
}

background 'sleep 1 ; false'
background 'sleep 3 ; true'
background 'sleep 2 ; exit 5'
background 'sleep 5 ; true'

reap || echo "Ooops! Some jobs failed"

Thank you :-) This is exactly what I was looking for! – Qorbani Dec 07 '18 at 02:40

score 1 · Answer 4 · edited Jan 02 '23 at 20:43

Stéphane's answer is good, but I would prefer

for i in ${!pid[@]}
do
    wait "${pid[i]}"
    return_status[i]=$?
    unset "pid[$i]"
done

which will iterate over the keys of the pid array, regardless of which entries still exist, so you can adapt it, break out of the loop, and re-start the whole loop and it'll just work. And you don't need consecutive values of i to begin with.

Of course, if you're dealing with thousands of processes then perhaps Stéphane's approach would be fractionally more efficient when you have a non-sparse list.
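To make the sparse-array point concrete, here is a small sketch. The indices 3 and 10 and the exit codes are arbitrary choices for the example, and the subshells stand in for real jobs.

```bash
#!/usr/bin/env bash
# Sketch: the indices 3 and 10 are arbitrary, chosen to make the array sparse.
pid=()
return_status=()

( exit 4 ) & pid[3]=$!
( exit 0 ) & pid[10]=$!

echo "keys before: ${!pid[*]}"

# The key list is expanded once, before the loop starts, so unsetting
# entries inside the loop is safe.
for i in "${!pid[@]}"; do
    wait "${pid[i]}"
    return_status[i]=$?
    unset "pid[$i]"     # drop the entry so a restarted loop skips it
done

echo "entries left: ${#pid[@]}"
echo "statuses: 3 -> ${return_status[3]}, 10 -> ${return_status[10]}"
```

A counting loop from 1 to ${#pid[@]} would miss both entries here, because the array has two elements but neither index is 1 or 2.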

answered Mar 15 '15 at 23:15

Martin Kealey


naming your array return freaked me out, man! – Toddius Zho Dec 20 '22 at 19:21
@ToddiusZho is return_status better? – Martin Kealey Jan 02 '23 at 20:44
(To be fair, the original question names the variable return; I was just copying that.) – Martin Kealey Jan 02 '23 at 20:51

unreal_square · Answer 5 · 2022-08-24T01:08:13.037

Bash 4.3 added -n to the wait builtin, and -p was added in version 5.1.
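Since these options depend on the bash version, a script may want to check before relying on them. The following is only a sketch based on the version numbers stated above; version_at_least is a hypothetical helper written for this example, while BASH_VERSINFO is bash's own array holding the version components.

```bash
#!/usr/bin/env bash
# Sketch: report which wait options the running bash should support,
# using the version numbers stated above (4.3 for -n, 5.1 for -p).
version_at_least() {   # hypothetical helper: version_at_least MAJOR MINOR
    (( BASH_VERSINFO[0] > $1 || (BASH_VERSINFO[0] == $1 && BASH_VERSINFO[1] >= $2) ))
}

have_wait_n=no
have_wait_p=no
version_at_least 4 3 && have_wait_n=yes
version_at_least 5 1 && have_wait_p=yes

echo "bash ${BASH_VERSION}: wait -n=$have_wait_n, wait -p=$have_wait_p"
```

A script could fall back to waiting on each PID in turn when have_wait_p is no.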

From https://www.gnu.org/software/bash/manual/html_node/Job-Control-Builtins.html

wait -n

If the -n option is supplied, wait waits for a single job from the list of pids or jobspecs or, if no arguments are supplied, any job, to complete and returns its exit status. [...]

wait -p

If the -p option is supplied, the process or job identifier of the job for which the exit status is returned is assigned to the variable varname named by the option argument. [...]

The combination of the two options means Bash 5.1+ is actually quite decent at basic multiprocessing. The main drawback now is really just tracking/managing stdout/stderr.

_job1 () { sleep "$( shuf -i 1-3 -n 1 )"s ; true ; }
_job2 () { sleep "$( shuf -i 1-3 -n 1 )"s ; return 42 ; }
limit="2"
i="0"
pids=()
set -- _job1 _job2
while [ "$#" -gt "0" ] ;do
    # start jobs until the limit is reached or none are left to start
    until [ "$i" -eq "$limit" ] || [ "$#" -eq "0" ] ;do
        printf 'starting %s\n' "$1"
        "$1" &
        pids[$!]="$1"
        i="$(( i + 1 ))"
        shift
    done

    # wait for any one job to end; its PID is stored in ended_pid
    if wait -n -p ended_pid ;then
        return_code="$?"
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    else
        return_code="$?"
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$return_code"
    fi
    unset 'pids[ended_pid]'
    i="$(( i - 1 ))"
done
# drain the jobs that are still running
while [ "${#pids[@]}" -gt "0" ] ;do
    if wait -n -p ended_pid ;then
        printf '%s succeeded, returning "%s"\n' "${pids[ended_pid]}" "$?"
    else
        printf '%s FAILED, returning "%s"\n' "${pids[ended_pid]}" "$?"
    fi
    unset 'pids[ended_pid]'
done

More information (though not on wait -p): http://mywiki.wooledge.org/ProcessManagement
