I have to run a bunch of bash commands asynchronously and, as soon as one finishes, I need to perform actions according to its exit code and output. Note that I can't predict how long any of these tasks will run in my real use case.
To solve this problem, I ended up with the following algorithm:
For each task to be run:
    Run the task asynchronously;
    Append the task to the list of running tasks.
End For.
While there are still tasks in the list of running tasks:
    For each task in the list of running tasks:
        If the task has ended:
            Retrieve the task's exit code and output;
            Remove the task from the list of running tasks.
        End If.
    End For.
End While.
This gives me the following bash script:
 1 #!/bin/bash
 2
 3 # bg.sh
 4
 5 # Executing commands asynchronously, retrieving their exit codes and outputs upon completion.
 6
 7 asynch_cmds=
 8
 9 echo -e "Asynchronous commands:\nPID FD"
10
11 for i in {1..10}; do
12     exec {fd}< <(sleep $(( i * 2 )) && echo $RANDOM && exit $i) # Dummy asynchronous task, standard output's stream is redirected to the current shell
13     asynch_cmds+="$!:$fd " # Append the task's PID and FD to the list of running tasks
14
15     echo "$! $fd"
16 done
17
18 echo -e "\nExit codes and outputs:\nPID FD EXIT OUTPUT"
19
20 while [[ ${#asynch_cmds} -gt 0 ]]; do # While the list of running tasks isn't empty
21
22     for asynch_cmd in $asynch_cmds; do # For each task in the list
23
24         pid=${asynch_cmd%:*} # Task's PID
25         fd=${asynch_cmd#*:} # Task's FD
26
27         if ! kill -0 $pid 2>/dev/null; then # If the task ended
28
29             wait $pid # Retrieving the task's exit code
30             echo -n "$pid $fd $? "
31
32             echo "$(cat <&$fd)" # Retrieving the task's output
33
34             asynch_cmds=${asynch_cmds/$asynch_cmd /} # Removing the task from the list
35         fi
36     done
37 done
The output tells me that wait fails when trying to retrieve the exit code of each task, except the last one to be run:
Asynchronous commands:
PID FD
4348 10
4349 11
4351 12
4353 13
4355 14
4357 15
4359 16
4361 17
4363 18
4365 19
Exit codes and outputs:
PID FD EXIT OUTPUT
./bg.sh: line 29: wait: pid 4348 is not a child of this shell
4348 10 127 16010
./bg.sh: line 29: wait: pid 4349 is not a child of this shell
4349 11 127 8341
./bg.sh: line 29: wait: pid 4351 is not a child of this shell
4351 12 127 13814
./bg.sh: line 29: wait: pid 4353 is not a child of this shell
4353 13 127 3775
./bg.sh: line 29: wait: pid 4355 is not a child of this shell
4355 14 127 2309
./bg.sh: line 29: wait: pid 4357 is not a child of this shell
4357 15 127 32203
./bg.sh: line 29: wait: pid 4359 is not a child of this shell
4359 16 127 5907
./bg.sh: line 29: wait: pid 4361 is not a child of this shell
4361 17 127 31849
./bg.sh: line 29: wait: pid 4363 is not a child of this shell
4363 18 127 28920
4365 19 10 28810
The output of the commands is flawlessly retrieved, but I don't understand where this "is not a child of this shell" error comes from. I must be doing something wrong, as wait is able to get the exit code of the last command to be run asynchronously.
Does anyone know where this error comes from? Is my solution to this problem flawed, or am I misunderstanding the behavior of bash? I'm having a hard time understanding the behavior of wait.
P.S: I posted this question on Super User, but on second thought, it might be better suited to the Unix & Linux Stack Exchange.
asynch_cmds (your script is already using and assuming a lot of bashisms, anyways). Simple demo: typeset -A ar; foo='x y'; ar[3+4]=8; ar[$foo]=bar; typeset -p ar; unset ar[3+4]; echo "ar has ${#ar[@]} element(s)". – Sep 13 '19 at 20:13
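For illustration only, here is a minimal sketch of how the typeset -A idea from the comment above might be applied to the list of running tasks. It is not the original script: the function name run_tasks, the temporary-file layout and the three short dummy tasks are all hypothetical. It also swaps the process substitution for plain background jobs started with &, on the assumption that wait can then treat each PID as a child of the shell, and it writes each task's output to a temporary file instead of a file descriptor.

```bash
#!/bin/bash
# Sketch only: run_tasks and the temp-file layout are hypothetical,
# they are not part of the original bg.sh.

run_tasks() {
    local tmpdir pid status i
    local -A outfile=()                 # PID -> file holding that task's output
    tmpdir=$(mktemp -d) || return 1

    for i in 1 2 3; do
        # Dummy task, started with & so that it is a real child of this shell
        ( sleep "0.$i"; echo "output-$i"; exit "$i" ) >"$tmpdir/$i.out" &
        outfile[$!]="$tmpdir/$i.out"    # Append the task to the list of running tasks
    done

    while (( ${#outfile[@]} > 0 )); do  # While the list of running tasks isn't empty
        for pid in "${!outfile[@]}"; do
            if ! kill -0 "$pid" 2>/dev/null; then   # If the task ended
                wait "$pid"                         # Retrieve the task's exit code
                status=$?
                echo "$pid $status $(<"${outfile[$pid]}")"
                unset "outfile[$pid]"               # Remove the task from the list
            fi
        done
        sleep 0.1                       # Avoid a busy loop while polling
    done

    rm -rf "$tmpdir"
}

run_tasks   # Prints one "PID EXIT OUTPUT" line per task
```

Keying the array by PID makes removing a finished task a single unset, with no string substitution on a space-separated list. The polling loop with kill -0 is kept from the original script; the short sleep between passes is an addition to keep it from spinning.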