Your parallel appears to be the GNU one, which is a Perl script that runs commands in parallel.
It tries very hard to tell what shell it is being invoked from so that the command you pass to it is interpreted by that shell, but to do that it has to run new invocations of that shell in separate processes.
If you run:
bash-5.2$ env SHELLOPTS=xtrace PS4='bash-$$> ' strace -qqfe /exec,/exit -e signal=none parallel -j 0 append ::: {1..4}
execve("/usr/bin/parallel", ["parallel", "-j", "0", "append", ":::", "1", "2", "3", "4"], 0x7ffe5e848c90 /* 56 vars */) = 0
[...skipping several commands run by parallel during initialisation...]
[pid 7567] execve("/usr/bin/bash", ["/usr/bin/bash", "-c", "append 1"], 0x55a2615f03e0 /* 67 vars */) = 0
bash-7567> append 1
bash-7567> arr+=("$1")
[pid 7567] exit_group(0) = ?
[pid 7568] execve("/usr/bin/bash", ["/usr/bin/bash", "-c", "append 2"], 0x55a2615f03e0 /* 67 vars */) = 0
[pid 7568] exit_group(0) = ?
[pid 7569] execve("/usr/bin/bash", ["/usr/bin/bash", "-c", "append 3"], 0x55a2615f03e0 /* 67 vars */) = 0
bash-7568> append 2
bash-7568> arr+=("$1")
[pid 7569] exit_group(0) = ?
[pid 7570] execve("/usr/bin/bash", ["/usr/bin/bash", "-c", "append 4"], 0x55a2615f03e0 /* 67 vars */) = 0
bash-7569> append 3
bash-7569> arr+=("$1")
[pid 7570] exit_group(0) = ?
bash-7570> append 4
bash-7570> arr+=("$1")
exit_group(0) = ?
where strace shows which commands are executed by which process and the xtrace option causes the shell to show what it does, you'll see each bash shell appending an element to its own $arr and then exiting. Of course, its memory space, including its individual $arr array, is then gone: the $arr array is not automagically shared between all bash shell invocations on your system.
In any case, running commands concurrently implies running them in different processes, so there's no way it can run those functions in the invoking shell. Those functions will be run in new shell instances in separate processes, and they will update the arr variables of those shells, not the one of the shell you run parallel from.
Given that bash has no builtin multithreading support, even if parallel were an internal command of the shell or implemented as a shell function, it would still need to run the commands in separate processes, each process having its own memory. You'll find that in:
append 1 & append 2 & append 3 & wait
Or:
append 1 | append 2 | append 3
The $arr array of the parent shell is not modified either.
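As a minimal sketch of that point, the following script counts the elements of the parent shell's $arr after each form of invocation. The append function here is an assumption, reconstructed from the xtrace output above, and the after_* variable names are only for illustration.

```shell
#! /bin/bash -
# Assumed definition of append, matching what the xtrace output shows.
append() { arr+=("$1"); }

arr=()

# Background jobs run in forked subshells: each updates its own copy of arr.
append 1 & append 2 & append 3 & wait
after_background=${#arr[@]}

# Each component of a pipeline also runs in a subshell
# (including the last one, unless the lastpipe option is in effect).
append 1 | append 2 | append 3
after_pipeline=${#arr[@]}

# Called directly, the function runs in the current shell, so arr is updated.
append 4
after_direct=${#arr[@]}

printf '%s\n' "background: $after_background" \
              "pipeline: $after_pipeline" \
              "direct: $after_direct"
```

Only the direct call changes the count, since it is the only one that runs in the process that owns the array.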
If you want to collect the result of each job started by parallel, you can do it via stdout or via files.
For instance:
#! /bin/bash -
do_something() {
  output=$(
    echo "$1: some complex computation or otherwise there would
be no point using GNU parallel and its big overhead"
  )
  # output the result NUL delimited.
  printf '%s\0' "$output"
}
export -f do_something
readarray -td '' arr < <(
  PARALLEL_SHELL=/bin/bash parallel do_something ::: {1..4}
)
typeset -p arr
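(here telling parallel which shell to use, to save it from having to guess).
Note that parallel stores the output of each job in a temporary file and dumps it on stdout once the job has finished, so the outputs of different jobs are not interleaved. By default, the outputs appear in the order the jobs complete; add the -k (--keep-order) option if you need the elements of the array in the same order as the arguments.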
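For the "via files" approach, here is a rough sketch of the idea. To keep it runnable without GNU parallel, it starts the jobs as plain background jobs rather than through parallel; this do_something and the one-file-per-argument layout under a mktemp directory are assumptions made for illustration.

```shell
#! /bin/bash -
# Hypothetical stand-in for a job whose result may span several lines.
do_something() {
  printf '%s\n' "$1: first line" "second line"
}

tmpdir=$(mktemp -d) || exit

# Start the jobs concurrently, each writing to a file named after its argument.
for i in {1..4}; do
  do_something "$i" > "$tmpdir/$i" &
done
wait

# Read the files back in argument order, whatever order the jobs finished in.
arr=()
for i in {1..4}; do
  # $(< file) strips trailing newlines, like $(cmd) does.
  arr+=("$(< "$tmpdir/$i")")
done
rm -rf -- "$tmpdir"

typeset -p arr
```

Because each job has its own file, the parent decides the order of the elements when it reads them back, and no delimiter needs to be agreed on between the jobs and the parent.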
[...] arr=(whatever) from you or arr+=(whatever) and yes I normally use parallel when downloading a bunch of large video files or just because I was bored of loops and wanted to try functional style instead. – Nickotine Jun 24 '23 at 13:45

arr=( $(cmd) ) does split+glob which you generally need to tune before using and bash can't split on NULs, while its readarray (same as mapfile, but mapfile is a misnomer) can properly read a list of records into an array. – Stéphane Chazelas Jun 24 '23 at 13:52

[...] IFS=$'\n' always set, does it only apply when using wildcards? – Nickotine Jun 24 '23 at 13:54

[...] arr=( $(echo '/*/'; echo '/???/') ); typeset -p arr. That unquoted $(...) undergoes splitting (which you want) and globbing (which you don't want), hence the split+glob name for that "operator" (or misfeature depending on PoV), and why when using it, you need to tune it (set $IFS and enable or disable the noglob option) or use a proper shell with proper splitting operators such as zsh. – Stéphane Chazelas Jun 24 '23 at 14:01

[...] * literally rather than as a wildcard? – Nickotine Jun 24 '23 at 14:05

[...] noglob when you use $var or $(cmd) or $(( arith )) unquoted in order to split those expansions (and don't want the *, ?, and other wildcard operators in those to trigger filename generation which you almost never want). That's the "What about when you do need the split+glob operator?" section at "Security implications of forgetting to quote a variable in bash/POSIX shells" – Stéphane Chazelas Jun 24 '23 at 14:08
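
The split+glob behaviour discussed in these comments can be seen with a small sketch like the one below. The records function and the array names are made up for the illustration; the second record is a glob pattern on purpose.

```shell
#! /bin/bash -
# Hypothetical command printing two newline-delimited records,
# one containing a space and one that is a glob pattern.
records() { printf '%s\n' 'a b' '/*/'; }

# Untuned split+glob: with the default $IFS the first record is split on
# the space, and /*/ is replaced by the directories it matches.
split_glob=( $(records) )

# Tuned split+glob: split on newline only, with filename generation disabled.
IFS=$'\n'; set -o noglob
tuned=( $(records) )
set +o noglob; unset -v IFS

# readarray reads one record per line and performs no globbing at all.
readarray -t lines < <(records)

typeset -p tuned lines
printf 'untuned split+glob gave %d elements\n' "${#split_glob[@]}"
```

The tuned and readarray variants both yield exactly the two records, while the number of elements in the untuned one depends on what is in the root directory.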