
In this script, which pulls all Git repositories:

#!/bin/bash

find / -type d -name .git 2>/dev/null |
while read gitFolder; do
    if [[ $gitFolder == "/Temp/" ]]; then
        continue
    fi
    if [[ $gitFolder == "/Trash/" ]]; then
        continue
    fi
    if [[ $gitFolder == "/opt/" ]]; then
        continue
    fi
    parent=$(dirname $gitFolder)
    echo ""
    echo $parent
    (git -C $parent pull && echo "Got $parent") &
done
wait
echo "Got all"

the wait does not wait for all git pull subshells.

Why is it so and how can I fix it?

1 Answer


The issue is that the wait is run by the wrong shell process. In bash, each part of a pipeline runs in a separate subshell, so the background tasks belong to the subshell executing the while loop. Since wait only waits for children of the shell that calls it, the wait in the main script has nothing to wait for. Moving the wait into that subshell makes it work as expected:

find ... |
{
    while ...; do
        ...
        ( git -C ... && ... ) &
    done
    wait
}

echo 'done.'
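
To see the difference without touching any repositories, here is a minimal sketch that uses a one-second sleep as a stand-in for git pull. The function names (slow_job, count_files, broken, fixed) and the marker files are made up for this illustration, and it assumes bash with default options (in particular, lastpipe not enabled).

```shell
#!/bin/bash
# Stand-in for "git pull": sleep for a second, then create a marker file.
# All function names here are hypothetical, chosen for the demo only.
slow_job() { sleep 1; : > "$1"; }

# Print the number of entries in directory $1 (0 if it is empty).
count_files() {
    set -- "$1"/*
    if [ -e "$1" ]; then echo "$#"; else echo 0; fi
}

# "wait" runs in the calling shell, which has no background children.
broken() {
    printf '%s\n' a b c |
    while IFS= read -r name; do
        slow_job "$1/$name" &
    done
    wait
    count_files "$1"
}

# "wait" runs in the same subshell that started the background jobs.
fixed() {
    printf '%s\n' a b c |
    {
        while IFS= read -r name; do
            slow_job "$1/$name" &
        done
        wait
    }
    count_files "$1"
}

tmp=$(mktemp -d)
mkdir "$tmp/broken" "$tmp/fixed"
echo "broken: $(broken "$tmp/broken") of 3 jobs done when wait returned"
echo "fixed:  $(fixed "$tmp/fixed") of 3 jobs done when wait returned"
```

With the wait outside the pipeline, the count should be 0, since none of the jobs has had time to finish. With the wait inside the braces, it should be 3.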

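You also have some unquoted variable expansions ($gitFolder and $parent), which would break on pathnames containing whitespace or globbing characters.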
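
As a small illustration of why the quoting matters, the sketch below counts how many arguments a command receives from an unquoted and a quoted expansion. The pathname and the count_args helper are made up for the demo.

```shell
#!/bin/bash
# Hypothetical pathname containing a space.
gitFolder='/home/me/my repos/project/.git'

# Helper (made up for this demo) that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted_count=$(count_args $gitFolder)    # split on the space into two words
quoted_count=$(count_args "$gitFolder")    # passed as a single word

echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count argument(s)"

# Quoted versions of two lines from the loop in the question:
parent=$(dirname "$gitFolder")
echo "$parent"
```

In the loop itself, reading with `while IFS= read -r gitFolder` would additionally keep read from trimming leading and trailing whitespace and from interpreting backslashes.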

I would get rid of the pipe entirely and instead run the loop from find directly, which gets rid of the need to parse the output from find.

find / -type d -name .git \
    ! -path '*/Temp/*' \
    ! -path '*/opt/*' \
    ! -path '*/Trash/*' \
    -exec sh -c '
    for gitpath do
        git -C "$gitpath"/.. pull &
    done
    wait' sh {} +

Or, using -prune to avoid even entering any of the subdirectories we don't want to deal with,

find / \( -name Temp -o -name Trash -o -name opt \) -prune -o \
    -type d -name .git -exec sh -c '
    for gitpath do
        git -C "$gitpath"/.. pull &
    done
    wait' sh {} +
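
To try out the -prune variant safely, the following sketch runs the same find expression on a throwaway directory tree, with echo in place of git so that nothing is actually pulled. The directory names are made up, and it assumes that mktemp -d is available and that the temporary directory itself is not called Temp, Trash or opt.

```shell
#!/bin/sh
# Build a throwaway tree; the directory names below are made up for the demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/work/repo one/.git" "$tmp/Temp/old/.git" "$tmp/opt/tool/.git"

# Same expression as above, but starting at "$tmp" and only echoing.
# The pathnames found are passed to the inline script as "$1", "$2", ...
# (the extra "sh" becomes $0), and "for gitpath do" loops over them.
found=$(
    find "$tmp" \( -name Temp -o -name Trash -o -name opt \) -prune -o \
        -type d -name .git -exec sh -c '
        for gitpath do
            echo "would pull in: $gitpath/.."
        done' sh {} +
)
printf '%s\n' "$found"
rm -rf "$tmp"
```

Only the repository under work/ should be listed, with the space in its name intact; the ones below Temp and opt are never visited.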

As mentioned in comments, you could also use xargs to have greater control over the number of concurrently running git processes. The -P option (for specifying the number of concurrent tasks) used below is non-standard, as are -0 (for reading \0-delimited pathnames) and -r (for avoiding running the command when there's no input). GNU xargs and some other implementations of this utility have these options though. Also, the -print0 predicate of find (to output \0-delimited pathnames) is non-standard, but commonly implemented.

find / \( -name Temp -o -name Trash -o -name opt \) -prune -o \
    -type d -name .git -print0 |
xargs -t -0r -P 4 -I {} git -C {}/.. pull
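
The sketch below shows what xargs does with the \0-delimited input, using two made-up pathnames instead of find and echo instead of actually running git. It assumes an xargs that supports the non-standard -0, -r and -P options mentioned above; the output is sorted only because parallel jobs may finish in any order.

```shell
#!/bin/sh
# Two hypothetical pathnames, the second containing a space, fed to xargs
# as \0-delimited input in the same way "find ... -print0" would.
out=$(
    printf '%s\0' '/repo/a/.git' '/repo/b c/.git' |
    xargs -0r -P 4 -I {} echo git -C {}/.. pull |
    sort
)
printf '%s\n' "$out"
```

Each input pathname results in one command, with {} replaced by the complete pathname.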

I'm sure GNU parallel could also be used in a similar way, but since this is not the main focus of this question I'm not pursuing that train of thought.

Kusalananda