Possible Duplicate:
Four tasks in parallel… how do I do that?

Suppose I have a loop that invokes a command for each line of a playlist:

 grep -v '#' < files.m3u | sed 's/\\\\/\/\//g' | sed 's/\\/\//g' | while read line
 do
    filename=$(basename "$line")
    avconv -i "$line" "${filename%.*}.wav"
 done

Putting & after the avconv command would spawn a new avconv for every file at once. I want to do two things instead:

  • Limit the number of processes running at the same time to 4
  • When the loop is done, wait for the remaining processes to finish

3 Answers

You can remember the PID of each new child (check $! after starting it). Periodically check how many children still exist (e.g. with kill -0); whenever the number drops below the limit, spawn a new one. At the end, just wait.

Here is a script I wrote for the same reason:

#! /bin/bash

## Tries to run commands in parallel. Commands are read from STDIN one
## per line, or from a given file specified by -f.
## Author: E. Choroba

file='-'
proc_num=$(grep -c ^processor'\b' /proc/cpuinfo)
prefix=$HOSTNAME-$USER-$$
sleep=10

children=()
names=()

if [[ $1 =~ ^--?h(elp)?$ ]] ; then
    # Plain here-document: the terminator must start in the first column.
    cat <<HELP
Usage: ${0##*/} [-f file] [-n max-processes] [-p tmp-prefix] [-s sleep]
  Defaults:
    STDIN for file
    $proc_num for max-processes (number of processors)
    $prefix for tmp-prefix
    $sleep for sleep interval
HELP
    exit
fi

function debug () {
    if ((DEBUG)) ; then
        echo "$@" >&2
    fi
}

function child_count () {
    debug Entering child_count "${children[@]}"
    child_count=0
    new_children=()
    for child in "${children[@]}" ; do
        debug Trying $child
        if kill -0 $child 2>/dev/null ; then
            debug ... exists
            let child_count++
            new_children+=($child)
        fi
    done

    children=("${new_children[@]}")
    echo $child_count
    debug Leaving child_count "${children[@]}"
}

while getopts 'f:n:p:s:' arg ; do
    case $arg in
        f ) file=$OPTARG ;;
        n ) proc_num=$((OPTARG)) ;;
        p ) prefix=$OPTARG;;
        s ) sleep=$OPTARG;;
        * ) echo "Warning: unknown option $arg" >&2 ;;
    esac
done

i=0
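while IFS= read -r line ; do
    debug Reading "$line"
    name=$prefix.$i
    let i++
    names+=("$name")

    # Poll until fewer than proc_num children are still alive.
    while (( $(child_count) >= proc_num )) ; do
        sleep "$sleep"
        debug Sleeping
    done

    # Quote $line so that eval sees the command exactly as written.
    eval "$line" 2>"$name.e" >"$name.o" &
    children+=($!)
    debug Running "${children[@]}"
done < <(cat "$file")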

debug Loop ended
wait
cat "${names[@]/%/.o}"
cat "${names[@]/%/.e}" >&2
rm "${names[@]/%/.o}" "${names[@]/%/.e}"
choroba
  • Beware the race conditions that come with the approach of "checking if a process is alive by PID". There is no good way to detect the case where the process exits and its PID gets recycled for a new process. A lock file in a secure directory that the child removes on exit would be more robust. In the end, I'd just say don't reinvent the wheel (unless your wheel is really better). – jw013 Aug 24 '12 at 04:31
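
One way to sidestep the PID-polling race described in this comment is to let the shell report when one of its own jobs ends. The following is a minimal sketch, not the script above: it assumes bash 4.3 or newer (for wait -n), and run_limited is a hypothetical helper name introduced only for illustration.

```bash
#!/bin/bash
# Sketch only: needs bash 4.3+ for `wait -n`.
# run_limited is a hypothetical helper, not part of the answer's script.

# Usage: run_limited MAX < commands
# Reads one command per line from STDIN and keeps at most MAX running.
run_limited () {
    local max=$1 line
    while IFS= read -r line ; do
        # While every slot is busy, block until any one job exits.
        while (( $(jobs -rp | wc -l) >= max )) ; do
            wait -n
        done
        eval "$line" &
    done
    # Wait for the jobs that are still running.
    wait
}
```

For example, `run_limited 4 < commands.txt` would run the commands listed in a (hypothetical) commands.txt, four at a time. Because the shell only ever waits on its own children, a recycled PID cannot be mistaken for a running job.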

From the linked question, tailored to your variation:

sed -e '/#/d' -e 's,\\,/,g' files.m3u | xargs -d '\n' -I {} -P 4 \
    sh -c 'line=$1; file=${line##*/}; avconv -i "$line" "${file%.*}.wav"' avconv_sh {}

Again, GNU xargs, or another version that supports -d and -P, is required. Also beware of extra whitespace at the beginning or end of lines in the input file: this snippet keeps it, which may cause problems.
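
The stray whitespace mentioned above could be stripped in the same sed call, before the lines reach xargs. This is a minimal sketch under that assumption; clean_playlist is a hypothetical helper name, and treating a trailing carriage return as whitespace is my assumption about Windows-style playlists, not something stated in the question.

```bash
# clean_playlist is a hypothetical helper: it drops lines containing '#',
# strips leading and trailing whitespace (a trailing CR included),
# drops lines that end up empty, and turns backslashes into slashes.
clean_playlist () {
    sed -e '/#/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' \
        -e 's,\\,/,g' "$@"
}

# Usage (not run here; GNU xargs assumed, as in the answer):
#   clean_playlist files.m3u | xargs -d '\n' -I {} -P 4 \
#       sh -c 'line=$1; file=${line##*/}; avconv -i "$line" "${file%.*}.wav"' avconv_sh {}
```

Spaces inside a file name are left alone; only the edges of each line are trimmed.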

jw013
  • yes, xargs' parallel execution option is perfect for this. it's (unfortunately) a little-known feature of GNU xargs. Here are some useful links describing the feature: http://www.tummy.com/journals/entries/jafo_20100418_235041 and http://www.spinellis.gr/blog/20090304/index.html – cas Aug 25 '12 at 02:07
  • GNU parallel (https://www.gnu.org/software/parallel/) is also a useful tool for this kind of job. – cas Aug 25 '12 at 02:08

I solved it this way. Thanks for the $! tips

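#!/bin/bash
# PIDs of the (up to) four running conversions, one per slot.
children=()

i=0
# Feed the loop through process substitution instead of a pipe, so the
# loop runs in the current shell and the final wait can see the children.
while read -r line
do
    filename=$(basename "$line")
    k=$((i % 4))
    # Before reusing a slot, wait for the job that last occupied it.
    if [[ -n ${children[k]} ]]; then
        wait "${children[k]}"
    fi
    avconv -i "$line" "${filename%.*}.wav" &
    children[k]=$!
    i=$((i + 1))
done < <(grep -v '#' < files.m3u | sed 's/\\\\/\/\//g' | sed 's/\\/\//g')

# Wait for whatever is still running.
wait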