
I have a file processes.txt as follows:

process_1.sh
process_2.sh
process_3.sh

I will run the following on the terminal:

cat processes.txt | xargs -L1 -P3 sh

How would this be different if I had a script executable_processes.sh as follows:

process_1 &
process_2 &
process_3 &
wait

where process_1, process_2 and process_3 are scripts with execute permissions but performing the same tasks as process_1.sh, process_2.sh and process_3.sh, and I ran the following on the terminal:

sh executable_processes.sh

My colleague told me that the first example (using xargs -L1 -P3) runs the three processes "truly in parallel", whereas the second example (sh executable_processes.sh), which uses &, runs them as background processes but still sends them to the background one after another: first process_1, then process_2, then process_3. He therefore prefers that I use the first example. My problem with the first approach is that I do not know how to use xargs -L1 -P3 if the three lines in processes.txt were as follows:

cat input_1.txt | process_1
cat input_2.txt | process_2
cat input_3.txt | process_3

Let's say I modified the file processes.txt as follows:

input_1.txt | process_1
input_2.txt | process_2
input_3.txt | process_3

And then run

cat processes.txt | xargs -L1 -P3 cat

This throws the error

cat: '|': No such file or directory

I want to run cat input_1.txt first and then pipe its output to process_1, and so on. But input_1.txt, | and process_1 are all being passed as arguments to cat.

If using & and xargs -P are not different from each other, would it be more reasonable to simply run a script that contains:

cat input_1.txt | process_1 &
cat input_2.txt | process_2 &
cat input_3.txt | process_3 &
sriganesh

1 Answer


Background processes run by the shell all run in parallel; you can verify this by running

sleep 10 & sleep 10 & sleep 10 & wait

You’ll wait for ten seconds (or just a little more), not thirty.
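To put a number on that check, the same experiment can be timed from a script. This is a minimal sketch: it uses a 2-second sleep instead of 10 to keep the run short, and the variable names (start, end, elapsed) are chosen here for illustration only.

```shell
#!/bin/sh
# Record wall-clock time before starting the jobs.
start=$(date +%s)

# Start three background jobs one after another; each returns control
# to the shell immediately, so all three sleeps overlap.
sleep 2 &
sleep 2 &
sleep 2 &

# Block until every background job started by this shell has exited.
wait

end=$(date +%s)
elapsed=$((end - start))

# Sequential execution would take about 6 seconds; overlapping jobs take about 2.
echo "elapsed: ${elapsed}s"
```

The time the shell needs to launch each job is tiny compared with the run time of the jobs themselves, which is why the total stays close to the duration of a single sleep.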

So

process_1 < input_1.txt &
process_2 < input_2.txt &
process_3 < input_3.txt &

or

for i in {1..3}; do "process_$i" < "input_$i.txt" & done

will start all three processes, with three different inputs, in parallel.
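Note that {1..3} is a brace expansion supported by shells such as bash, and it is not part of POSIX sh, so a script run with sh may not expand it (dash, for example, does not). Below is a sketch of a portable variant that also collects each job's exit status. It assumes tr as a hypothetical stand-in for process_1 to process_3, and the temporary directory, file contents and the failures counter are invented for the example.

```shell
#!/bin/sh
# Work in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create three small input files (illustrative data only).
for i in 1 2 3; do
    printf 'line from input %s\n' "$i" > "input_$i.txt"
done

# Start one background job per input; tr stands in for process_$i.
pids=""
for i in 1 2 3; do
    tr a-z A-Z < "input_$i.txt" > "output_$i.txt" &
    pids="$pids $!"   # $! is the PID of the most recent background job
done

# Wait for each job individually so its exit status can be inspected.
failures=0
for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
done

echo "failed jobs: $failures"
cat output_1.txt output_2.txt output_3.txt
```

Waiting on each PID separately is one thing the & approach makes easy: a plain wait with no arguments only reports that everything has finished, not which job failed.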

If you wanted to do something similar with xargs, you could use -I:

printf "%s\n" {1..3} | xargs -I{} -P3 sh -c "process_{} < input_{}.txt"
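The same idea covers the case in the question where each line of processes.txt is a complete pipeline: xargs does not interpret | itself, so each line has to be handed to a shell. The sketch below assumes an xargs that supports -P (GNU and BSD versions do; it is not required by POSIX), uses tr as a hypothetical stand-in for process_1 to process_3, and invents the file contents for the example.

```shell
#!/bin/sh
# Work in a scratch directory so the example is self-contained.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative input files.
printf 'alpha\n' > input_1.txt
printf 'beta\n'  > input_2.txt
printf 'gamma\n' > input_3.txt

# Each line is a full pipeline, as in the question; tr stands in for process_N.
cat > processes.txt <<'EOF'
cat input_1.txt | tr a-z A-Z > output_1.txt
cat input_2.txt | tr a-z A-Z > output_2.txt
cat input_3.txt | tr a-z A-Z > output_3.txt
EOF

# -I{} substitutes the whole input line for {}, and sh -c runs that line
# as a shell command, so | and > are interpreted by the shell, not by cat.
# -P3 allows up to three of these shells to run at the same time.
xargs -I{} -P3 sh -c '{}' < processes.txt

cat output_1.txt output_2.txt output_3.txt
```

Because every line is executed as shell code, this should only be used with a processes.txt whose contents are trusted. Lines containing quotes or backslashes may also need care, since xargs processes those characters itself before passing the line on.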
Stephen Kitt
  • To more specifically address your colleague's assertion that xargs -P does things "truly in parallel"- even modern CPUs do literally nothing in actual simultaneity. While there is an extremely small difference between command & command & command & and xargs -P sh 'command', xargs will still at the end of the day be sending one instance of command at a time into the background to be executed. – DopeGhoti Jul 13 '22 at 14:07
  • I ultimately used the approach with cat processes.txt | xargs -L1 -P3 sh -c with every line of processes.txt within quotes. I have been asked to keep this approach instead of using & at the end of each line. – sriganesh Jul 14 '22 at 07:26
  • @DopeGhoti "modern CPUs do literally nothing in actual simultaneity" - I'm heading off-topic here but are you really saying that a four core CPU doesn't execute four (compute-bound) processes simultaneously? – Chris Davies Jul 14 '22 at 10:19
  • Dig deep enough and things are indeed sequential. A four-core processor can execute four instructions per clock cycle but within that cycle the instructions are in series. – DopeGhoti Jul 14 '22 at 13:27
  • @DopeGhoti no, a four-core processor executes instructions in parallel, there’s nothing sequential (apart from, perhaps, serialisation with some accesses to the shared cache). Even within a single core some instructions can run in parallel (across different execution units). – Stephen Kitt Jul 14 '22 at 13:39