Spawning multiple background processes

Question

I have a file with 2,000,000 lines, and I run some commands for each line. I was trying to achieve some parallelism using GNU Parallel and swift as discussed here. However, one of my friends gave me an interesting idea.

He suggested spawning multiple processes on the server, since the server is pretty powerful. My thought was that if I index each line of the file, I could assign lines to processes based on line_number mod number_of_processes.

For example, with 10 processes, lines 1, 11 and 21 would be sent to the first process, lines 2, 12 and 22 to the second process, and so on.

To achieve the above, I have been reading about background processes in shell scripting. Most of the tutorials and links append an & to the command and say that this makes the shell spawn a background process. I am finding this concept a little difficult to understand.
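For illustration, here is a minimal sketch of the scheme described above: each worker handles the lines whose line number modulo the worker count matches its index, the trailing & runs each worker in the background, and wait blocks until all of them have finished. The worker count, the temporary files, and the process_line function are hypothetical placeholders, not part of the original setup.

```shell
#!/bin/sh
# Sketch only: workers, input, outdir and process_line are made-up names.
workers=4
input=$(mktemp)
outdir=$(mktemp -d)
trap 'rm -rf "$input" "$outdir"' EXIT

# Stand-in for the real 2,000,000-line file.
seq 1 20 > "$input"

# Hypothetical per-line command; replace with the real work.
process_line() {
    echo "processed $1"
}

i=0
while [ "$i" -lt "$workers" ]; do
    # awk keeps only the lines where NR % workers == i.
    # The trailing & runs this whole pipeline as a background job,
    # so the loop immediately moves on and starts the next worker.
    awk -v n="$workers" -v k="$i" 'NR % n == k' "$input" |
    while IFS= read -r line; do
        process_line "$line"
    done > "$outdir/out.$i" &
    i=$((i + 1))
done

# Block until every background job started above has exited.
wait

# Show how many lines each worker handled.
for f in "$outdir"/out.*; do
    printf '%s: %s lines\n' "${f##*/}" "$(wc -l < "$f" | tr -d ' ')"
done
```

Each worker reads the whole input file and discards the lines that are not its own, which is simple but not free for a very large file.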

score 2 · Answer 1 · answered Feb 14 '14 at 09:11

How does your idea differ from GNU Parallel's --pipe --round-robin?

seq 100 | parallel --pipe --round-robin -j10 -N 1 'echo Start;cat'

Doing it line by line is somewhat inefficient for GNU Parallel. Doing it block by block is more efficient:

seq 1000000 | parallel --pipe --round-robin -j10 'echo Start;cat'

Adjust --block to suit your needs.
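As a rough sketch of what adjusting --block can look like, the following passes an explicit block size and has each job count the lines it received. The 100k size is an arbitrary illustrative value, not a recommendation, and the guard simply skips the demo if GNU Parallel is not installed.

```shell
#!/bin/sh
# Sketch only: the 100k block size is an arbitrary example value.
total=""
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Each job gets whole lines in chunks of roughly 100k bytes and
    # prints how many lines it saw; awk sums the per-job counts.
    total=$(seq 1000000 |
        parallel --pipe --round-robin -j10 --block 100k 'wc -l' |
        awk '{s += $1} END {print s}')
    echo "lines seen across all jobs: $total"
else
    echo "GNU Parallel not found; skipping demo" >&2
fi
```

Because --pipe splits the input at line boundaries, no line is cut in half between jobs, so the per-job counts add up to the number of input lines.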

answered Feb 14 '14 at 09:11 by Ole Tange

I have installed GNU Parallel on my system and am running my script by typing "./script_name.sh | parallel --pipe --round-robin -j10 'cat' > output2". However, I do not see any output in my output2 file. – Ramesh Feb 14 '14 at 17:31
Do the examples above work? – Ole Tange Feb 15 '14 at 11:51
Yeah, it is working. – Ramesh Feb 16 '14 at 16:31
