I have a file with 2,000,000 lines, and I run some commands for each line. I was trying to achieve some parallelism using GNU parallel and swift, as discussed here. However, I got an interesting idea from one of my friends.
He suggested spawning multiple processes on the server, since the server is pretty powerful. My idea is to give each line of the file an index and assign it to a process based on line_number mod number_of_processes.
For example, with 10 processes, lines 1, 11 and 21 would go to the first process, lines 2, 12 and 22 to the second process, and so on.
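To make the idea concrete, here is a minimal sketch of that round-robin split, assuming 10 workers and a small generated file standing in for the real one. The names input.txt, chunk.N, out.N and process_line are placeholders I made up for illustration; the real per-line command would go inside process_line.

```shell
#!/bin/sh
# Sketch only: split lines round-robin into buckets, then run one
# background worker per bucket. All file and function names are hypothetical.

workdir=$(mktemp -d)
input="$workdir/input.txt"
nproc=10

# Stand-in for the real 2,000,000-line file: 25 lines containing 1..25.
awk 'BEGIN { for (i = 1; i <= 25; i++) print i }' > "$input"

# Line k goes to bucket (k - 1) mod nproc, so lines 1, 11 and 21
# all land in chunk.0, lines 2, 12 and 22 in chunk.1, and so on.
awk -v n="$nproc" -v dir="$workdir" '{
    out = dir "/chunk." ((NR - 1) % n)
    print > out
}' "$input"

# Placeholder for whatever command has to run for each line.
process_line() {
    printf 'processed %s\n' "$1"
}

# Start one worker per bucket; the trailing & runs each subshell in the background.
i=0
while [ "$i" -lt "$nproc" ]; do
    (
        while IFS= read -r line; do
            process_line "$line"
        done < "$workdir/chunk.$i" > "$workdir/out.$i"
    ) &
    i=$((i + 1))
done

# Block until every background worker has exited.
wait
```

Each worker writes to its own out.N file, so the workers never interleave output with each other. Writing the buckets to disk first is just one way to do it; it keeps the sketch simple at the cost of one extra pass over the file.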
To achieve this, I have been reading about background processes in shell scripting. Most tutorials and links append an & to the command and say that this makes the shell spawn a background process. I am finding this concept a little difficult to understand.
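As far as I can tell, the behaviour the tutorials describe comes down to this small sketch: a command followed by & is started in a child process, the shell moves on to the next command without waiting, $! holds the child's PID, and wait blocks until it finishes. The slow_task function and the log file are made up for the demonstration.

```shell
#!/bin/sh
# Sketch only: shows the ordering effect of a trailing &.
# slow_task and the log file are hypothetical.

log=$(mktemp)

slow_task() {
    sleep 1
    echo "task finished" >> "$log"
}

slow_task &                        # starts in the background; the shell does not wait
pid=$!                             # PID of the most recently started background job
echo "shell continued" >> "$log"   # runs right away, while slow_task is still sleeping

wait "$pid"                        # block here until the background job exits
status=$?                          # exit status of slow_task

cat "$log"
```

The log ends up with "shell continued" before "task finished", even though slow_task was started first. Without the &, the order would be reversed, because the shell would wait for slow_task to return before running the next line.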