
What exactly does the --jobs option of GNU parallel specify?

I execute:

parallel --jobs 10 ./program ::: {1..100}

where program is an intensive task and the jobs are completely independent of each other. {1..100} stands in for the inputs to the tasks. When I inspect the processes running on the PC, I often find fewer than 10 jobs running simultaneously.

So what exactly is --jobs specifying?

a06e
  • If you can reproduce this, then it is a bug. You should see 9 jobs running only at the moment when the next job is being started. Other than that, 10 jobs should be running. – Ole Tange Mar 22 '16 at 20:57

1 Answer


As per the man page, --jobs is the maximum number of jobs that will run in parallel on each machine (emphasis mine):

--jobs N

Number of jobslots on each machine. Run up to N jobs in parallel. 0 means as many as possible.

It does not mean that exactly N jobs will be running at all times. The first and foremost requirement for parallel computing is that the jobs can run independently and that their outputs can be combined to give the same result as running the jobs sequentially. If this is not possible, the task cannot be done in parallel.

Also, from the GNU parallel man page:

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.

Now, if the input has only 2 lines but you pass --jobs 10, parallel cannot run 10 jobs for 2 lines, since the smallest unit of input it takes is a line. So you will see only 2 jobs.

This is not just the case with GNU parallel, but pretty much any parallel computation engine.
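To check this on your own machine, here is a minimal sketch that records which job slots are actually used. It assumes GNU parallel is installed (not the moreutils tool of the same name) and relies on the {%} replacement string, which expands to the job slot number. The helper name max_slot and the 1-second sleep standing in for ./program are my own placeholders, not anything from the man page.

```shell
#!/bin/sh
# Sketch: observe how many job slots GNU parallel actually uses.
# Assumption: GNU parallel is on PATH; "sleep 1" is a stand-in for ./program.

# max_slot JOBS INPUT...   (hypothetical helper, not part of GNU parallel)
# Runs one dummy job per input and prints the highest slot number seen.
max_slot() {
    jobs_limit=$1
    shift
    # {%} expands to the job slot number, {} to the input value.
    parallel --jobs "$jobs_limit" 'sleep 1; echo "{%} {}"' ::: "$@" |
        awk '$1 > max { max = $1 } END { print max + 0 }'
}

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    have_gnu_parallel=yes
    few=$(max_slot 10 a b)             # 2 inputs, 10 slots requested
    many=$(max_slot 10 $(seq 1 20))    # 20 inputs, 10 slots requested
    echo "2 inputs:  highest slot used = $few"
    echo "20 inputs: highest slot used = $many"
else
    have_gnu_parallel=no
    echo "GNU parallel not found; skipping" >&2
fi
```

With only two inputs the highest slot should never exceed 2, whatever --jobs says. With 20 inputs I would expect it to reach 10 on an otherwise idle machine, but never to go above it.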

terdon
Munir
  • I made some edits to the question, to clarify that I have more than 10 jobs, and the jobs are all independent. – a06e Mar 22 '16 at 17:52
  • Hence I think there must be another limitation playing a role here... – a06e Mar 22 '16 at 20:03
  • From the man parallel_design page: "GNU parallel busy waits. This is because the reason why a job is not started may be due to load average, and thus it will not make sense to wait for a job to finish. Instead the load average must be checked again. Load average is not the only reason: --timeout has a similar problem. To not burn up too much CPU GNU parallel sleeps exponentially longer and longer if nothing happens, maxing out at 1 second." You should probably read the entire man parallel_design page to understand other such quirks. – Munir Mar 22 '16 at 20:29
  • So high load can be a reason. That makes sense: the workstation where I am seeing this behavior is usually under high load. – a06e Mar 22 '16 at 21:10
  • High load can only be the reason if you use --load. – Ole Tange Mar 23 '16 at 08:07