
I want to compile as fast as possible. Go figure. And would like to automate the choice of the number following the -j option. How can I programmatically choose that value, e.g. in a shell script?

Is the output of nproc equivalent to the number of threads I have available to compile with?

make -j1
make -j16

tarabyte

6 Answers


nproc gives the number of CPU cores/threads available, e.g. 8 on a quad-core CPU supporting two-way SMT.

The number of jobs you can run in parallel with make using the -j option depends on a number of factors:

  • the amount of available memory
  • the amount of memory used by each make job
  • the extent to which make jobs are I/O- or CPU-bound

make -j$(nproc) is a decent place to start, but you can usually use higher values, as long as you don't exhaust your available memory and start thrashing.

For really fast builds, if you have enough memory, I recommend using a tmpfs, that way most jobs will be CPU-bound and make -j$(nproc) will work as fast as possible.
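As a sketch of the memory constraint described above, the job count can be capped by available memory before falling back to the thread count. The 1 GiB-per-job figure is an assumption you'd tune for your codebase:

```shell
# Hypothetical sketch: cap -j by available memory, assuming each
# compile job needs roughly 1 GiB (tune mem_per_job_gib for your code).
mem_per_job_gib=1
# MemAvailable in /proc/meminfo is reported in kB; convert to GiB.
avail_gib=$(awk '/^MemAvailable/ {print int($2 / 1048576)}' /proc/meminfo)
jobs=$(( avail_gib / mem_per_job_gib ))
cores=$(nproc)
[ "$jobs" -gt "$cores" ] && jobs=$cores   # no more jobs than threads
[ "$jobs" -lt 1 ] && jobs=1               # always run at least one job
printf 'make -j%s\n' "$jobs"              # i.e. run: make -j"$jobs"
```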

Stephen Kitt
  • and ccache for later rebuilds, but this is OT – solsTiCe Jun 09 '15 at 21:54
  • Would using something like GNU parallel be worthwhile here? – terdon Jun 09 '15 at 22:52
  • If I use a tmpfs, will I be limited to a directory size that is always smaller than my physical RAM size? – tarabyte Jun 10 '15 at 00:47
  • It's not a great answer, but in the strict spirit of the question of programmatically determining the fastest "j" value, you could loop j from 1 to some reasonable upper limit (2x nproc?) and wrap the make in a time call. Clean up the results, lather, rinse, repeat, and end up sorting the times/j values. – Jeff Schaller Jun 10 '15 at 01:09
  • @terdon No. Make is all about resolving dependencies, which means the jobs still have to be run in a certain order. GNU parallel doesn't care about that. On a side note, deciding which jobs are safe to run in parallel and which aren't is a hard problem. All make programs that offered parallel builds took years until they became somewhat usable. – lcd047 Jun 10 '15 at 05:18
  • @tarabyte Yes, which is why it's only viable if you have enough memory: you need enough space for the build on disk and for its memory use. I generally use build machines with 32GB of RAM... – Stephen Kitt Jun 10 '15 at 08:02

The most straightforward way is to use nproc like so:

make -j`nproc`

The nproc command prints the number of processing units available. Wrapping it in backticks performs command substitution: nproc runs first, and its output becomes the argument to -j. The equivalent modern syntax is make -j$(nproc).

You may have anecdotal experience where using core-count + 1 results in faster compile times. This has more to do with I/O delays and other resource constraints than with the core count itself.

To do this with nproc+1, try this:

make -j$((`nproc`+1))
101010

Unfortunately, even different portions of the same build may be optimal with conflicting j factor values, depending on what's being built and how, which system resources are the bottleneck at the time, what else is happening on the build machine, what's going on in the network (if using distributed build techniques), and the status/location/performance of the many caching systems involved in a build.

Compiling 100 tiny C files may be faster than compiling a single huge one, or vice versa. Building small but highly convoluted code can be slower than building huge amounts of straightforward/linear code.

Even the context of the build matters: using a j factor optimized for builds on dedicated servers fine-tuned for exclusive, non-overlapping builds may yield very disappointing results when used by developers building in parallel on the same shared server (each such build may take more time than all of them combined if serialized), or on servers with different hardware configurations, or virtualized ones.

There's also the aspect of the correctness of the build specification. Very complex builds may have race conditions causing intermittent build failures, with occurrence rates that can vary wildly as the j factor increases or decreases.

I can go on and on. The point is that you have to actually evaluate your build in your very context for which you want the j factor optimized. @Jeff Schaller's comment applies: iterate until you find your best fit. Personally I'd start from the nproc value, try upwards first and downwards only if the upwards attempts show immediate degradation.

Might be a good idea to first measure several identical builds in supposedly identical contexts just to get an idea of the variability of your measurements - if too high it could jeopardise your entire optimisation effort (a 20% variability would completely eclipse a 10% improvement/degradation reading in the j factor search).
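The iterative search described above can be sketched as a shell loop. benchmark_j is a name invented here for illustration, and it assumes a Makefile with a clean target in the current directory:

```shell
# Sketch: time a clean build at each -j value from 1 to 2x nproc,
# then print the fastest j value found.
benchmark_j() {
    max=$(( $(nproc) * 2 ))
    for j in $(seq 1 "$max"); do
        make clean >/dev/null 2>&1
        start=$(date +%s%N)              # GNU date, nanosecond resolution
        make -j"$j" >/dev/null 2>&1
        end=$(date +%s%N)
        # print "j <TAB> elapsed-milliseconds"
        printf '%s\t%s\n' "$j" $(( (end - start) / 1000000 ))
    done | sort -k2 -n | head -n 1 | cut -f1
}
```

Repeat the whole sweep a few times and compare runs: as noted above, run-to-run variability can easily swamp small differences between adjacent j values.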

Lastly, IMHO it's better to use an (adaptive) jobserver if supported and available instead of a fixed j factor - it consistently provides a better build performance across wider ranges of contexts.

  • well put regarding the dependencies of the underlying build. can you comment on passing no fixed number with the -j parameter? e.g. make -j – tarabyte Jul 03 '15 at 04:51
  • make -j will spawn as many jobs as the dependencies allow, like a fork bomb (http://superuser.com/questions/927836/how-to-deal-with-a-memory-leaking-fork-bomb-on-linux/927967#927967); the build will crawl at best, spending more CPU on managing the processes than running them (http://superuser.com/questions/934685/how-to-find-what-is-pegging-cpu-in-linux-kernel/934839?noredirect=1#comment1269087_934839), and in highly parallel builds the system will run out of memory/swap or pid #s and the build will fail. – Dan Cornilescu Jul 03 '15 at 11:06

If what you need is the number of processor cores minus 1, and the lscpu command is not available, you can use this:

make -j$(grep processor /proc/cpuinfo | tail -n 1 | awk '{print $3}')
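If nproc is available, the same cores-minus-one value can be computed directly; the guard against single-core machines is an addition here, not part of the original answer:

```shell
jobs=$(( $(nproc) - 1 ))
[ "$jobs" -lt 1 ] && jobs=1     # -j must be a positive integer, so floor at 1
printf 'make -j%s\n' "$jobs"    # i.e. run: make -j"$jobs"
```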
bo0k
  • Welcome to the site and thank you for your contribution. You may want to edit your post to explain why you assume that the OP is looking for the number of CPUs -1 (or why that is more straightforward to obtain), and what to do if the actual number of CPUs is wanted instead. – AdminBee Feb 17 '21 at 10:44

If you'd like your make command to use as many parallel workers as you have virtual CPUs, I suggest using:

nproc | xargs -I % make -j%

This can be used either as a standalone command or as a RUN directive within a Dockerfile (as Docker doesn't support nested commands).
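To see the command that xargs constructs, substitute echo for make:

```shell
# Print the command xargs would run instead of executing make
nproc | xargs -I % echo make -j%
```

This prints make -j followed by your CPU count, confirming the substitution before you run the real build.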

lscpu | grep "^CPU(" | awk '{print $2}'

This gets you the number of your CPU cores.

Lynne
  • nproc is simpler and more common (on Linux systems). – Stephen Kitt Dec 11 '19 at 17:09
  • nproc does not return the number of CPUs. Per the man page: "Print the number of processing units available to the current process, which may be less than the number of online processors" – gerardw Jun 29 '22 at 12:38
  • @gerardw fair enough, but the value returned by nproc is more useful than the actual number of CPUs — if the current process (and presumably its children) has access to fewer CPUs than are installed in the system, it only cares about the CPUs it actually has access to. – Stephen Kitt Aug 28 '23 at 13:32