
For a study I am doing, I was asked how many developers actually commit to a typical kernel version.

I know Linus Torvalds is the main developer, followed by many secondary main developers, several of whom work for companies. But here are my doubts:

  1. Does a commit to the kernel from a company mean that many devs from that company worked to make that commit possible, or was it just one person (the one who made the commit)?

  2. Are there 3rd party groups that help companies or main devs?

  3. What would be an estimate of the total number of people involved in a particular version of the kernel? Not only the number of companies, but the actual number of people who contributed one way or another to the kernel.

dr_
Luis Alvarado

2 Answers


It could be interesting to clone the Linux git repository and query it directly.

Cloning the repo

Beware, it's a large download (~1.5G)!

Install git and run the following (in a new directory):

git clone http://github.com/torvalds/linux.git

Querying the repo

Once you've cloned it, you can analyze the commit history with git log. Since the log is very long, you may want to limit your search to a range of revisions:

git log <since>..<to>

for instance

git log v3.4..v3.5
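Before looking at authors, it can help to see how many commits such a range contains, and how many of them are merges. A minimal sketch, assuming it is run inside the clone; the function name commit_counts is made up for illustration:

```shell
# Print the number of commits in a revision range, with and without merges.
# Usage: commit_counts <range>
commit_counts() {
    echo "all: $(git rev-list --count "$1")"
    echo "non-merge: $(git rev-list --count --no-merges "$1")"
}

# Example (assumes the v3.4 and v3.5 tags exist in the clone):
#   commit_counts v3.4..v3.5
```

The difference between the two numbers is the number of merge commits, which mostly come from maintainers pulling in other people's work.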

The log contains a lot of information you could use. For example, the following command prints the 20 most prolific commit authors, along with their number of commits and their email address.

$ git log v3.4..v3.5 | grep Author | cut -d ":" -f 2 | sort | uniq -c | sort -nr | head -n 20
417  Linus Torvalds <torvalds@linux-foundation.org>
257  Greg Kroah-Hartman <gregkh@linuxfoundation.org>
196  Mark Brown <broonie@opensource.wolfsonmicro.com>
191  Axel Lin <axel.lin@gmail.com>
172  David S. Miller <davem@davemloft.net>
138  Daniel Vetter <daniel.vetter@ffwll.ch>
132  H Hartley Sweeten <hartleys@visionengravers.com>
128  Al Viro <viro@zeniv.linux.org.uk>
117  Stephen Warren <swarren@nvidia.com>
113  Tejun Heo <tj@kernel.org>
111  Paul Mundt <lethal@linux-sh.org>
104  Johannes Berg <johannes.berg@intel.com>
103  Shawn Guo <shawn.guo@linaro.org>
101  Arnd Bergmann <arnd@arndb.de>
100  Thomas Gleixner <tglx@linutronix.de>
 96  Eric Dumazet <edumazet@google.com>
 94  Hans Verkuil <hans.verkuil@cisco.com>
 86  Chris Wilson <chris@chris-wilson.co.uk>
 85  Sam Ravnborg <sam@ravnborg.org>
 85  Dan Carpenter <dan.carpenter@oracle.com>

The email address can give you an idea about the employers of the developers (google.com, cisco.com, oracle.com).
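The grep pipeline above counts merge commits too, and it matches any log line that contains the word "Author". A possible alternative, sketched here with git shortlog and git log format strings, counts distinct authors of non-merge commits and tallies commits per email domain. The function names are made up for illustration, and someone who used several addresses may be counted more than once unless the repository's .mailmap covers them:

```shell
# Count distinct authors (name + email) of non-merge commits in a range.
# Usage: count_authors <range>
count_authors() {
    git shortlog -sne --no-merges "$1" | wc -l | tr -d ' '
}

# Tally non-merge commits per author email domain, most active first.
# Usage: top_domains <range> [how-many, default 20]
top_domains() {
    git log --no-merges --format='%ae' "$1" |
        sed 's/.*@//' | sort | uniq -c | sort -nr | head -n "${2:-20}"
}

# Example (assumes the v3.4 and v3.5 tags exist in the clone):
#   count_authors v3.4..v3.5
#   top_domains v3.4..v3.5 10
```

Keep in mind that a domain is only a rough hint at an employer: many developers commit from personal or kernel.org addresses.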

rahmu
  • It's also possible to query the code history without having to clone it locally, via the GitHub API, e.g. here's the list of contributors: https://api.github.com/repos/torvalds/linux/contributors (appears to be sorted by number of commits) – matt wilkie Apr 26 '13 at 15:59
  • according to this the command above only lists the current branch and one needs to use git log --all ... for a more comprehensive list – matt wilkie Apr 26 '13 at 16:01
  • Also note that this will list the contributors to the mainline kernel. There are forks/branches of the kernel that have been distributed that were built with patches from other developers, for example the kernel used in Android. – Peter L. Sep 12 '14 at 21:24
  • You probably want to count the number of non merge commits and you'll see that Linus is doing much less work than that. – Alexandre Belloni Oct 12 '15 at 08:49

Go to Kernel coverage at LWN.net and do a search for "Releases", and "Contributor statistics". Also do a search for "Who". There are various articles in that index with titles like (most recently) Who wrote 3.5.

While these articles may not directly answer your question, they are as detailed an answer as you are likely to find on the net without trying to collect information first hand. In particular, they should provide at least a partial answer to question 3.

The statistics gathering is done by gitdm (LWN article announcing it: gitdm v0.10 available). Thanks to vonbrand for pointing this out. The repository can currently (January 2015) be obtained with

git clone git://git.lwn.net/gitdm.git

As for 1 and 2, they are not so well defined. In the case of 1, I imagine the answer is almost certainly yes, some of the time. But it is not clear what you are looking for: anecdotal evidence, or some statistics. If statistics, in what form?

In the case of 2, it is not clear what you mean by "3rd party groups", and what kind of help you are referring to. Would people on an IRC channel count as a third party group, for example? Or are you talking about a more formal contractual relationship where money changes hands, like an outside company retained for temporary consulting?

In any case, such information would be hard to get without talking to kernel developers directly, and even then would likely be anecdotal. I suppose forums like the Linux kernel mailing list would be a possibility in that case.
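One rough, first-hand proxy for "people who helped with a commit" is the set of trailers that kernel commit messages conventionally carry, such as Signed-off-by, Reviewed-by, Tested-by and Reported-by. The sketch below lists the distinct people named in such trailers for a revision range. The function name is made up for illustration, the list of trailers is an assumption and not exhaustive, and help that was never recorded in a trailer stays invisible:

```shell
# List distinct "Name <email>" entries found in common commit trailers.
# Usage: people_in_trailers <range>   (run inside a clone of the kernel repo)
people_in_trailers() {
    git log --no-merges --format='%b' "$1" |
        grep -iE '^(Signed-off-by|Acked-by|Reviewed-by|Tested-by|Reported-by|Suggested-by|Co-developed-by):' |
        sed 's/^[^:]*:[[:space:]]*//' |
        sort -u
}

# Example (assumes the v3.4 and v3.5 tags exist in the clone):
#   people_in_trailers v3.4..v3.5 | wc -l
```

Comparing this count with the number of distinct authors gives a feel for how many people were involved without authoring a commit themselves.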

Faheem Mitha
  • Thank you. In the 3rd case I am looking for people that also helped somehow in making a commit. For example, a group of people that helped somebody in a company. This person in the company worked with a group which at the end made a commit to the kernel. – Luis Alvarado Aug 24 '12 at 16:11
  • The statistics gathering is done by gitdm (LWN article announcing it: http://lwn.net/Articles/290957, latest commit is from April 2012) – vonbrand Jan 16 '13 at 14:56