As of November 2010, Linux is used on 459 out of the 500 supercomputers of the TOP500. Refer to the table via Internet Archive.
What are the reasons behind this massive use of Linux in the supercomputer space?
I work in the HPC industry.
If you're asking why most people today use Linux on their cluster, it's what you listed in your question: more than 90% of the biggest clusters run Linux. It's the de facto standard: almost any cluster library, tool or application is ready to run on Linux. It is more work to set up a cluster using any other operating system.
If you're asking how Linux became the de facto standard, then Caleb has the answers ;)
For almost any question of the form: "Why is x the predominant choice in the y market segment?" the answers cluster around two factors.
First, at some critical juncture during the emergence and growth of that market segment or niche, the product in question had some advantages in cost and features which encouraged its adoption by a critical mass. Second, once that critical mass has been achieved, all of the ancillary products for that segment support it, and all of the key personnel in that industry or niche are familiar with it as the premier choice.
At some point back in the '90s, Donald Becker released some code and information regarding the Beowulf cluster that he and Thomas Sterling had built for a project at NASA. This used commodity hardware running Linux and incorporated the MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) libraries for distributing computational tasks across a network of nodes.
At the time, the alternatives required much more expensive hardware (mostly Sun workstations), had proprietary software licensing with per-node or per-CPU costs, and typically were closed source or had significant closed-source components.
Thus Linux had advantages in all three of these factors: hardware cost, licensing cost and source availability. That Becker released some code and documentation (and did so under a cool name) gave Linux a tremendous boost in credibility for that sort of supercomputing application. (That it was used by a project at NASA was also a huge boost to its credibility.)
From there, colleges and universities picked up the approach for their own labs. Within a couple of years, an entire generation of scientists was familiar with Beowulf clusters, and a wide array of tools was readily available to support many applications across them.
One more reason. In the old days, for serious work there was no Linux and no Windows, only UNIX and VMS (MS-DOS and similar were not contenders, as they lacked too many features), and maybe a few lesser-known things like Lisp machines...
Of those, only UNIX-derived platforms survived. And Linux was a cheap alternative to the other UNIX-like OSes: more-or-less compatible, open source and free. This made it possible to reuse scientific software that had been written before Linux existed.