9

Having a system for collecting performance statistics can be extremely useful. In the past, I've used Munin for this, and it has been invaluable in analyzing bottlenecks and various other issues. I was recently made aware of collectd, which seems very similar to Munin.

What monitoring applications are available and should be considered (other than Munin and collectd), and how do you choose which one to use?

Vetle
  • 193
  • 3
    Maybe you want to split this question into two, one more generic and one more specific: "How do you choose which monitoring application to use?" vs. "What are the differences between Munin and collectd?" – tshepang Jan 13 '11 at 21:17

3 Answers

6

Munin is a data collection and visualization (graphing) tool. It is easy to set up and easy to use, but it uses a lot of resources and does not scale well. The default collection interval is 5 minutes, and it is not easy to change: a shorter interval can overload your machine, and some plugins have problems with it anyway. Plugins are executed (forked) every time data is collected, which is expensive. It also uses a networked setup: you have to set up both a server and a node, even if you only use one machine.
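To make the per-run cost concrete, here is a minimal sketch of what a Munin-style plugin looks like. It assumes the usual plugin convention: the node runs the executable with the argument `config` to get graph metadata, and with no argument to get the current value. The graph title, the field name `load`, and the helper function names are made up for illustration; this is not one of Munin's shipped plugins.

```python
import os
import sys


def config_lines():
    """Graph metadata, printed when the plugin is run with the 'config' argument."""
    return [
        "graph_title Load average (sketch)",  # hypothetical title
        "graph_vlabel load",
        "load.label load",                    # 'load' is an illustrative field name
    ]


def value_lines():
    """Current sample, printed on a normal run with no argument."""
    one_minute, _, _ = os.getloadavg()  # Unix-only call
    return [f"load.value {one_minute:.2f}"]


def main(argv):
    # Every call is a fresh process: the interpreter starts, imports run,
    # one value is printed, and the process exits again.
    lines = config_lines() if argv[1:2] == ["config"] else value_lines()
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

Since the whole script starts from scratch for each poll, the startup work is repeated for every plugin on every collection cycle, which is where the overhead comes from.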

Collectd is a data collector only. You can choose third-party solutions to graph the collected data, but graphing does not work out of the box. It has many plugins, mostly written as C modules that are loaded once when the daemon starts. You can change the collection interval and get fine-grained statistics. It can collect data locally or over the network.
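The difference between the two collection models can be sketched roughly as follows. This is only an illustration, not how either tool is implemented: `sample_by_fork` starts a new process per sample (the Munin model), while `sample_in_process` calls code that is already loaded (the collectd model). All function names are hypothetical, and using a Python child process exaggerates the startup cost compared to a small shell or C plugin.

```python
import os
import subprocess
import sys
import time

# Code run by the child process; it reads the 1-minute load average (Unix-only).
CHILD_CODE = "import os; print(os.getloadavg()[0])"


def sample_by_fork():
    """Munin-style: start a new process for every sample."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout)


def sample_in_process():
    """collectd-style: the collector is already loaded, so just call it."""
    return os.getloadavg()[0]


def time_samples(sampler, count):
    """Return the total seconds spent taking `count` samples."""
    start = time.perf_counter()
    for _ in range(count):
        sampler()
    return time.perf_counter() - start


if __name__ == "__main__":
    count = 20
    print(f"fork per sample: {time_samples(sample_by_fork, count):.4f}s")
    print(f"in-process:      {time_samples(sample_in_process, count):.4f}s")
```

On most machines the fork-per-sample variant should come out much slower, which is why a short collection interval is cheap with a long-running daemon and costly with per-run plugins.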

ttyS0
  • 263
5

My favorite performance monitoring and analysis tool is SGI's open source Performance Co-Pilot (PCP). For a single system, it might be overkill - for a set of enterprise systems, it's fantastic. PCP offers historical data, a networked configuration, and an alarm system which is incomparable in open source (or just about anywhere else).

Flux
  • 2,938
Mei
  • 1,136
  • +1 from me for mentioning PCP. However, I strongly disagree with the 'overkill' part, because PCP has a minimal footprint on the monitored system by design. For example, I monitor Linux servers with Prometheus via the PCP WebAPI, and it takes ca. 100 nanoseconds to scrape 3-4k data samples. – mac13k Jul 02 '20 at 11:18
4

My opinion is that Zabbix rules. It's powerful and user-friendly. You don't need any plugins or third-party tools to make it work.

You can find lots of other suggestions in this post at Server Fault.

  • Would my question be more appropriate for Server Fault? I.e., should it be moved or marked off topic? – Vetle Jan 14 '11 at 07:46
  • Well, it depends. If you intend to monitor an OS other than Linux/Unix, yes. If you have specific doubts about *nix, then this should be the place. But at least search Server Fault. Since it's been running for some time, you'll find lots of good stuff over there. – Bob Rivers Jan 14 '11 at 16:40