Reset running process when a certain amount of memory is consumed

Question

I'm running Hydra on a Raspberry Pi. The program has several problems, but the main one is a hidden memory leak. The source is pretty big and I really can't find the cause. Unfortunately, upon reaching the memory limit the program doesn't crash. Instead it prints a bunch of error messages, and by a bunch I mean hundreds.

So I thought that if I can't free the memory within the program, I might need to restart the whole process. I need to:

Guard the process resource usage
Stop the process gracefully (similar to Ctrl+C, the program says "received signal 2" then)
Start the process again

I must do this until I fix the program so that it dies on errors, or doesn't produce them in the first place.

If you know Hydra and you're curious about the errors, I've found at least this much in the code:

[ERROR] Fork for children failed: Cannot allocate memory
[ERROR] socketpair creation failed: Too many open files

The second part of each error message comes from the C library function perror, which describes the most recent error.

slm · Accepted Answer · 2014-01-16T04:48:01.043

#1 - God

Here's an idea using the process monitoring framework God. It is written in Ruby but can be used to watch other processes and guard against them doing things such as dying or, in your case, using up too much RAM.

Ruby setup

This assumes you have Ruby installed. If you don't, you can use rvm (Ruby Version Manager) to install it, but it will need to be installed and/or run as root, which is a requirement of God. You could also just install Ruby from your distro's repositories if it's available.

God setup

With a working Ruby installation you install the God gem like so:

$ [sudo] gem install god

Example

You can then use this simple God config to do what you want.

# /path/to/simple.god
God.watch do |w|
  w.name = "hydra"
  w.start = "<command to run hydra>"
  w.keepalive(:memory_max => 150.megabytes,
              :cpu_max => 50.percent)
end

Then invoke it like this:

$ god -c /path/to/simple.god -D

Now if Hydra exceeds either the CPU utilization limit or the memory limit, God will restart it. NOTE: By default these properties will be checked every 30 seconds and will be acted upon if there is an overage for three out of any five checks.
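To make the "three out of any five checks" rule concrete, here is a minimal sketch in plain Ruby. It is only an illustration of the idea, not God's actual code: the `OverageWindow` class and the `rss_kb` helper are hypothetical names, and reading `VmRSS` from `/proc/<pid>/status` assumes Linux.

```ruby
# Sliding-window check: report an overage when at least `needed` of the
# last `window` samples exceed the limit. Hypothetical helper, not God's code.
class OverageWindow
  def initialize(limit_kb, needed: 3, window: 5)
    @limit_kb = limit_kb
    @needed = needed
    @window = window
    @samples = []
  end

  # Record one memory sample (in kB) and return true if a restart is due.
  def record(rss_kb)
    @samples << (rss_kb > @limit_kb)
    @samples.shift while @samples.size > @window
    @samples.count(true) >= @needed
  end
end

# Extract the resident set size in kB from the text of /proc/<pid>/status.
# Returns nil when the VmRSS line is missing.
def rss_kb(status_text)
  match = status_text.match(/^VmRSS:\s+(\d+)\s+kB/)
  match ? match[1].to_i : nil
end

# Example: a 150 MB limit and five made-up readings, one per check.
checker = OverageWindow.new(150 * 1024)
readings = [100_000, 160_000, 170_000, 120_000, 180_000]
p readings.map { |kb| checker.record(kb) }
# prints [false, false, false, false, true]
```

A single spike above the limit does not trigger anything here. Only the third overage within the last five checks does, which is why a short burst of memory use would not cause a restart.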

Going further

Take a look at the documentation on God's website. The above example is from there and they do a much more thorough job of covering the details.
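Since the question asks for a stop that behaves like Ctrl+C, one possible variant of the config is sketched below. It assumes that God's `stop_signal` and `stop_timeout` settings apply because no custom `w.stop` command is defined, and the 20-second timeout is an arbitrary value chosen for illustration. Check both against the God documentation before relying on them.

```ruby
# /path/to/simple.god (variant; the signal and timeout values are assumptions)
God.watch do |w|
  w.name = "hydra"
  w.start = "<command to run hydra>"
  # Send SIGINT, the signal Ctrl+C sends, instead of the default stop signal.
  w.stop_signal = "INT"
  # Assumed grace period before God gives up waiting and kills the process.
  w.stop_timeout = 20.seconds
  w.keepalive(:memory_max => 150.megabytes,
              :cpu_max => 50.percent)
end
```

With this in place Hydra should report "received signal 2" when God restarts it, the same as when it is interrupted by hand.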

#2 - Process Resource Monitor

Another alternative is Process Resource Monitor. The feature list shows that it can monitor per process resources.

per-process/per-user rule based resource limits

excerpt of description

Process Resource Monitor (PRM) is a CPU, Memory, Processes & Run (Elapsed) Time resource monitor for Linux & BSD. The flexibility of PRM is achieved through global scoped resource limits or rule-based per-process / per-user limits. A great deal of control of PRM is maintained through a number of ignore options, ability to configure soft/hard kill triggers, wait/recheck timings and to send kill signals to parent/children process trees. Additionally, the status output is very verbose by default with support for sending log data through syslog.

Example

To monitor Hydra we could create a rule file like this, /usr/local/prm/rules/hydra.cmd:

IGNORE=""
MAX_CPU="50"
MAX_MEM="150"
MAX_PROC="0"
# we don't care about the process run time, set value 0 to disable check
MAX_ETIME="0"
KILL_TRIG="3"
# we want to set a bit longer soft rechecks as sometimes the problem fixes
# itself
KILL_WAIT="20"
KILL_PARENT="1"
KILL_SIG="9"
KILL_RESTART_CMD="/etc/init.d/hydra restart"

prm runs from cron, via /etc/cron.d/prm, at 5-minute intervals. According to the docs this should probably be left alone.

I like this, I like this very much. – Tomáš Zato, Jan 16 '14 at 10:35
