[monit] sending a SIGKILL from monit for handling stale mongrel pids
From: Michael Steinfeld
Subject: [monit] sending a SIGKILL from monit for handling stale mongrel pids
Date: Tue, 16 Oct 2007 13:28:48 -0400
So maybe I am a complete idiot... but here is what I have been pondering.
Every once in a while monit will attempt to restart the mongrels when
they meet the specified criteria: CPU too high for too long, too
much RAM, etc.
What happens is that monit attempts to restart the mongrels, but the
processes do not die. Even if I run "monit -g group stop all" and wait,
they don't die. Even attempting to stop a single process by itself doesn't
work, so I have to send a SIGKILL.
(I have not been able to figure out what is causing this.)
So I was thinking of having monit send a SIGKILL if 5 cycles don't
solve the issue.
#my monit service for mongrels
check process mongrel_9006
  with pidfile /var/run/mongrel_cluster/mongrel.9006.pid
  start program = "/usr/local/bin/mongrel_rails cluster::start -C /etc/mongrel_cluster/mongrel_cluster.yml --clean --only 9006"
  stop program = "/usr/local/bin/mongrel_rails cluster::stop -C /etc/mongrel_cluster/mongrel_cluster.yml --clean --only 9006"
  if totalmem is greater than 110.0 MB for 3 cycles then restart   # eating up memory?
  if loadavg(5min) greater than 10 for 8 cycles then restart       # bad, bad, bad
  if cpu is greater than 50% for 2 cycles then alert               # send an email to admin
  if cpu is greater than 80% for 3 cycles then restart
  if 10 restarts within 10 cycles then timeout
Instead of ..
<snip>
if cpu is greater than 50% for 2 cycles then
alert # send an email to admin
if cpu is greater than 80% for 3 cycles then
</snip>
do this ...
<snip>
if cpu is greater than 50% for 2 cycles then
alert # complain about it
if cpu is greater than 80% for 5 cycles then
sigkill
sleep 5 # enough time to kill all 8 mongrel pids
start_fresh
</snip>
#so it would look like this... you get the idea.
#my monit service for mongrels
check process mongrel_9006
  with pidfile /var/run/mongrel_cluster/mongrel.9006.pid
  start program = "/usr/local/bin/mongrel_rails cluster::start -C /etc/mongrel_cluster/mongrel_cluster.yml --clean --only 9006"
  stop program = "/usr/local/bin/mongrel_rails cluster::stop -C /etc/mongrel_cluster/mongrel_cluster.yml --clean --only 9006"
  kill_the_bastard = "kill -9 <pid>"   # hmpf...
  if totalmem is greater than 110.0 MB for 3 cycles then restart   # eating up memory?
  if loadavg(5min) greater than 10 for 8 cycles then restart       # bad, bad, bad
  if cpu is greater than 50% for 2 cycles then alert               # complain about it
  if cpu is greater than 80% for 5 cycles then kill_the_bastard
  # I am assuming that if it is killed, then monit will start it
  if 10 restarts within 10 cycles then timeout
So, question: does 'timeout' actually send a SIGTERM/SIGHUP to the
process, or does it just execute the stop command for that particular
service?
How are you guys handling stale pids with monit, in the case that
executing stop/restart doesn't work?
Is what I am suggesting even possible?
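To make the idea concrete without inventing new monit syntax, here is a
rough sketch of a wrapper that could be used as the "stop program": it
sends SIGTERM to the pid in the pidfile, waits a few seconds, and only
then falls back to SIGKILL. The function name stop_with_escalation, the
default grace period of 5 seconds, and the idea of saving it as a separate
script are all assumptions for illustration; none of this is built into
monit or mongrel_rails.

```shell
#!/bin/sh
# Hypothetical helper, not part of monit or mongrel.
# Usage: stop_with_escalation PIDFILE [GRACE_SECONDS]
stop_with_escalation() {
    pidfile=$1
    grace=${2:-5}   # assumed default: wait 5 seconds before SIGKILL

    # Nothing to do if the pidfile is missing or empty.
    [ -s "$pidfile" ] || return 0
    pid=$(cat "$pidfile")

    # kill -0 sends no signal; it only tests whether the process exists.
    if ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$pidfile"    # stale pidfile, process already gone
        return 0
    fi

    # Ask nicely first.
    kill -TERM "$pid" 2>/dev/null

    # Poll once per second until the process exits or the grace period ends.
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || break
        sleep 1
        waited=$((waited + 1))
    done

    # Escalate only if the graceful stop did not work.
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
    fi

    rm -f "$pidfile"
    return 0
}

# When saved as a standalone script, finish with:
#   stop_with_escalation "$@"
```

The stop line in the service entry would then point at wherever the script
is saved, passing /var/run/mongrel_cluster/mongrel.9006.pid as the first
argument. Note that this bypasses "mongrel_rails cluster::stop" entirely,
so it is only a fallback sketch and not a drop-in replacement.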
--
Michael Steinfeld
Linux Admin/Developer
GTALK: address@hidden