[PATCH] Running an external script with EXEC on a timeout
From: Patrick Burns
Subject: [PATCH] Running an external script with EXEC on a timeout
Date: Sat, 26 Feb 2005 16:35:22 +1100
I need this feature badly enough that I'm trying to add it myself. I
intend to use monit to look after the nodes in my Heartbeat clusters.
If something on a node fails (e.g. Apache goes down), monit can try to
restart it. However, if several restarts are unsuccessful, it would
be good to have the node gracefully leave the cluster and initiate a
fail-over.
I've got this in /etc/monitrc:
---
set daemon 10
set alert address@hidden
check process foo with pidfile /tmp/foo
if 3 restarts within 5 cycles then exec /tmp/bar
---
/tmp/bar just contains:
---
#!/bin/bash
echo Hello World
---
Output looks like this (edited for brevity):
---
mail:~# monit -I -v -c /etc/monitrc
Runtime constants:
(removed)
The service list contains the following entries:
Process Name = foo
Group = (not defined)
Pid file = /tmp/foo
Monitoring mode = active
Timeout = If 3 restart within 5 cycles then exec else if
recovered then alert
-------------------------------------------------------------------------------
Starting monit daemon
'foo' process is not running
Does not exist notification is sent to address@hidden
monit: Start or stop method not defined -- process foo
'foo' process is not running
monit: Start or stop method not defined -- process foo
'foo' process is not running
monit: Start or stop method not defined -- process foo
'foo' service timed out and will not be checked anymore
Timeout notification is sent to address@hidden
Monitoring disabled -- service foo
Hello World
^C
monit daemon with pid [3155] killed
You have new mail in /var/mail/patrickb
---
You can see the exec worked, as it printed "Hello World" to the console
after the service timed out.
If I can exec an arbitrary command after a timeout, there's no reason
why I can't put in "/etc/init.d/heartbeat stop" to make the node
give up its resources if a service looks terminally broken. (Assuming
the error hasn't propagated to the other node in the cluster as well...)
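As a rough sketch (not part of the attached patch), the script handed to
exec could look something like the following. The leave_cluster function
name, the FAILOVER_LOG path and the two overridable variables are made up
for illustration; only the "/etc/init.d/heartbeat stop" command comes from
the setup described above.

```shell
#!/bin/bash
# Hypothetical fail-over helper to be run by monit's exec action.
# Both variables can be overridden from the environment; the log
# path is an assumption, not something monit or Heartbeat defines.
HEARTBEAT_INIT="${HEARTBEAT_INIT:-/etc/init.d/heartbeat}"
FAILOVER_LOG="${FAILOVER_LOG:-/tmp/failover.log}"

# Record why the node is leaving the cluster, then stop Heartbeat.
# Returns non-zero if the init script is missing or not executable.
leave_cluster() {
    echo "$(date) monit timeout: stopping heartbeat" >> "$FAILOVER_LOG"
    if [ -x "$HEARTBEAT_INIT" ]; then
        "$HEARTBEAT_INIT" stop
    else
        echo "$(date) $HEARTBEAT_INIT not found or not executable" >> "$FAILOVER_LOG"
        return 1
    fi
}

# Report a failure on stderr instead of aborting silently.
leave_cluster || echo "fail-over helper: could not stop heartbeat" >&2
```

Pointing HEARTBEAT_INIT at a harmless stub first makes it possible to
watch the timeout path fire without actually taking the node out of the
cluster.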
Patch attached...
--
Patrick Burns
address@hidden
timeout.patch
Description: Binary data