Re: Monitor long running processes and kill them
From: Martin Pala
Subject: Re: Monitor long running processes and kill them
Date: Mon, 26 Mar 2012 22:14:24 +0200

On Mar 26, 2012, at 9:16 PM, Tom H wrote:
> On 03/26/2012 01:27 PM, Martin Pala wrote:
>> Hi,
>>
>> if the process does have a pidfile, it is usually updated only when the
>> process starts - in such case you can use the timestamp test to restart the
>> process, for example:
>> --8<--
>> check file myprocpid with path /var/run/myproc.pid
>> if timestamp > 3 days then exec "/etc/init.d/myproc restart"
>> --8<--
>>
>>
>> We can easily add an uptime test to Monit in the future, so that it will be
>> possible to use something like:
>> --8<--
>> check process myproc with pidfile /var/run/myproc.pid
>> start program = …
>> stop program = ...
>> if uptime > 3 days then restart
>> --8<--
>>
>
> Hi,
>
> These are interesting strategies, as I have some similar requirement.
>
> On my yum-based boxes I often see a hung yum update process that is a few
> days old, so I am looking for something to find and kill those, as they
> cause the next cron run of yum to be skipped.
>
> However, the outputs of "ps -ef" and "ps aux" are not entirely trivial to
> parse, as the "STIME" field seems to roll over from a bare "00:00" time of
> day to a "MMMDD" format after 24 hours...
>
> Cheers,
> Tom

I have implemented the uptime test and it will be part of the next release. It
will be possible to combine the pattern-based process check with the uptime
test, for example:
check process yum matching "yum"
if uptime > 1 day then exec "/usr/bin/pkill -9 yum"
=> if a process matching "yum" is found with an uptime of more than 1 day,
monit will kill it. You can tune and test the pattern on the CLI with "monit
procmatch <pattern>".
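
On the "ps" parsing problem mentioned above: if you want to check process age
outside of Monit, one option is to ask ps for the "etime" (elapsed time) column
instead of STIME, since etime has a fixed [[dd-]hh:]mm:ss format. The following
is only a rough sketch in Python; the helper names etime_to_seconds and
older_than are made up for illustration and are not part of Monit or ps.

```python
import re

# POSIX "etime" format: [[dd-]hh:]mm:ss
_ETIME_RE = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")


def etime_to_seconds(etime):
    """Convert a ps "etime" value such as "3-04:05:06" to seconds."""
    match = _ETIME_RE.match(etime.strip())
    if match is None:
        raise ValueError("unrecognised etime value: %r" % etime)
    # Missing leading fields (days, hours) count as zero.
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def older_than(ps_output, max_age_seconds):
    """Return the PIDs in "pid etime" lines whose age exceeds the limit.

    Hypothetical helper: expects text as produced by
    "ps -o pid=,etime=" (two whitespace-separated columns, no header).
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        pid, etime = fields
        if etime_to_seconds(etime) > max_age_seconds:
            pids.append(int(pid))
    return pids
```

Assuming a Linux procps "ps", the input could come from something like
"ps -C yum -o pid=,etime=", and older_than(output, 86400) would then list the
yum processes running for more than one day. Once the uptime test is available,
the Monit rule above is the simpler route.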
You can get the development snapshot here:
http://www.mmonit.com/tmp/monit-5.4_a1.tar.gz
Regards,
Martin