monit-general

Re: monit "check program" reap interval


From: Leif Gustafson
Subject: Re: monit "check program" reap interval
Date: Thu, 14 Jun 2012 17:21:37 -0800

On 06/14/2012 5:02 PM, Jan-Henrik Haukeland wrote:
> On Jun 15, 2012, at 1:15 AM, Leif Gustafson wrote:
>
>> Hello,
>>  The documentation for the "check program" service test in monit
>> 5.3/5.4 indicates that a child process is reaped at the next cycle.  Is
>> this also the case if you have configured your "check program" test to
>> run every n cycles?  
> No, skip has precedence, so it will wait for n cycles before reaping the 
> process.
>
>> For instance, if I configure a script to be tested
>> every 5 cycles (assume a 60 second cycle interval), does monit execute
>> the script and check the result in 1 minute (on the very next cycle) or
>> 5 minutes (when the script is to be run again)?  
> It checks in 5 minutes
>
>> Also, does the TIMEOUT
>> N SECONDS clause impact when the child is reaped?
> No, only when the check is performed. At check time the test
> 'if (execution_time > timeout)' is run to decide if the program timed out,
> where execution_time is computed as now minus the start time of the program.
>
> Is there a problem with the "long" reap time? As noted in the documentation, 
> the caveat with the current design is that reaps are done cycle based and 
> exited child processes may be in a zombie state within that time frame, but 
> they _will_ be reaped. 
>
> Also, as mentioned in the doc, we are refactoring away the whole poll cycle
> based engine and replacing it with an event loop and a thread pool, so
> checks can be performed at individual times and with sub-second resolution.
> This new design will also ensure that child processes will be reaped when
> they exit.

Thanks for the response.  The main issue I have with the behavior is
that you don't get alerted in a timely fashion *if* you skip cycles, and
the delay grows with the number of cycles skipped.  Say I configure a
script to run once per day (because the script performs an expensive
operation that can't be run often), either with "every 1440 cycles" at a
60 second cycle interval or with the new cron-like functionality.  In
this scenario, I wouldn't be notified of problems until the process has
been reaped, approximately one day later.  At least, that's the issue
that I seem to be up against.
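
To make the delay concrete, here is a rough Python model of the timing as
I understand it from the answers above.  It is only a sketch, not monit
code: the function names and the 60 second default are my own
assumptions, and the timeout test simply mirrors the quoted
'if (execution_time > timeout)' expression.

```python
def detection_delay_seconds(every_n_cycles, cycle_seconds=60):
    """Seconds between starting the program and monit looking at its result.

    Assumes the behavior described in this thread: the program is started
    on one scheduled cycle and is only reaped/evaluated on the next
    scheduled cycle, i.e. every_n_cycles later.
    """
    return every_n_cycles * cycle_seconds


def timed_out(start_time, now, timeout_seconds):
    """Timeout test as quoted above, evaluated only at check time."""
    execution_time = now - start_time
    return execution_time > timeout_seconds


if __name__ == "__main__":
    for n in (1, 5, 1440):
        print(n, "cycles ->", detection_delay_seconds(n), "seconds")
```

With a 60 second cycle, "every 5 cycles" gives 300 seconds and "every
1440 cycles" gives 86400 seconds (one day) before a failure can be seen.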

The new event-based engine sounds fantastic.  Until that happens, it
would be nice to be able to either control when the child gets reaped or
at least check once per cycle (still honoring the default or specified
timeout) so that we can still deal with problems in a timely fashion.
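
As an illustration of what I am asking for, the following Python sketch
compares the current behavior with a hypothetical "reap on every cycle"
mode.  Nothing here is an existing monit option; the function, its
parameters, and the assumption that a still-running program is simply
looked at again at the next opportunity are all mine.

```python
def cycles_until_result(every_n, run_cycles, reap_each_cycle):
    """Cycle number (program started at cycle 0) at which the result is seen.

    every_n         -- the service is only scheduled every_n cycles
    run_cycles      -- hypothetical: the program has exited by this cycle
    reap_each_cycle -- False: look only on scheduled cycles (current model)
                       True:  look on every cycle (proposed behavior)
    """
    cycle = 0
    while True:
        cycle += 1
        scheduled = (cycle % every_n == 0)
        finished = cycle >= run_cycles
        if finished and (scheduled or reap_each_cycle):
            return cycle


if __name__ == "__main__":
    # Daily script that exits within the first cycle.
    print(cycles_until_result(1440, 1, reap_each_cycle=False))
    print(cycles_until_result(1440, 1, reap_each_cycle=True))
```

In this model the daily script's failure is seen after 1440 cycles today,
but after 1 cycle if the child were checked on every cycle.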


Thanks,
Leif Gustafson


