monit-general

Re: Is it possible to temporarily time out?


From: Callum Macdonald
Subject: Re: Is it possible to temporarily time out?
Date: Fri, 01 Feb 2013 22:30:14 +0700

Just read over this section of the man page:

Here's a more advanced example for monitoring an apache web-server
listening on the default port number for HTTP and HTTPS. In this example
Monit will restart apache if it's not accepting connections at the port
numbers. The method Monit use for a process restart is to first execute
the stop-program, wait up to 30s for the process to stop and then
execute the start-program and wait up to 30s for it to start. The length
of start or stop timeout can be overridden using the 'timeout' option.
If Monit was unable to stop or start the service a failed alert message
will be sent if you have requested alert messages to be sent.

check process apache with pidfile /var/run/httpd.pid
       start program = "/etc/init.d/httpd start" with timeout 60 seconds
       stop program  = "/etc/init.d/httpd stop"
       if failed port 80 then restart
       if failed port 443 with timeout 15 seconds then restart

First, there is no "restart program", at least not yet, as far as I know.
A restart is a stop then a start, so services with dependencies can be
restarted in the correct order.

Second, as I read the man page, the timeout will only mean that monit
waits for that length of time for the start or the stop action to
complete. But as soon as it does complete, the timeout is then ignored.

So as I understand it, the suggestion to add a timeout to the restart
command won't work...
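
For what it's worth, the cron workaround mentioned at the start of this
thread could be sketched roughly as below. This is only a sketch under
assumptions, not a Monit feature: the helper function names and the default
service name "apache" are hypothetical, and it assumes that "monit summary"
prints one line per service containing the service name and the text
"Not monitored" once the timeout rule has fired.

```shell
#!/bin/sh
# Hypothetical cron helper (not part of Monit): re-enable monitoring for a
# service that Monit has unmonitored after a "timeout" action.

# Service name as used in the "check process" line; "apache" is the name
# from the man page example quoted above.
SERVICE="${1:-apache}"

# Succeeds if "monit summary" lists the service as "Not monitored".
# Assumption: the summary prints one line per service containing both the
# service name and its status.
is_unmonitored() {
    monit summary 2>/dev/null | grep -F "$1" | grep -qi "not monitored"
}

# Ask Monit to resume monitoring, but only if it had given up.
remonitor_if_needed() {
    if is_unmonitored "$1"; then
        monit monitor "$1"
    fi
}

remonitor_if_needed "$SERVICE"
```

Run from cron every 15 minutes (for example "*/15 * * * *" followed by
wherever the script is saved), this would give a crude "unmonitor for a
while, then try again" loop. It does not cover the escalating back-off
(15 minutes, then an hour), and the plain substring match on the service
name could also match other services whose names contain it.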

Cheers - Callum.

On Fri, 2013-02-01 at 22:21 +0700, Callum Macdonald wrote:
> Interesting, I didn't think the exec timeout worked like that.
> 
> I guess it would be an option, although it doesn't do exactly what I
> want. In some failure scenarios, I can imagine I would want apache
> restarted many times in a short period, but if that's not working, then
> at some point I want to pause for a while, 15 minutes maybe, then maybe
> an hour. But a timeout or unmonitor action is permanent; it never gets
> reverted, and the monit email looks just like the rest, so it doesn't
> communicate importance...
> 
> The timeout option is an interesting idea, thanks for sharing. I'll muse
> on that somewhat before I take any further action.
> 
> I'll try to remember to post back on the list if I do settle on a
> solution.
> 
> Cheers - Callum.
> 
> On Fri, 2013-01-25 at 02:42 -0800, Michael Johnson - MJ wrote:
> > I don't know of a way to do exactly what you are asking, but I have a
> > suggestion that might be suitable for your situation.  You could
> > simply delay attempts between restarts so that it never tries to
> > restart more than once every 15 minutes.  To do this, you would simply
> > add 'with timeout 900' to the end of your 'restart = /etc/init.d/apache2'
> > line (or whatever your restart line is).  This will cause apache to
> > still get checked every cycle, but never try to restart more than once
> > every 15 minutes.
> > 
> > 
> > Like I said, it's not exactly what you were asking for, but it seems
> > like it might serve your purpose.
> > 
> > 
> > 
> > 
> > On Thu, Jan 24, 2013 at 4:14 AM, Callum Macdonald
> > <address@hidden> wrote:
> >         We had a network outage on one of our servers this week. The
> >         result was that when monit tried its usual HTTP check to the
> >         server's IP, the check failed. Monit then tried to restart
> >         apache, but the check still failed. Monit tried a few times
> >         and then hit the timeout wall. We have a
> >         "if X restarts in X checks, timeout" rule.
> >         
> >         I'm realising it's probably a bad thing to be constantly
> >         restarting apache, but also probably worse to time out on a
> >         production machine.
> >         
> >         I'm wondering if there's a way we can time out for, say, 15
> >         checks. So instead of unmonitoring the service altogether,
> >         just unmonitor it for a while, and then monitor it again and
> >         repeat.
> >         
> >         I guess I could hack something up to this effect using cron
> >         and monit status / monit monitor, but wondered if there was
> >         such a feature in monit already, or plans to implement
> >         something like that.
> >         
> >         Cheers - Callum.
> >         
> >         
> >         
> >         --
> >         To unsubscribe:
> >         https://lists.nongnu.org/mailman/listinfo/monit-general
> > 
> > 
> > 
> > 
> > -- 
> > Michael Johnson - MJ




