monit-general

Re: How to check for a path in a url and restart runit if fails


From: David Montgomery
Subject: Re: How to check for a path in a url and restart runit if fails
Date: Thu, 12 Apr 2012 21:27:42 +0800

When I run Apache ab against the site on a t1.micro, the site crashes.
This is what I want.  It's a spawn-fcgi script under the control of
runit.  I am using nginx and the unix socket is at /tmp/9002.socket.

But I also want to use monit to bring it back up.  When I go to
http://mysite.com/pixel in Firefox, the page will not load.  Furthermore,
the pid still looks alive according to monit, so I get a green light.
When testing localhost I still get a green light.  According to monit
the site is still good and the pid is still good, so it will not
restart.  I get no email alerts, and monit loves to send emails :)

I can now restart the service from monit.  But, monit will not
automatically restart the service.  I have tried the following.

check host localhost with address 127.0.0.1
    start program = "/bin/bash -c '/usr/bin/killall pixelServer && sleep 2 && /home/ubuntu/test.sh'"
    stop program = "/usr/bin/sv kill pixelServer"
    if failed url http://127.0.0.1:80/pixel
       and timeout 2 seconds
    then exec "/bin/bash -c '/home/ubuntu/test.sh'"

and

check process pixelServer
   with pidfile "/etc/sv/pixelServer/supervise/pid"
   start program = "/bin/bash -c '/usr/bin/killall pixelServer && sleep 2 && /home/ubuntu/test.sh'"
   stop program = "/usr/bin/sv kill pixelServer"
   if failed host 127.0.0.1 port 80 protocol http
      request /pixel
   then exec "/bin/bash -c '/home/ubuntu/test.sh'"

Without the bash script, when I try to restart, I have to restart
twice.  That's why I have the bash script, or at least kill first and
then restart.

address@hidden:/etc/nginx$ sudo sv restart pixelServer
timeout: run: pixelServer: (pid 16741) 334s, got TERM
address@hidden:/etc/nginx$ sudo sv restart pixelServer
ok: run: pixelServer: (pid 19349) 1s
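
The timeout on the first attempt suggests the process did not exit
within the time sv waits after sending TERM.  A minimal sketch of a
restart wrapper, assuming that is the cause: it tries a normal restart
and falls back to sv force-restart, which sends KILL on timeout.  The
function name restart_service and the SV override variable are made up
for illustration, and the script would have to be called with the
service name as its argument.

```shell
#!/bin/sh
# Sketch of a restart wrapper; restart_service and SV are illustrative names.
SV=${SV:-/usr/bin/sv}

restart_service() {
    # A plain restart returns non-zero when sv times out waiting for the
    # service; force-restart then repeats TERM and sends KILL on timeout.
    "$SV" restart "$1" || "$SV" force-restart "$1"
}

# Only act when a service name is passed, e.g. "test.sh pixelServer".
if [ "$#" -gt 0 ]; then
    restart_service "$1"
fi
```

With something like this in test.sh, a single call should be enough even
when the first TERM is ignored, instead of running sv restart twice by
hand.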

I am sure that if I can at least figure out that the site is down then
things will work.
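
One way to see what an HTTP client actually gets from /pixel while the
site is down is a small curl wrapper.  This is only a sketch and not
part of the setup above: the script name check_pixel.sh and the
functions status_ok, fetch_status and check_url are made up for
illustration, and it assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical helper (check_pixel.sh): report whether a URL looks healthy.

# Succeed only for 2xx/3xx status codes.  curl reports 000 when it gets
# no HTTP response at all (connection refused, timeout).
status_ok() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Fetch the URL ($1) with a hard time limit in seconds ($2) and print
# only the HTTP status code.
fetch_status() {
    curl --silent --output /dev/null --max-time "$2" \
         --write-out '%{http_code}' "$1"
}

# Print OK or FAILED plus the status code; the timeout defaults to 2 seconds.
check_url() {
    # curl exits non-zero on connection errors but still prints 000.
    code=$(fetch_status "$1" "${2:-2}") || true
    if status_ok "$code"; then
        echo "OK $code"
    else
        echo "FAILED $code"
        return 1
    fi
}

# Run the check only when a URL is given on the command line.
if [ "$#" -gt 0 ]; then
    check_url "$@"
fi
```

Run by hand during an outage, for example:
sh check_pixel.sh http://127.0.0.1:80/pixel 2
If it prints OK while the browser fails, something in front of
pixelServer may still be answering on port 80.  Because the script exits
non-zero on failure, it could also be hooked into monit through a
check program test, if the installed monit version supports one.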

Thanks

On Thu, Apr 12, 2012 at 6:58 PM, Callum Macdonald
<address@hidden> wrote:
> Hola David,
>
> When you say "it breaks", what does that mean exactly?
>
> The service is not responsive on port 80? You get connection refused?
>
> As I understand monit (which is not very well!) the pid file checks that
> the program is running, and if not starts it. The host check is the one
> that should trigger a restart if the server is not functioning.
>
> If you run monit in debug mode, does the host check succeed while the
> pixelServer is broken? Could it be a timeout issue? Is it pixelServer
> that actually responds on port 80? There isn't an intermediary which
> might still be returning something that allows monit to believe the
> check succeeded?
>
> Best of luck with it.
>
> Cheers - Callum.
>
> On Thu, 2012-04-12 at 17:44 +0800, David Montgomery wrote:
>> Here is the latest code.
>>
>> check process pixelServer
>>    with pidfile "/etc/sv/pixelServer/supervise/pid"
>>    start program = "/bin/sh /home/ubuntu/test.sh"
>>    stop program = "/usr/bin/sv kill pixelServer"
>>    if failed host 127.0.0.1 port 80 protocol http
>>    request /pixel
>>    then exec "sh /home/ubuntu/test.sh"
>>
>> in the test.sh file I have
>> #!/bin/sh
>> /usr/bin/sv kill pixelServer
>> /usr/bin/sv start pixelServer
>>
>> When I slam the site, it breaks.  Looking at the logs (sudo tail
>> -f /var/log/monit) while the site is being hit, there is no indication
>> that there is an issue.  In monit I have a green light for the pid
>> file.  The site is down.  When I run sudo test.sh manually the site is
>> back up.
>>
>> sudo sh test.sh
>> ok: run: pixelServer: (pid 26309) 617s
>>
>> After I manually restart I see this in the logs.
>> [UTC Apr 12 09:36:56] error    : 'pixelServer' process PID changed
>> from 26309 to 2863
>> [UTC Apr 12 09:37:10] info     : 'pixelServer' process PID has not
>> changed since last cycle
>>
>>
>> I can write my own Python script and place it under runit control to
>> monitor the health of pixel, but would prefer monit.  I am sure I am
>> missing something but I just don't know what.  I suspect the issue is
>> the pid file monitoring.  Instead of the pid, what should I use?  Is it
>> the socket issue? /tmp/nginx9002.socket?  Is that killed?
>>
>> Thanks
>>
>> On Thu, Apr 12, 2012 at 3:57 PM, Eric PAILLEAU <address@hidden> wrote:
>>         Le 12/04/2012 09:06, David Montgomery a écrit :
>>         > Hi, agreed about using restart as being more elegant. But I am
>>         > having issues restarting the runit service after I try to
>>         > break it under load using Apache ab on a t1.micro.
>>         >
>>         > If I use the below, the service will never restart. The
>>         > pixelServer is a Python web.py script executed using spawn-fcgi.
>>         > I am using nginx and unix sockets (unixsocket
>>         > /tmp/nginx9002.socket).
>>         >
>>         > In monit...I still get a green light for the service as well
>>         >
>>         >
>>         > check process pixelServer
>>         >    with pidfile "/etc/sv/pixelServer/supervise/pid"
>>         >    start program = "/usr/bin/sv start pixelServer"
>>         >    stop program = "/usr/bin/sv kill pixelServer"
>>         >    if failed host 127.0.0.1 port 80 protocol http
>>         >    request /pixel
>>         >    then restart
>>         >
>>         >
>>         Hi,
>>         something like below may help.
>>         This ensures that no process is still running before doing a
>>         start.
>>         check process pixelServer with pidfile "/etc/sv/pixelServer/supervise/pid"
>>            start program = "/usr/bin/killall pixelServer && sleep 2 ; /usr/bin/sv start pixelServer"
>>            stop program = "/usr/bin/sv kill pixelServer"
>>            if failed host 127.0.0.1 port 80 protocol http
>>            request /pixel
>>            then restart
>>
>>
>>
>>         --
>>
>>
>>         --
>>         To unsubscribe:
>>         https://lists.nongnu.org/mailman/listinfo/monit-general


