Re: [monit] "failed to stop" messages
From: Jan-Henrik Haukeland
Subject: Re: [monit] "failed to stop" messages
Date: Sat, 20 Feb 2010 19:39:28 +0100
The stop (and start) command must be a program or script. Monit does not start
a shell to execute the command; it calls exec directly, so shell syntax such as
backticks and && is never interpreted. To use a shell one-liner as the command
you must therefore start a shell first.
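To illustrate the difference, here is a small sketch. The show_args function is
a hypothetical stand-in for a program started via exec, and the way the words
are split below is an assumption about how the command string is broken into
arguments, not a statement about Monit's parser. The point is only that without
a shell, the backticks and && arrive as plain characters.

```shell
#!/bin/sh
# Illustration only: show_args stands in for a program started via exec.
# It prints every argument it receives, one per line, in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# No shell involved: the single quotes mimic words arriving verbatim,
# so the backticks and && are just characters in the arguments.
no_shell_output=$(show_args -9 '`cat' '/shared/pids/thin.8007.pid`' '&&' rm -f /shared/pids/thin.8007.pid)

# With a shell: the backticks run a command and && chains a second one.
with_shell_output=$(/bin/sh -c 'echo first `echo substituted` && echo second')

printf '%s\n' "$no_shell_output"
printf '%s\n' "$with_shell_output"
```

In the first case the program would see "&&" and "rm" as ordinary arguments,
which is consistent with the stop command failing.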
If you change your stop command to start a shell and pass your "script" to it,
it will hopefully work:
stop program = "/bin/bash -c '/usr/bin/kill -9 `cat /shared/pids/thin.8007.pid` && rm -f /shared/pids/thin.8007.pid'"
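An alternative to quoting a one-liner is to put the logic in a small script and
point the stop command at it. The following is a minimal sketch, not something
from this thread: the function name stop_by_pidfile and the script location
/usr/local/bin/stop-thin are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper script, e.g. saved as /usr/local/bin/stop-thin
# (the name and location are assumptions).

# stop_by_pidfile: SIGKILL the process named in a pid file, then remove the file.
stop_by_pidfile() {
    pidfile=$1
    # Nothing to do if the pid file is missing.
    [ -f "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    # Ignore a failed kill (process already gone) so that a stale
    # pid file is still removed.
    kill -9 "$pid" 2>/dev/null || true
    rm -f "$pidfile"
}

# When called with a pid file argument, act on it.
if [ "$#" -ge 1 ]; then
    stop_by_pidfile "$1"
fi
```

With such a script in place and marked executable, the stop command could be
written as stop program = "/usr/local/bin/stop-thin /shared/pids/thin.8007.pid".
Unlike the one-liner, this sketch removes the pid file even when the kill
fails, which may or may not be what you want.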
On 20. feb. 2010, at 18.57, Dylan Stamat wrote:
> Hello!
>
> I'm using Monit to monitor some processes, and can't seem to get my simple
> configuration working correctly.
> When my threshold is met, I end up getting sent constant "failed to stop"
> messages.
>
> Here is the output in my logs:
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
> monit[4823]: 'thin8007' total mem amount of 205988kB matches resource limit [total mem amount>163840kB]
> monit[4823]: 'thin8007' trying to restart
> monit[4823]: 'thin8007' stop: /usr/bin/kill
> monit[4823]: 'thin8007' failed to stop
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Here is my configuration:
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
> set daemon 20
> set logfile syslog facility log_daemon
> check process thin8007 with pidfile /shared/pids/thin.8007.pid
> start program = "/usr/bin/thin start -C /etc/thin/application.yml --only 8000"
> stop program = "/usr/bin/kill -9 `cat /shared/pids/thin.8007.pid` && rm -f /shared/pids/thin.8007.pid"
> if totalmem > 160.0 MB for 1 cycles then restart
> if cpu > 90% for 1 cycles then restart
> group thin
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
> As you can see, the "stop" directive is a bit of a brute force method. Prior
> to using that, I was using the "stop" command
> of the application (thin) I'm trying to monitor. I ran into a problem when
> the application wouldn't clean up after itself, and
> it would end up leaving stale pid files around. So, I decided to SIGKILL the
> process and clean up the pid manually.
>
> If I run the stop command manually, the process is killed and the pid file is
> gone. However, when it is run through Monit, I
> get the "failed to stop" message. Monit is run as root on this system, but,
> it still seems like it could be a permissions issue?
> Is there any way to get more verbose output in regard to why it "failed to
> stop"? Is there anything that Monit could glean from
> the output of the system calls it makes? I'd be happy to patch if that was a
> possibility!
>
> Any suggestions would be welcome!
> Thanks!
> ==
> Dylan