Hi,
please try to run Monit in debug mode:
1.) stop monit
2.) start it in debug mode:
monit -v
3.) try to start the service:
monit start wbm003
4.) collect output from the logfile
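The four steps above can be sketched as a small shell helper. This is only an illustration: the function name monit_debug_session and the MONIT, LOGFILE and DRY_RUN variables are made up here, "monit quit" is assumed as the way to stop the daemon (an init script would do as well), and the logfile path is the "set logfile" value from the monitrc quoted below.

```shell
#!/bin/sh
# Sketch of the four debugging steps. MONIT, LOGFILE and DRY_RUN are
# illustrative knobs for this example, not Monit settings.
MONIT=${MONIT:-monit}
LOGFILE=${LOGFILE:-/opt/company/log/monit/adm001}

# Print each command; execute it unless DRY_RUN is non-empty.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

monit_debug_session() {
    service=$1
    run "$MONIT" quit || true       # 1.) stop the running daemon (it may not be running)
    run sleep 2
    run "$MONIT" -v                 # 2.) start it again in verbose (debug) mode
    run sleep 2
    run "$MONIT" start "$service"   # 3.) ask Monit to start the service
    run tail -n 100 "$LOGFILE"      # 4.) collect the output from the logfile
}
```

Calling "monit_debug_session wbm003" runs the steps; setting DRY_RUN=1 first only prints them. The start program may take a while to finish, so the logfile may need to be read again later.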
Regards,
Martin
On 04 May 2015, at 18:55, johng <address@hidden> wrote:
hi all,
so we are running monit version:
/usr/bin/monit -V
This is Monit version 5.12.2
Copyright (C) 2001-2015 Tildeslash Ltd. All Rights Reserved.
on:
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.6 (Tikanga)
and it is pretty cool! it lets us monitor the health of some java/tomcat
applications by running curl checks from a bash script.
what is not working correctly is the start program. when monit calls our start
command directly, sometimes the application starts, sometimes it doesn't, and sometimes
java/tomcat comes up but parts of the application don't work. quite puzzling... i have
been looking at logs, writing wrapper scripts and googling extensively, but have not
found a definitive reason why the start program only works when i run it through a
wrapper script. any light you guys could shed on this would be most welcome! also of
note: the "stop program" entry in the monitrc file always works. i can see the
java/tomcat processes disappear from the process list; it is only the start that fails.
our monitrc files look like so:
cat /etc/monitrc
# /etc/monit.conf
#
set daemon 60
set httpd port 2812
    allow blah:bleh
    allow localhost
    allow xxx.xxx.xxx.xxx/24
set logfile /opt/company/log/monit/adm001
include /opt/company/etc/monit/*.monitrc
-------------------------------------
then the include file looks like so:
check process wbm003 with pidfile /var/run/jsvc1.pid
    start program "/usr/bin/sudo /fusionone/bin/monit_wrapper.sh /fusionone/bin/f1 -v start" with timeout 170 seconds
    stop program "/usr/bin/sudo /fusionone/bin/monit_wrapper.sh /fusionone/bin/f1 -v stop" with timeout 170 seconds
    if not exist for 3 cycles then alert
    mode passive
check program wbm003_program with path "/fusionone/bin/check_wbm_cua.sh" with timeout 10 seconds
    if status != 0 then alert
--------------------------------------
the wrapper script is just this:
cat /fusionone/bin/monit_wrapper.sh
#!/bin/sh
{
echo "MONIT-WRAPPER date"
date
echo "MONIT-WRAPPER env"
env
echo "MONIT-WRAPPER $@"
$@
R=$?
echo "MONIT-WRAPPER exit code $R"
echo "MONIT-WRAPPER --------------------------------------------"
} >>/tmp/monit.log 2>&1
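A hedged variant of the wrapper above, as a sketch only. It quotes "$@" so that arguments containing spaces reach the wrapped command unchanged, and it returns the wrapped command's exit status to the caller (the script as posted ends with the status of its final echo). The monit_wrapper function name and the MONIT_WRAPPER_LOG variable are made up for this illustration, and nothing in this thread establishes that either change affects the start problem.

```shell
#!/bin/sh
# Sketch of a logging wrapper; monit_wrapper and MONIT_WRAPPER_LOG are
# illustrative names, not part of Monit.

monit_wrapper() {
    log=${MONIT_WRAPPER_LOG:-/tmp/monit.log}
    {
        echo "MONIT-WRAPPER date"
        date
        echo "MONIT-WRAPPER env"
        env
        echo "MONIT-WRAPPER $*"
        "$@"                # quoted: each argument is passed through unchanged
        R=$?
        echo "MONIT-WRAPPER exit code $R"
        echo "MONIT-WRAPPER --------------------------------------------"
    } >>"$log" 2>&1
    return "$R"             # report the wrapped command's status to the caller
}

# Run the wrapped command only when one was given on the command line.
if [ "$#" -gt 0 ]; then
    monit_wrapper "$@"
fi
```

Saved as a script, it would be called the same way as the original, for example with "/fusionone/bin/f1 -v start" as its arguments.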
---------------------------------------
what i don't understand is why, when i remove the wrapper script from the start
program, the application does not start or only partially starts.
so this doesn't work:
start program "/usr/bin/sudo /fusionone/bin/f1 -v start" with timeout 170 seconds
but this does:
start program "/usr/bin/sudo /fusionone/bin/monit_wrapper.sh /fusionone/bin/f1 -v start" with timeout 170 seconds
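One way to probe the difference between the two variants from an interactive shell is to run the start command with almost no environment and with the standard streams detached, which is roughly how a daemon launches a child process. This is a sketch under assumptions: the run_detached helper name is hypothetical, and the PATH value is a guess at a typical minimal daemon PATH, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an empty environment (except a
# minimal PATH) and with stdin, stdout and stderr pointed at /dev/null.
run_detached() {
    env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin "$@" </dev/null >/dev/null 2>&1
}

# Example (commented out; the paths are the ones from the start program above):
# run_detached /usr/bin/sudo /fusionone/bin/f1 -v start; echo "exit code: $?"
```

If the command fails or only partially starts under run_detached but works from a normal login shell, the environment or the detached streams are worth a closer look.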
-----------------------------------------
thank you,
baffled in seattle.
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general