[monit-dev] Assertion failure in 4.10.1


From: Brian Candler
Subject: [monit-dev] Assertion failure in 4.10.1
Date: Tue, 25 Mar 2008 15:14:15 +0000
User-agent: Mutt/1.5.11

monit 4.10.1 just died with an assertion failure:

...
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110 aborting..

System details:

* CentOS 4.5
Linux localhost.localdomain 2.6.9-55.0.2.plus.c4 #1 Fri Jul 6 05:04:29 EDT 2007 i686 i686 i386 GNU/Linux

* monit 4.10.1 built as an RPM, within a chroot environment (mach) on
another host.

Spec file taken from http://dag.wieers.com/rpm/packages/monit/monit.spec
(just changed 4.9 to 4.10.1)

What I was doing: I had set up a dependency between a config file
(/etc/pen.d/testapp.conf) and a process, then I modified the config file
by adding a blank line, to see if monit would restart the process. It
appears that it started to do so, then died :-(
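
To illustrate why appending a single blank line is enough to trip an `if changed checksum` test, here is a minimal Python sketch. It assumes an MD5 digest over the file contents (an assumption; any content digest behaves the same way), and the file contents used are a hypothetical placeholder, since the real /etc/pen.d/testapp.conf is not shown in this report.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for the contents of /etc/pen.d/testapp.conf;
# the real file contents are not part of this report.
original = b"# pen config placeholder\n"

# The same content with one blank line appended, as in the test described above.
modified = original + b"\n"

before = md5_hex(original)
after = md5_hex(modified)
checksum_changed = before != after

print("before: ", before)
print("after:  ", after)
print("changed:", checksum_changed)
```

Even a whitespace-only edit yields a different digest, so a checksum test fires regardless of whether the change is meaningful to pen.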

My full configs are attached below - in particular see
/etc/monit.d/testapp.monitrc

I'm not sure that what I was doing was valid (having a 'restart' action
within a file check, and then a process check dependent on the file check).
So it's possible this is a case of operator error. However I still wouldn't
have expected monit to die.

In case it's relevant, I should add that the checks testapp_mongrel_1 and
testapp_mongrel_2 are intentionally failing, because the processes which
they are trying to start have not yet been installed on the target box. Here
is a fuller log extract:

...
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' process is not running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_1' start: /usr/bin/mongrel_rails
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' process is not running
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' trying to restart
Mar 25 14:54:03 localhost monit[17903]: 'testapp_mongrel_2' start: /usr/bin/mongrel_rails
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_2' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110 aborting..

The bug appears to be repeatable - I tried restarting monit and changing
that config file, and I get the same crash.
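
For anyone comparing runs, the entries logged in the same second as the assertion can be pulled out of a longer syslog with a small filter. The following is only a sketch: the function name `events_at_assert` and the regular expression are illustrative, it assumes each log entry sits on a single line in the format quoted above, and the sample text is copied from the extract in this report.

```python
import re

# Pattern for the syslog lines quoted in this report, e.g.
# "Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ "
    r"monit\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def events_at_assert(log_text):
    """Return the messages logged by the same monit pid in the same second
    as the first AssertException, up to and including that entry."""
    entries = [m.groupdict() for m in map(LINE_RE.match, log_text.splitlines()) if m]
    for i, entry in enumerate(entries):
        if entry["msg"].startswith("AssertException"):
            return [
                e["msg"]
                for e in entries[: i + 1]
                if e["ts"] == entry["ts"] and e["pid"] == entry["pid"]
            ]
    return []  # no assertion found in the given text


SAMPLE = """\
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:04 localhost monit[17903]: 'testapp_mongrel_2' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_mongrel_1' failed to start
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' checksum was changed for /etc/pen.d/testapp.conf
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen.conf' trying to restart
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' stop: /bin/bash
Mar 25 14:54:34 localhost monit[17903]: 'testapp_pen' start: /usr/bin/pen
Mar 25 14:54:34 localhost monit[17903]: AssertException: s at xmalloc.c:110 aborting..
"""

events = events_at_assert(SAMPLE)
for msg in events:
    print(msg)
```

On the sample this keeps only the 14:54:34 entries, which makes the sequence (checksum change, restart of the file check, stop and start of the dependent process, then the assertion) easy to compare between runs.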

Regards,

Brian Candler.


# cat /etc/monit.conf
set daemon 30
set logfile syslog facility log_daemon
set mailserver localhost
set mail-format { from: address@hidden }
set alert address@hidden only on { timeout, nonexist }
set httpd port 2812
    allow localhost
    allow X.X.X.0/255.255.252.0
include /etc/monit.d/*

# head -100 /etc/monit.d/*
==> /etc/monit.d/apache.monitrc <==
check process apache
  with pidfile "/var/run/httpd.pid"
  start program = "/etc/init.d/httpd start"
  stop program = "/etc/init.d/httpd stop"
  if 2 restarts within 3 cycles then timeout
  if totalmem > 100 Mb then alert
  if children > 255 for 5 cycles then stop
  if cpu usage > 95% for 3 cycles then restart
  #if failed port 80 protocol http then restart
  group server
  depends on httpd.conf, httpd.conf.d

check file httpd.conf
  with path /etc/httpd/conf/httpd.conf
  # Reload apache if the httpd.conf file was changed
  if changed checksum
    then exec "/etc/init.d/httpd graceful"

check directory httpd.conf.d
  with path /etc/httpd/conf.d
  if changed timestamp
    then exec "/etc/init.d/httpd graceful"

==> /etc/monit.d/memcached.monitrc <==
check process memcached
  with pidfile /var/run/memcached/memcached.pid
  start program = "/etc/init.d/memcached start"
  stop program = "/etc/init.d/memcached stop"
  if cpu is greater than 80% for 4 cycles then restart

==> /etc/monit.d/testapp.monitrc <==
check process testapp_pen
  with pidfile /var/run/pen/testapp.pid
  start program = "/usr/bin/pen -F /etc/pen.d/testapp.conf -u nobody
                  -p /var/run/pen/testapp.pid
                  -C 127.0.0.1:9999 127.0.0.1:10000"
  stop program  = "/bin/bash -c 'kill -s SIGTERM `cat /var/run/pen/testapp.pid`'"
  if totalmem is greater than 10.0 MB for 2 cycles then restart
  if cpu is greater than 50% for 2 cycles then restart
  if 2 restarts within 3 cycles then timeout
  depends on testapp_pen.conf
  group testapp

check file testapp_pen.conf
  with path /etc/pen.d/testapp.conf
  if changed checksum
    then restart

check process testapp_mongrel_1
  with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10001.pid
  start program = "/usr/bin/mongrel_rails cluster::start --clean
                  -C /u/apps/testapp/current/config/mongrel_cluster.yml
                  --only 10001"
  stop program  = "/usr/bin/mongrel_rails cluster::stop
                  -C /u/apps/testapp/current/config/mongrel_cluster.yml
                  --only 10001"
  if totalmem is greater than 110.0 MB for 4 cycles then restart
  if cpu is greater than 80% for 4 cycles then restart
  if 10 restarts within 10 cycles then timeout
  group testapp

check process testapp_mongrel_2
  with pidfile /u/apps/testapp/shared/tmp/pids/mongrel.10002.pid
  start program = "/usr/bin/mongrel_rails cluster::start --clean
                  -C /u/apps/testapp/current/config/mongrel_cluster.yml
                  --only 10002"
  stop program  = "/usr/bin/mongrel_rails cluster::stop
                  -C /u/apps/testapp/current/config/mongrel_cluster.yml
                  --only 10002"
  if totalmem is greater than 110.0 MB for 4 cycles then restart
  if cpu is greater than 80% for 4 cycles then restart
  if 10 restarts within 10 cycles then timeout
  group testapp




