From: Paul Theodoropoulos
Subject: Baffling status failure to alert
Date: Fri, 11 Mar 2016 16:01:15 -0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.0

I'm stumped. I have an ugly little script to
alert me if today's backup of a database is smaller than the one
from yesterday (and the day before). The script works properly,
and I have a simple monit rule in place to alert me if it fails.
When monit checks, it reports a failure; that is pushed up to my
m/monit server, which also logs the failure. From there, all
alerts go to PagerDuty. But I never get alerts from this check.

(Hopefully) all relevant output is below. Some strings have been
obfuscated. Note that I have the rule modified to falsely report a
failure, for testing.

address@hidden: /etc/monit/conf.d # cat /etc/debian_version
7.9

address@hidden: /etc/monit/conf.d # monit --version
This is Monit version 5.17
Built with ssl, without pam and with large files
Copyright (C) 2001-2016 Tildeslash Ltd. All Rights Reserved.

address@hidden: /etc/monit/conf.d # cat backups
check program backup_failure with path /usr/local/bin/check_backup
    with timeout 15 seconds
    not every "* 14 * * *"
    #if status != 0 then alert
    if status != 1 then alert

address@hidden: /etc/monit/conf.d # cat /usr/local/bin/check_backup
#!/bin/bash
BACKUP_DIR=/var/backups
cd ${BACKUP_DIR}
BUFILE=`date +%Y_%m_%d`_"group".sql.gz
YDAY_BUFILE=`date --date "1 days ago" +%Y_%m_%d`_"group".sql.gz
DAYBEFORE_YDAY_BUFILE=`date --date "2 days ago" +%Y_%m_%d`_"group".sql.gz
if [ -e "${BUFILE}" ];then
    TDAYSIZE=`du ${BUFILE}|cut -f1`
    YDAYSIZE=`du ${YDAY_BUFILE}|cut -f1`
    DBDAYSIZE=`du ${DAYBEFORE_YDAY_BUFILE}|cut -f1`
    if [ $YDAYSIZE -gt $DBDAYSIZE ];then
        if [ $TDAYSIZE -gt $YDAYSIZE ];then
            exit 0
        fi
    else
        exit 1
    fi
fi

address@hidden:/etc/monit/conf.d # tail -1 /var/log/daemon.log
Mar 11 15:25:04 localhost monit[10562]: 'backup_failure' '/usr/local/bin/check_backup' failed with exit status (0) -- no output

address@hidden: ~ # monit status|tail -7
Program 'backup_failure'
  status                            Status failed
  monitoring status                 Monitored
  last started                      Fri, 11 Mar 2016 15:42:36
  last exit value                   0
  data collected                    Fri, 11 Mar 2016 15:42:36

What am I missing?

-- 
Paul Theodoropoulos
www.anastrophe.com
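
A side note on the script itself, separate from the alerting question: as posted, check_backup only sets an exit status explicitly on two of its paths. If today's file is missing, or if yesterday's backup grew but today's did not, the script falls off the end and exits 0. The sketch below shows one way the comparison could return an explicit status in every case. The function name compare_sizes and the decision to treat every non-growing or missing size as a failure are assumptions for illustration, not what the original script does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): compare three
# sizes and return an explicit status on every path.
#   0 = yesterday > day before AND today > yesterday
#   1 = anything else, including a missing or non-numeric size
compare_sizes() {
    local today=$1 yday=$2 daybefore=$3 n
    # Assumption: an empty or non-numeric size counts as a failure.
    for n in "$today" "$yday" "$daybefore"; do
        case $n in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    if [ "$yday" -gt "$daybefore" ] && [ "$today" -gt "$yday" ]; then
        return 0
    fi
    return 1
}
```

In the script this could be called as compare_sizes "$TDAYSIZE" "$YDAYSIZE" "$DBDAYSIZE" followed by exit $?, so that the status monit sees never depends on which branch happened to run last.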