Example monit configuration using M4
From: Vlada Macek
Subject: Example monit configuration using M4
Date: Wed, 10 Nov 2004 11:31:32 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040914
On the monit mailing list, users quite often ask how to
semi-automatically customize the monit configuration for multiple
servers (e.g. in a cluster).
They are advised to try several approaches, for example scripting or
high-level languages that pick the right configuration for each
server.
Several months ago I decided to try what I call The Right Tool For The
Job for this task: the text processor M4. It has been a success on my
side, and I am still using it. A side effect of this approach is that,
thanks to M4 macros, my source configuration is brief and readable,
even though I monitor many resources.
I'd like to present my results to the community in case anyone is
interested and needs a kick-start. I also include the external
checking script. I have played with all of this for some time, so
some useful tips may be found inside.
I hope this will help someone. Please let me know. I'd be delighted if
the monit developers decided to put my contribution on the web or into
the package.
(The included README contains a bit more info. If the attachments
arrive mangled, please let me know and I'll publish the files
another way.)
Have a nice day!
--
\//\/\
(Sometimes credited as 1494 F8DD 6379 4CD7 E7E3 1FC9 D750 4243 1F05 9424.)
Monit configuration example using M4
====================================
Vlada Macek <tuttle a bbs d cvut d cz>
November 10, 2004
On the monit mailing list, users quite often ask how to semi-automatically
customize the monit configuration for multiple servers (e.g. in a cluster).
They are advised to try several approaches, for example scripting or
high-level languages that pick the right configuration for each server.
Several months ago I decided to try what I call The Right Tool For The Job
for this task: the text processor M4. It has been a success on my side, and
I am still using it. A side effect of this approach is that, thanks to M4
macros, my source configuration is brief and readable, even though I
monitor many resources.
I'd like to present my results to the community in case anyone is
interested and needs a kick-start. This README does not discuss the
details; you'll need to dig into the files to get the idea.
I also include the external checking script. I have played with all of this
for some time, so some useful tips may be found inside.
Note: This was all developed on Linux. The M4 processor differs
considerably on some other platforms. Don't forget to open your M4 manual
(it makes interesting reading anyway).
Note2: The identifiable information here is fake.
I hope this will help someone. Please let me know.
-- Overview: --------------------------------------------------------------
compile.sh:
This is an M4 compilation shortcut. It is also a good moment to back up
all tested files to a specific directory. Several times it came in handy
to be able to peek at previous versions and see what had changed. SHA1
sums of all files contained in the tarball are also computed and stored
in a list for later quick reference.
monitrc.m4:
Main stub configuration file. Contains common definitions for all
servers. At the end, M4 includes the host-specific source file, chosen
by the server name (e.g. as given in compile.sh).
macros.m4:
We define some M4 macros here. See comments inside for more information.
These macros serve as shorthand for the monit configuration. Monit config
syntax is normally nicely verbose, but when lots of tested resources are
involved, the admin may welcome a concise form such as the one provided
here.
dilbert.m4:
When you run ./compile.sh on a machine called `dilbert', this file will
get included. As you may notice, this example serves a bit like an
Intrusion Detection System. I know there are more advanced and more
capable IDSes available, but this is a temporary setup inherited from
someone else until I deploy a real IDS. I just wanted to show the M4
approach.
monitorizefile.sh:
Script for quickly adding a file to the configuration. Running this:
./monitorizefile.sh freefile /etc/passwd
prints something like this to stdout:
check file passwd_file with path /etc/passwd
PUGHTS(, 644, root, root,
edc54f79d58ff03a195d30e77ecc9ca452fc3721, VAR)
group freefile
# depends on passwd_file
status4monit.sh:
The M4 config described so far has the advantage that everything is
pre-compiled into the one monitrc that matters, which can then be checked
for changes. For the curious, I include my "external" script, which runs
at a short interval to monitor monit itself and several other things that
monit is not able to check.
-----------------------------------------------------------------------------
#!/bin/bash
#
# $Id: compile.sh,v 1.3 2004/09/14 14:44:42 cvs Exp $
HOST="`uname -n`"
DOMAIN="hisoffice.cz"
BACKSAFE_PATH="/var/local/backsafe"
BACKSAFE_FILE="backsafe-`uname -n`-`date +'%Y%m%d-%H%M%S'`"
echo
echo "Compiling the configuration to the 'output' file..."
m4 monitrc.m4 -DTHISHOST="$HOST" -DTHISDOMAIN="$DOMAIN" > output
echo
echo "Making backsafe to $BACKSAFE_PATH..."
cd "$BACKSAFE_PATH" || {
echo "Error chdir to $BACKSAFE_PATH"
exit 1
}
grep 'check file' /etc/monit.d/output | grep -v '^#' | cut -d' ' -f6 > backsafe_list
PL="`grep ^/proc/ backsafe_list`"
[ -n "$PL" ] && {
cp -L --parents $PL $BACKSAFE_PATH
}
nice tar clvf "${BACKSAFE_FILE}.tar" -T backsafe_list --exclude /proc
nice tar rlvf "${BACKSAFE_FILE}.tar" proc
rm -rf proc
echo
echo "Computing sums..."
cat backsafe_list | while read F; do
sha1sum "$F" >> "${BACKSAFE_FILE}.sums"
done
echo
echo "Updating perms..."
chown -R root:root "$BACKSAFE_PATH"
chmod go= "$BACKSAFE_PATH"
echo
echo "Compressing backsafe on the background..."
nohup nice bzip2 -9 "${BACKSAFE_FILE}.tar" &
sleep 1
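compile.sh builds backsafe_list by taking the sixth space-separated field of every uncommented `check file' line. The following is only a sketch of the same extraction done with the shell's own word splitting, which also tolerates indented lines and repeated spaces. The function name monitored_paths is hypothetical and not part of the original files.

```shell
# Sketch: print the path of every "check file NAME with path PATH" entry
# read from stdin. Commented-out lines are skipped because their first
# word is not "check". monitored_paths is a hypothetical helper name.
monitored_paths() {
    local kw1 kw2 name with_kw path_kw path rest
    while read -r kw1 kw2 name with_kw path_kw path rest; do
        if [ "$kw1" = "check" ] && [ "$kw2" = "file" ] && [ "$path_kw" = "path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

It could be used as `monitored_paths < output > backsafe_list`, assuming the generated file is called output as in compile.sh.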
-----------------------------------------------------------------------------
# Control file for monit
#
# M4 input CVS version: $Id: monitrc.m4,v 1.5 2004/07/26 13:05:00 cvs Exp $
dnl This needs to be processed by m4 macro processor. Warning: There
dnl are considerable incompatibilities in M4 between platforms.
dnl
dnl Quick intro to m4:
dnl Quoted string (`string') is fixed and unprocessed.
dnl `divert(-1)' disables the output until `divert' is called.
dnl But macros are still processed in contrast to `#' comments.
dnl `dnl' reads and discards all characters, up to and
dnl including the first newline. See the m4 man or info pages.
dnl
dnl Let's include some macros.
include(`macros.m4')dnl
dnl The following warning should make it into the generated files
dnl and warn the user there. It does not apply to this M4 source.
### Warning: This is a GENERATED FILE! Do NOT EDIT! Your changes may be lost.
### Edit M4 originals instead!
dnl ---
#############################################################################
# Common settings
#
set daemon 300 # Poll each n seconds
set logfile /var/log/monit # Write log to own file
set statefile /var/run/monit.state # Unified place for the state file
# (checking its timestamp from cron).
set mail-format {
from: address@hidden
subject: $HOST alert: $EVENT $SERVICE, $ACTION, $DATE
message:
Generic monit alert message.
}
set alert address@hidden # Send an email whenever an event occurs on any service
set mailserver ... # Send to the second if the first is unavailable
set httpd port 81 and use address THISHOST.THISDOMAIN
allow localhost # Allow localhost to connect
allow ... # Allow Basic Auth
allow ... read-only # Low privilege account
signature disable # Don't be too verbose
allow THISHOST.THISDOMAIN
#############################################################################
# Common filesystems
#
check device root_dev with path /bin
DEVICEUSAGE(94)
check device var_dev with path /var/cache
DEVICEUSAGE(94)
#############################################################################
# Common directories
#
check directory temp_dir with path /tmp
PUGHTS(alert, 1777, root, root)
check directory roothome_dir with path /root
PUGHTS(alert, 700, root, root)
check directory etc_dir with path /etc
PUGHTS(alert, 755, root, root, , VAR)
#############################################################################
# Included machine dependent settings follow
#
include(THISHOST`.m4')
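The include(THISHOST`.m4') line above only works if a source file named after the host exists. As a sketch, a wrapper script could check for that file before calling m4. The function name host_source is hypothetical, and stripping the domain part is an assumption for systems where `uname -n` returns a fully qualified name; the original compile.sh passes the `uname -n` output through unchanged.

```shell
# Sketch: print the per-host M4 source file, or fail if it is missing.
# host_source is a hypothetical helper; the short-name rule is an assumption.
host_source() {
    local dir="$1" host="${2:-$(uname -n)}"
    host="${host%%.*}"              # dilbert.hisoffice.cz -> dilbert
    local src="$dir/$host.m4"
    if [ -f "$src" ]; then
        printf '%s\n' "$src"
    else
        echo "No M4 source for host '$host' in $dir" >&2
        return 1
    fi
}
```

Calling `host_source . || exit 1` before the m4 line in a compile script would give an early, explicit error instead of relying on m4's own message about the missing include.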
-----------------------------------------------------------------------------
divert(-1)
We expect two macros to be set from the command line:
m4 -DTHISHOST=dilbert -DTHISDOMAIN=hisoffice.org
Now we will undefine some words that accidentally might be expanded as macros.
undefine(`format')
We're about to define the following macros now:
DEVICEUSAGE(percent-to-alert, percent-to-stop)
PUGHTS([action], [perm], [uid], [gid], [checksum|VAR],
[timestamp expression|VAR], [size expression|VAR])
ICMPALERT([timeout in seconds])
TOOMANYRESTARTS (simple expand)
Sanity checks:
ifdef(`THISHOST', , `errprint(`Fatal: THISHOST macro must be set.
')m4exit(1)')
ifdef(`THISDOMAIN', , `errprint(`Fatal: THISDOMAIN macro must be set.
')m4exit(1)')
And now for the control macros:
define(`DEVICEUSAGE', `dnl
ifelse($1, `', , `if space usage > $1 % then alert
if inode usage > $1 % then alert')
ifelse($2, `', , `if space usage > $2 % then stop
if inode usage > $2 % then stop')
')
define(`PUGHTS', `define(`ACTOMIT', ifelse($1, `', `unmonitor', `$1'))dnl
ifelse($2, `', , `if failed perm $2 then ACTOMIT')
ifelse($3, `', , `if failed uid $3 then ACTOMIT')
ifelse($4, `', , `if failed gid $4 then ACTOMIT')
ifelse($5, `VAR', `if changed checksum then ACTOMIT',
$5, `', , `if failed checksum expect $5 then ACTOMIT')
ifelse($6, `VAR', `if changed timestamp then ACTOMIT',
$6, `', , `if timestamp $6 then ACTOMIT')
ifelse($7, `VAR', `if changed size then ACTOMIT',
$7, `', , `if size $7 then ACTOMIT')`'dnl
')
define(`ICMPALERT', `if failed icmp type echo with timeout ifelse($1, `', `15 seconds', `$1') then alert')
define(`TOOMANYRESTARTS', `if 3 restarts within 5 cycles then timeout')
divert
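To make the PUGHTS macro easier to follow, here is a rough shell imitation of its first six arguments. It is an illustration only: the real expansion is done by m4 from the definitions above, the exact whitespace and blank lines differ, and the size argument is left out. The function name pughts is hypothetical.

```shell
# Illustration only: print roughly the monit tests that
# PUGHTS(action, perm, uid, gid, checksum, timestamp) expands to.
# An empty action falls back to "unmonitor", as in macros.m4.
pughts() {
    local action="${1:-unmonitor}"
    local perm="${2:-}" uid="${3:-}" gid="${4:-}" sum="${5:-}" stamp="${6:-}"
    if [ -n "$perm" ]; then echo "if failed perm $perm then $action"; fi
    if [ -n "$uid" ]; then echo "if failed uid $uid then $action"; fi
    if [ -n "$gid" ]; then echo "if failed gid $gid then $action"; fi
    case "$sum" in
        '')  ;;                                           # no checksum test
        VAR) echo "if changed checksum then $action" ;;   # content may vary
        *)   echo "if failed checksum expect $sum then $action" ;;
    esac
    case "$stamp" in
        '')  ;;                                           # no timestamp test
        VAR) echo "if changed timestamp then $action" ;;
        *)   echo "if timestamp $stamp then $action" ;;
    esac
}
```

For example, `pughts alert 755 root root "" VAR` prints the three perm/uid/gid tests followed by `if changed timestamp then alert`, which corresponds to the /etc directory entry in monitrc.m4.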
-----------------------------------------------------------------------------
# Monit configuration subfile for particular server
# It is included by other files.
#
# M4 input CVS version: $Id: dilbert.m4,v 1.18 2004/11/10 09:02:07 cvs Exp $
#############################################################################
# Local files
#
check file aliases_file with path /etc/aliases
PUGHTS(alert, 644, root, root,
e1befd537dfb6a947b39a134504bb62756a425f4, VAR)
group freefile
check file aliases_db_file with path /etc/aliases.db
PUGHTS(alert, 644, root, root, , VAR)
group freefile
check file status4monit_file with path /var/run/status4monit
if timestamp > 10 minutes then alert # File needs to be updated every 5 mins!
PUGHTS(alert, 600, root, root)
group freefile
check file status4monit_bin with path /root/crontab_scripts/status4monit.sh
PUGHTS(, 700, root, root, , VAR)
group freefile
check file raid_status_file with path /proc/mdstat
if failed checksum expect a3d258a5f75c36507e423923dbed425893ba1a43 then
alert
group freefile
check file proc_modules_file with path /proc/modules
if size != 0 B then alert
group freefile
check file proc_partitions_file with path /proc/partitions
if failed checksum expect ac492a6c0faaa42fb47ed655d24332634b2234d0 then
alert
group freefile
check file inittab_file with path /etc/inittab
PUGHTS(alert, 644, root, root,
f216834bf4fbc3bbbd2b7ef8c219ab341fe18582, VAR)
group freefile
check file passwd_file with path /etc/passwd
PUGHTS(alert, 644, root, root,
1aa62f378c212ae43c0ddf0503bf1fb78515328a, VAR)
group freefile
.
.
.
#############################################################################
# Local binaries
#
check file login_bin with path /bin/login
PUGHTS(alert, 755, root, root,
0a74fa4b1dc40ee29284f55798b56ca67a9a2c3f, VAR)
group freebin
check file ps_bin with path /bin/ps
PUGHTS(alert, 555, root, root,
ffdaf0d89e62825c19e0d71da4b8c9dcbc226c24, VAR)
group freebin
check file sendmail_bin with path /usr/sbin/sendmail.sendmail
PUGHTS(alert, 4555, root, root,
558667a1bfa22f6221d316c54e8826b19a8afec1, VAR)
group freebin
check file ifconfig_bin with path /sbin/ifconfig
PUGHTS(alert, 755, root, root,
5d4d2426a10af5a5a96883155e58e6be04791562, VAR)
group freebin
check file netstat_bin with path /bin/netstat
PUGHTS(alert, 755, root, root,
413bda08012ba9b388586ec5d4e2b48e496e0203, VAR)
group freebin
.
.
.
#############################################################################
# Local services + control files
#
check process apache_proc with pidfile /var/run/httpd.pid
start program = "/etc/init.d/httpd start"
stop program = "/etc/init.d/httpd stop"
depends on httpd_bin, httpd_conf_file
group apache
TOOMANYRESTARTS
if failed host www.THISDOMAIN port 80 protocol http and request "/"
then restart
if cpu > 95% for 5 cycles then restart
if children > 20 then restart
if loadavg(5min) > 10 for 8 cycles then stop
check file httpd_bin with path /usr/sbin/httpd
PUGHTS(, 755, root, root, 6726233a9fdaf68e702b5796bef31e2c30339994, VAR)
group apache
check file httpd_conf_file with path /etc/httpd/conf/httpd.conf
PUGHTS(, 644, root, root, 91ebaf22e5f055e4b211af7cd45e1cbc5289d065, VAR)
group apache
#############################################################################
check process xinetd_proc with pidfile /var/run/xinetd.pid
start program = "/etc/init.d/xinetd start"
stop program = "/etc/init.d/xinetd stop"
depends on xinetd_bin, ipop3d_bin, proftpd_bin, proftpd_conf_file,
proftpdpasswd_file
group xinetd
TOOMANYRESTARTS
if failed host THISHOST.THISDOMAIN port 110 type TCP protocol POP then
restart
if failed host THISHOST.THISDOMAIN port 21 type TCP protocol FTP then
restart
check file xinetd_bin with path /usr/sbin/xinetd
PUGHTS(, 755, root, root, fd134937b2ca3610e5aea62b7f02f055b33f2d3a, VAR)
group xinetd
check file ipop3d_bin with path /usr/sbin/ipop3d
PUGHTS(, 755, root, root, 4dcaeebdd1337510c04439c809970be0ca654889, VAR)
group xinetd
check file proftpd_bin with path /usr/sbin/in.proftpd
PUGHTS(, 755, root, root, a171409a1be6da77aaff3874c7c9d443edca9e94, VAR)
group xinetd
check file proftpd_conf_file with path /etc/proftpd.conf
PUGHTS(, 600, root, root, 6c6ccd2128105ef54d503ea0316efde4e990304f, VAR)
group xinetd
check file proftpdpasswd_file with path /etc/proftpdpasswd
PUGHTS(, 640, root, visitors, 8a2458ca08658ed1adfd816e98523a9a39bdef63,
VAR)
group xinetd
#############################################################################
check process mailscanner_proc with pidfile /var/run/MailScanner.pid
start program = "/etc/init.d/MailScanner start"
stop program = "/etc/init.d/MailScanner stop"
depends on sendmail_bin, perl_bin, sendmail_cf_file,
mailscanner_conf_file
group email
TOOMANYRESTARTS
if failed host THISHOST.THISDOMAIN port 25 type TCP protocol SMTP then
restart
if cpu > 95% for 5 cycles then restart
check file perl_bin with path /usr/bin/perl
PUGHTS(, 755, root, root, 85f6bad370ad6cceeb51a40399d87cd6f24d8391, VAR)
group email
check file sendmail_cf_file with path /etc/sendmail.cf
PUGHTS(, 640, root, root, bd188d4f345b68c36bf821d8eb614332ffeb9bec, VAR)
group email
check file mailscanner_conf_file with path /etc/MailScanner/MailScanner.conf
PUGHTS(, 640, root, root, d410b88ced3521ecac1e8a461f8239bd60690332, VAR)
group email
#############################################################################
check process sshd_proc with pidfile /var/run/sshd.pid
start program = "/etc/init.d/sshd start"
stop program = "/etc/init.d/sshd stop"
TOOMANYRESTARTS
depends on sshd_bin, ssh_bin, ssh_config_file, ssh_host_dsa_key_file,
ssh_host_dsa_key_pub_file, ssh_host_key_file, ssh_host_key_pub_file,
ssh_host_rsa_key_file, ssh_host_rsa_key_pub_file, sshd_config_file
group sshd
if failed port 22 type TCP protocol ssh then restart
check file sshd_bin with path /usr/sbin/sshd
PUGHTS(, 755, root, root, 1135895ae075112d42bb271e50330ff04c4742ef, VAR)
group sshd
check file ssh_bin with path /usr/bin/ssh
PUGHTS(, 4755, root, root, 291eb820b1ea75d6c1cdeea02eacd7e8e41deb9e,
VAR)
group sshd
check file ssh_config_file with path /etc/ssh/ssh_config
PUGHTS(, 644, root, root, bcbcb85006b35e30dcabef1eefd2c8471802574b, VAR)
group sshd
check file sshd_config_file with path /etc/ssh/sshd_config
PUGHTS(, 644, root, root, e8e9f85111f86c71b2a88fb1ada4ef61c2eaaa58, VAR)
group sshd
check file ssh_host_dsa_key_file with path /etc/ssh/ssh_host_dsa_key
PUGHTS(, 600, root, root, 5e31e4afae286dca19d4918729c9b3a2947a3e0a, VAR)
group sshd
check file ssh_host_dsa_key_pub_file with path /etc/ssh/ssh_host_dsa_key.pub
PUGHTS(, 644, root, root, ba88309e4e92dc9fc40d9742a4d87dbd4d88aab0, VAR)
group sshd
check file ssh_host_key_file with path /etc/ssh/ssh_host_key
PUGHTS(, 600, root, root, dcdbfb9aa88ee0b0e1d819e009caa9a2fb856589, VAR)
group sshd
check file ssh_host_key_pub_file with path /etc/ssh/ssh_host_key.pub
PUGHTS(, 644, root, root, 62df722e4158b57b97b37feb8c5b80ee2e4244c6, VAR)
group sshd
check file ssh_host_rsa_key_file with path /etc/ssh/ssh_host_rsa_key
PUGHTS(, 600, root, root, 7cf4d88259e1c464592da6d13ca448dbe74c9d1d, VAR)
group sshd
check file ssh_host_rsa_key_pub_file with path /etc/ssh/ssh_host_rsa_key.pub
PUGHTS(, 644, root, root, 4872514feea52e208a09c92cb91a9c188eeec320, VAR)
group sshd
.
.
.
#############################################################################
# Remote Zope instance
#
check host zope_http with address intranet.hisoffice.cz
if failed port 8080 type TCP protocol HTTP then alert
#############################################################################
# Ping other servers
#
check host dogbert_ping with address 1.2.3.4
ICMPALERT
check host firewall_ping with address 4.3.2.1
ICMPALERT(25 seconds)
.
.
.
-----------------------------------------------------------------------------
#!/bin/bash
#
# Example script prints the file checking definition
# for monit ctl file (http://www.tildeslash.com/monit).
#
# $Id: monitorizefile.sh,v 1.1 2004/07/16 21:15:21 cvs Exp $
if [ "$#" -lt 2 ]; then
echo "Usage: ${0##*/} monitgroup filename [filename [...]]"
exit 1
fi
# Save the group name
GR="$1"
shift
ALLFLS=""
while [ -n "$1" ]; do
# Check whether we got absolute or relative path.
if [ "${1:0:1}" == "/" ]; then
FILENAME="$1"
else
FILENAME="$PWD"/"$1"
fi
if [ ! -f "$FILENAME" ]; then
echo "${0##*/}: File $FILENAME does not exist or is not regular."
shift
continue
fi
# Compute a hash.
HASH="`sha1sum "$FILENAME" 2>/dev/null | cut -b1-40`"
# Alternative computation.
if [ -z "$HASH" ]; then
HASH="`monit -H "$FILENAME" | head -1 | cut -d' ' -f3`"
if [ -z "$HASH" ]; then
echo "${0##*/}: Cannot compute hash of file $FILENAME."
exit 1
fi
fi
# Trim the filename.
JUSTNAME="${1##*/}"
JUSTNAME="${JUSTNAME//./_}"
JUSTNAME="${JUSTNAME//-/_}"
# Get other file properties.
PROPS="`find "$FILENAME" -printf '%m, %u, %g'`"
# Output.
echo "check file ${JUSTNAME}_file with path $FILENAME"
echo -e '\tPUGHTS(, '"$PROPS"', '"$HASH"', VAR)'
echo -e '\tgroup '"$GR"
echo
if [ -n "$ALLFLS" ]; then
ALLFLS="$ALLFLS, ${JUSTNAME}_file"
else
ALLFLS="${JUSTNAME}_file"
fi
shift
done
echo "# depends on $ALLFLS"
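The service names printed by monitorizefile.sh come from the file's base name, with dots and dashes turned into underscores and `_file' appended. A small sketch of just that rule follows. The function name service_name is hypothetical, and tr is used instead of the script's bash-only ${VAR//x/y} substitutions so that the sketch also runs under a plain sh.

```shell
# Sketch: derive the monit service name the way monitorizefile.sh does.
# service_name is a hypothetical helper name.
service_name() {
    local name="${1##*/}"                             # strip the directory part
    name="$(printf '%s' "$name" | tr '.-' '__')"      # '.' and '-' become '_'
    printf '%s_file\n' "$name"
}
```

With this rule /etc/aliases.db becomes aliases_db_file, matching the naming used in dilbert.m4.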
-----------------------------------------------------------------------------
#!/bin/bash
#
# $Id: status4monit.sh,v 1.16 2004/11/10 09:03:33 cvs Exp $
GOOD_MONITRC="458482da2f12302a4227b6a31669dd6b2d8e25c3"
GOOD_IPTABLES="da39a3ee5e6b4b0d3255bfef95601890afd80709"
OUTFILE="/var/run/status4monit"
HASHFILE="$OUTFILE".hash
# Redirect both stdout and stderr to the file.
exec &> "$OUTFILE"
BAD=""
MQL="`mailq | wc -l`"
if [ "$MQL" -lt "1" ]; then
echo "Less than 1 lines of mail queue list!"
elif [ "$MQL" -gt "30" ]; then
echo "The queue of undelivered mail messages is too long ($MQL lines)!"
echo
mailq
BAD="${BAD}M"
else
echo "OK: Number of lines in mail queue list is between 1 and 30."
fi
echo "---"
UPT="`cat /proc/uptime | cut -d. -f1`"
if [ "$UPT" -ge 900 ]; then
echo "OK: Kernel uptime is above 15 minutes."
else
echo "Kernel uptime is less than 15 minutes! Server recently rebooted."
echo
uptime
BAD="${BAD}U"
fi
echo "---"
IPT="`/sbin/iptables-save | grep -v ^# | sed -e 's/\[[^]]*\]//g' | sha1sum |
cut -b-40`"
if [ "$IPT" = "$GOOD_IPTABLES" ]; then
echo "OK: iptables rules hash okay."
else
echo "Iptables hash failed."
echo
echo "$IPT, should be: $GOOD_IPTABLES"
echo
/sbin/iptables-save
BAD="${BAD}I"
fi
echo "---"
MST="/var/run/monit.state"
MAXAGE=15
#if [ -f "$MST" ]; then
MST2="`find $MST -mmin -$MAXAGE`"
if [ "$MST" == "$MST2" ]; then
echo "OK: According to $MST file, the monit process is alive."
else
echo "Warning: $MST file is older than $MAXAGE minutes or does not exist. File info:"
echo
ls -la --full-time "$MST"
BAD="${BAD}S"
fi
echo "---"
HAS="`sha1sum /etc/monitrc | cut -b-40`"
if [ "$HAS" = "$GOOD_MONITRC" ]; then
echo "OK: /etc/monitrc hash okay."
else
echo "Hash of /etc/monitrc failed."
echo
echo "$HAS, should be: $GOOD_MONITRC"
echo
cat /etc/monitrc
BAD="${BAD}R"
fi
echo "---"
NET="`netstat -naeeeop`"
LIN="`echo \"$NET\" | wc -l`"
if [ "$LIN" -lt 2 -o "$LIN" -gt 500 ]; then
echo "Netstat output is not between allowed limits."
echo
echo "$NET"
BAD="${BAD}N"
else
echo "OK: Netstat output lines number okay."
fi
echo "---"
# Restrict the access to the file.
chown root:root "$OUTFILE" "$HASHFILE"
chmod 0600 "$OUTFILE" "$HASHFILE"
if [ -n "$BAD" ]; then
SUBJECT="address@hidden -n`: $BAD"
else
SUBJECT="address@hidden -n`: ok"
fi
HAS="`sha1sum $OUTFILE | cut -b-40`"
PRE="`cat $HASHFILE`"
if [ "$HAS" != "$PRE" ]; then
cat "$OUTFILE" | mail address@hidden -s "$SUBJECT"
echo "$HAS" > "$HASHFILE"
fi
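The last step of status4monit.sh sends mail only when the report differs from the previous run, by comparing SHA1 sums. A sketch of that comparison as a reusable function is shown here. The function name report_changed is hypothetical, and unlike the script above it treats a missing hash file as "changed" without printing an error from cat.

```shell
# Sketch: succeed (return 0) if the report's SHA1 differs from the stored
# one, and store the new hash; return 1 if nothing changed.
# report_changed is a hypothetical helper name.
report_changed() {
    local report="$1" hashfile="$2" new old=""
    new="$(sha1sum "$report" | cut -b-40)"
    if [ -f "$hashfile" ]; then
        old="$(cat "$hashfile")"
    fi
    if [ "$new" != "$old" ]; then
        printf '%s\n' "$new" > "$hashfile"
        return 0
    fi
    return 1
}
```

A caller would then mail the report only inside `if report_changed "$OUTFILE" "$HASHFILE"; then ... fi`, so an unchanged status does not generate a message on every run.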