Greetings. I'm using monit to check a device on my
local LAN to ensure it's responsive. This is what I was
previously using:
check host resolver2
    address 192.168.1.13
    every 5 cycles
    if failed ping4
        then exec /usr/local/bin/pushover
        repeat every 2 cycles
At random times in the middle of the night, I'd occasionally
get an alert that ping had failed, even though the device was
up and responsive. I modified the check to:
check host resolver2
    address 192.168.1.13
    every 5 cycles
    if failed ping4 count 5 timeout 5 seconds
        then exec /usr/local/bin/pushover
        repeat every 2 cycles
That seemed to work for a while, but the failures
returned, so I modified it to this:
check host resolver2
    address 192.168.1.13
    every 5 cycles
    if failed ping4 count 5 responsetime 200 milliseconds timeout 10 seconds
        then exec /usr/local/bin/pushover
        repeat every 2 cycles
However, this causes an immediate failure with 'Speed
failed'. I increased responsetime to 1 second and then to
10 seconds, with the same result each time. The specific
error logged is:
Nov 5 09:52:45 ntpviz monit[676]: 'resolver2' response time 6.173 ms doesn't match limit [time = 10 s]
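To make the units in that message explicit, here is a small Python sketch that pulls the two values out of the log line above and compares them in seconds. It is only a sanity check on the arithmetic, not a description of how monit evaluates the test; the regex and the helper names (to_seconds, parse_log) are made up for illustration.

```python
import re
from typing import Tuple

# The message part of the log line reported by monit above.
LOG = "'resolver2' response time 6.173 ms doesn't match limit [time = 10 s]"

# Conversion factors to seconds for the two units that appear in the line.
UNIT_SECONDS = {"ms": 1e-3, "s": 1.0}


def to_seconds(value: str, unit: str) -> float:
    """Convert a numeric string with an 'ms' or 's' unit to seconds."""
    return float(value) * UNIT_SECONDS[unit]


def parse_log(line: str) -> Tuple[float, float]:
    """Return (measured, limit) in seconds from a line shaped like LOG."""
    m = re.search(
        r"response time ([\d.]+) (ms|s) .*\[time = ([\d.]+) (ms|s)\]", line
    )
    if m is None:
        raise ValueError("unrecognised log line")
    return to_seconds(m.group(1), m.group(2)), to_seconds(m.group(3), m.group(4))


measured, limit = parse_log(LOG)
print(f"measured={measured:.6f}s limit={limit:.1f}s within_limit={measured <= limit}")
```

Once both values are in the same unit, the measured time is far below the limit, which is why the 'doesn't match limit' message is confusing.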
I may be misunderstanding the difference, or the
interaction, between responsetime and timeout. The
device is just a Raspberry Pi Zero, with a typical
response time under 10 ms, and it is never under
appreciable load.
The manual states:
"The RESPONSETIME parameter sets the response time limit"
Is that the response time limit for any given ping, or
if for example I use a count of five, is it the time
limit for all five attempts accumulated? Even if it were
the latter, since the device is reachable in under 10ms,
a count of five - assuming 1 second between each ping
sent (is that a correct assumption, or are they sent
without pause?) - would still complete in well under 10
seconds.
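The arithmetic behind that reasoning can be sketched in Python. The count and limit come from the configuration above and the round-trip time is the average from the ping summary in this post; the 1-second spacing between pings is an unverified assumption about monit, and the function names are hypothetical.

```python
# Figures taken from this post; the spacing between pings is an assumption.
COUNT = 5                      # "count 5" in the check
RTT_MS = 6.376                 # average round-trip time from the ping summary
LIMIT_MS = 10_000.0            # "responsetime 10 seconds"
ASSUMED_INTERVAL_MS = 1_000.0  # assumed pause between pings (not confirmed)


def per_ping_ms(rtt_ms: float) -> float:
    """Interpretation 1: the limit applies to each ping individually."""
    return rtt_ms


def cumulative_spaced_ms(count: int, rtt_ms: float, interval_ms: float) -> float:
    """Interpretation 2a: total elapsed time with a fixed pause between sends.

    There are count - 1 gaps between sends, plus the last reply's round trip.
    """
    return (count - 1) * interval_ms + rtt_ms


def cumulative_back_to_back_ms(count: int, rtt_ms: float) -> float:
    """Interpretation 2b: each ping is sent as soon as the previous reply arrives."""
    return count * rtt_ms


single = per_ping_ms(RTT_MS)
spaced = cumulative_spaced_ms(COUNT, RTT_MS, ASSUMED_INTERVAL_MS)
back_to_back = cumulative_back_to_back_ms(COUNT, RTT_MS)

for label, value in [
    ("per ping", single),
    ("cumulative, 1 s spacing", spaced),
    ("cumulative, no pause", back_to_back),
]:
    print(f"{label}: {value:.3f} ms, under limit: {value < LIMIT_MS}")
```

Under every one of these interpretations the total stays below the 10-second limit, so none of them explains the failure on its own.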
For reference, here are the results of a 60-second ping
from the monit server to the device:
root@ntpVIZ:/etc/monit/conf.d # ping -c 60 192.168.1.13
PING 192.168.1.13 (192.168.1.13) 56(84) bytes of data.
[elided]
--- 192.168.1.13 ping statistics ---
60 packets transmitted, 60 received, 0% packet loss, time 59094ms
rtt min/avg/max/mdev = 2.342/6.376/13.879/2.335 ms
Thanks for any insight.
--
Paul Theodoropoulos
www.anastrophe.com