Re: Debugging remote connection failures
From: Jan-Henrik Haukeland
Subject: Re: Debugging remote connection failures
Date: Sat, 20 Sep 2014 00:32:49 +0200
Hi
Network tests do fail from time to time. The cause could be anything from an
overworked server to traffic spikes on the network. Usually there is no real
problem, just that Monit was not able to connect within the default timeout of
5 seconds. This happens in real life too, but browsers, for instance, will retry
and also open several connections at once, so it is not very noticeable.
These alerts, while real, border on being false positives,
because sooner or later with continuous testing there will be a network or
server hiccup that happens just when Monit tries to connect. What you usually
want is to ignore these incidents and instead get an alert only if the server
really is down for a "significant" period.
This is why the "for x cycles" statement is so useful and highly recommended,
especially for network testing. I see that you are already using "x times
within y cycles", but I would just simplify this to something like,
check host example.com with address www.example.com
if failed port 80 protocol http for 3 cycles then alert
if failed port 587 protocol smtp for 3 cycles then alert
How many cycles you should use is a tuning question and also depends on how
often Monit runs. Use at least 2, possibly more if Monit runs several times per
minute.
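To make the tuning concrete, here is a minimal sketch that assumes the 60-second poll interval mentioned in the quoted message below. The 10-second connect timeout is an assumed value chosen only for illustration; tune both numbers for your own setup.

```
# Sketch only: the 60-second interval comes from the quoted message,
# the 10-second timeout is an assumption, not a recommendation.
set daemon 60    # one cycle every 60 seconds

check host example.com with address www.example.com
    # alert only after 3 consecutive failed cycles, i.e. roughly 3 minutes
    if failed port 80 protocol http with timeout 10 seconds for 3 cycles then alert
    if failed port 587 protocol smtp with timeout 10 seconds for 3 cycles then alert
```

With these values a single missed connection is ignored, while an outage lasting about three minutes triggers an alert. Shortening the poll interval shortens that window unless the cycle count is raised to match.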
Running Monit with -Iv is mostly for debugging and is not recommended in
production, as the output is very verbose and usually not very interesting.
Simply running Monit in the background without any parameters is recommended.
If an error occurs, Monit will write it to its log file, so you won't miss out
on the important stuff.
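As a rough sketch of that setup, the control file could name the log file explicitly. The path below is hypothetical, and the exact spelling of the logging statement can differ between Monit versions, so check the manual for the release you run.

```
# Assumed settings; the log path is hypothetical.
set daemon 60                     # run in the background, polling every 60 seconds
set logfile /var/log/monit.log    # errors such as failed connection tests end up here
```

With something like this in place, Monit can be started with a plain "monit" and the log file inspected after an alert, instead of keeping a verbose foreground session running.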
On 18 Sep 2014, at 21:41, David Kozinn, K2DBK <address@hidden> wrote:
> New Monit user here, I'm really just kind of kicking the tires.
>
> I've got several things that I'm monitoring on a small server that I have,
> but I've also got it set up to monitor services on another box. The relevant
> portion of monitrc looks like this:
>
> check host example.com with address www.example.com
> if failed port 80 protocol http 3 times within 5 cycles then alert
> if failed port 587 protocol smtp then alert
>
> The vast majority of the time this works just fine. However, periodically
> I'll get a failure on one (or very occasionally on both) of these tests,
> which clears up on the next test cycle (60 seconds later). A few times I've
> been connected to the machine running monit and as soon as I get the failure,
> I'll try to manually telnet to the other machine on the appropriate port and
> it's always worked. I'm trying to figure out why it's failing.
>
> The problem is that this doesn't happen terribly frequently, so I'm thinking
> that just running with -Iv might not be practical, since I'd get tons of
> output. (And to be honest, I'm not quite sure if I'd even see anything there.)
>
> Can anyone suggest the best way to figure out why these tests are actually
> failing? Maybe run with verbose mode then tail & filter the output? (Filter
> for what?)
>
> Thanks.
>
> David