[Savannah-hackers-public] Re: monitoring


From: Sylvain Beucler
Subject: [Savannah-hackers-public] Re: monitoring
Date: Sun, 9 Apr 2006 19:11:17 +0200
User-agent: Mutt/1.5.11+cvs20060126

> Sorry I didn't respond to this while getting myself on the list.
> 
> > Knowing about what is running or down is not a big issue - users will
> > probably notice it before the monitoring tool and tell us about
> > it.
> 
> Yes, but at least in the past, things often seemed to be down for
> quite a long time with no indication of what was going on, and I
> suspect people typically didn't report problems correctly.  My
> experience is that monitoring can help in preventing problems and
> often in indicating what's actually caused a problem, but I know your
> mileage probably varies.
> 
> Sorry I'm talking without knowing how things actually run.

You're probably right. As I mentioned, nobody reported the recent
ViewCVS downtimes.

[Joshua (from the FSF admins team) recently mentioned that they do
have a monitoring system, which e.g. detected that Apache was not
replying yesterday during a small DoS. I suppose it would be difficult
for us to get access to it, so redundancy is not an issue :)]


> > It would be interesting, though, to set up some security checks, such
> > as: is the /home directory really read-only when accessed through the
> > arch sftp service?
> 
> Cfengine can probably help with things like directory permissions and
> cleaning up lock files etc.  I'm not sure about something like
> viewcvs, but cfengine can take action depending on running or
> non-running processes.

Ok, can we work on this kind of monitoring? How do you see things?


> > Is it possible for a project member to commit to
> > CVSROOT/?  etc.
> 
> Yes, I had that sort of thing in mind, but I haven't thought how to do
> it sensibly.

I tried something hand-made. It cannot detect all failures without
human supervision because CVS sometimes doesn't return appropriate
exit codes:
http://arch.sv.gnu.org/head/administration/infra/main/0/cvs/cvs-test-suite.sh

Is there a cleaner way to do it?
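
One possible approach, sketched here in Python rather than taken from
cvs-test-suite.sh: since the exit code alone cannot be trusted, treat
a command as rejected if it either exits non-zero or prints a known
rejection message. The marker strings and the was_rejected() helper
are hypothetical; the real messages depend on the CVS setup.

```python
import subprocess
import sys

# Hypothetical marker strings; adjust to what the server really prints.
REJECTION_MARKERS = ("permission denied", "pre-commit check failed")


def was_rejected(cmd, markers=REJECTION_MARKERS):
    """Return True if cmd exited non-zero or printed a rejection marker."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # CVS errors usually go to stderr
        universal_newlines=True,
    )
    output = result.stdout.lower()
    return result.returncode != 0 or any(m in output for m in markers)


if __name__ == "__main__":
    # Stand-in command that "fails" with exit code 0, as CVS sometimes does.
    fake_commit = [sys.executable, "-c", "print('cvs commit: Permission denied')"]
    print("rejected" if was_rejected(fake_commit) else "ACCEPTED (security problem?)")
```

A test such as "a project member must not be able to commit to
CVSROOT/" would then pass only when was_rejected() is true for the
commit command. Unknown messages still need a human to look at them.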


> > I'm also concerned about usage statistics. For example, the other day
> > the load went to 20 and I have about no clue what it was due
> > to.
> 
> Cfengine has some support for that sort of thing -- alerts and
> monitoring based on process and resource statistics, but I'm not sure
> it's terribly useful in practice.
> 
> > Mathieu Roy from Gna! told me about heavy SSH robot attacks that
> > could be more lightly rejected using dynamic IP-based restrictions and
> > inetd.
> 
> Don't you do rate-limiting with iptables to combat that?  I did try to
> look at the firewalling, but that needs more privilege.

No, we don't, though Steven said he would look next week for a script
of his that is supposed to do so :) I saw a couple of interesting
options in the iptables manpages (though Sarge doesn't have
connlimit) - feel free to share your knowledge.

Check /var/lib/iptables/active btw.

I think the most sensitive issue is that we don't actually know
whether we have such abuses.
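
A minimal sketch of how one could find out, assuming sshd logs failed
logins in the usual OpenSSH "Failed password for ... from IP port N"
form (on Debian typically in /var/log/auth.log). The function names
and the threshold are made up for illustration.

```python
import re
from collections import Counter

# Matches OpenSSH lines such as:
#   Failed password for invalid user test from 192.0.2.7 port 4242 ssh2
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?\S+ from (\S+) port \d+"
)


def failed_logins_by_ip(lines):
    """Count failed SSH password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def suspicious_ips(lines, threshold=10):
    """Return the sorted addresses with at least `threshold` failures."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Assumed log location; reading it needs sufficient privileges.
    with open("/var/log/auth.log", errors="replace") as log:
        for ip in suspicious_ips(log):
            print(ip)
```

Run regularly, this would at least tell us whether robot attacks are
frequent enough to justify rate-limiting in the firewall.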


> > Users may also be interested in SCM-related stats.
> 
> What does SCM stand for here and elsewhere?  `Software configuration
> management' or something else?

"VCS" if you prefer the old acronym ;)

I suppose that
http://www.gnu.org/software/gnu-arch/tutorial/Introducing-arch.html
(section 1.4) gives a good definition.

-- 
Sylvain



