cfengine-develop

Re: [Cfengine-develop] Development plan / meeting


From: Luke A. Kanies
Subject: Re: [Cfengine-develop] Development plan / meeting
Date: Sat, 1 Mar 2003 13:02:20 -0600 (CST)

On Sat, 1 Mar 2003, Hugo Gayosso wrote:

> I also have some thoughts regarding either 'cfenvd' or a different
> monitoring mechanism or both.
>
> ** Improve the monitoring capabilities
>
> Expand the range of parameters that cfenvd monitors, and an easy way
> to configure/enable/disable the monitoring of each parameter.

I always prefer lightweight services that support plugins.

> Or, by creating a set of "plug-ins" that are run by 'cfagent' or
> 'cfexecd' (e.g. check_memory, check_swap, etc.).
>
> In other words I would like to get rid of the need for a central
> monitoring software (e.g. Netsaint, now Nagios) to check the status of
> some services/hosts.
[SNIP]
> *** Are you ok? (heartbeat)
[SNIP]

I've been thinking about this a lot, and playing a bit with POE, which is
kind of a kernel written in Perl. It is basically set up to pass events
around, and can also be used to pass them among machines.  I set up a
simple example system which had a server listening somewhere, a script
polling my hosts, and then any number of listeners attached to the server.
The only listener I actually developed was an IRC bot, and if a server
changed state (went up or down) then the bot printed a message on our
sysadmin IRC channel.  Simple, but it gave me room to think about a lot
more stuff.
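
A minimal sketch of that arrangement, written in Python rather than POE and
kept in one process for brevity. EventServer, channel_bot and the host name
are hypothetical names chosen for illustration; the real setup passed events
between machines and posted to IRC, neither of which is modelled here.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, bool], None]


class EventServer:
    """Relays host state changes to every attached listener."""

    def __init__(self) -> None:
        self.listeners: List[Listener] = []
        self.last_state: Dict[str, bool] = {}

    def attach(self, listener: Listener) -> None:
        self.listeners.append(listener)

    def report(self, host: str, is_up: bool) -> None:
        # Only a change of state counts as an event; a poll result that
        # matches the last known state is dropped silently.
        if self.last_state.get(host) == is_up:
            return
        self.last_state[host] = is_up
        for listener in self.listeners:
            listener(host, is_up)


messages: List[str] = []


def channel_bot(host: str, is_up: bool) -> None:
    # Stand-in for the IRC bot: records the line it would have posted.
    messages.append(f"{host} is {'up' if is_up else 'down'}")


server = EventServer()
server.attach(channel_bot)

# Stand-in for the polling script: three poll results for one host.
for host, is_up in [("web1", True), ("web1", True), ("web1", False)]:
    server.report(host, is_up)
```

The second poll repeats the first, so only two messages reach the listener.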

I like the idea of a constantly running cfagent, and rather than one that
works in passes, like now, I'd like one that sets an alarm for each piece
of work.  You may want to check your process table every 5 minutes, because
it's lightweight and very important, but only check file permissions every
hour, because it takes more time and is less likely to cause service
outages.

So, each feature is built like a plugin, and each plugin supports an alarm
setting, with support for a default.  The cfagent starts by parsing all of
the config files, and probably running each plugin once.  Then, it sets
each plugin's alarm (including the plugin that updates cfagent's config),
and when that alarm goes off the plugin runs again.
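
The alarm loop described above could be sketched roughly as follows. This is
not cfagent code: Scheduler, the plugin names and the one-hour default are
assumptions made for the example, and time is simulated with plain integers
(seconds) so the behaviour can be checked without waiting.

```python
import heapq
from typing import Callable, Dict, List, Tuple

DEFAULT_INTERVAL = 3600  # seconds; an assumed default, not a cfengine value


class Scheduler:
    """Runs each plugin once at startup, then again whenever its alarm is due."""

    def __init__(self) -> None:
        self.plugins: Dict[str, Tuple[Callable[[], None], int]] = {}
        self.alarms: List[Tuple[int, str]] = []  # heap of (due time, plugin name)

    def register(self, name: str, run: Callable[[], None],
                 interval: int = DEFAULT_INTERVAL) -> None:
        self.plugins[name] = (run, interval)

    def start(self, now: int) -> None:
        # Initial pass: run every plugin once and set its first alarm.
        for name, (run, interval) in self.plugins.items():
            run()
            heapq.heappush(self.alarms, (now + interval, name))

    def run_until(self, end: int) -> None:
        # Fire every alarm due at or before `end`, re-arming each plugin.
        while self.alarms and self.alarms[0][0] <= end:
            due, name = heapq.heappop(self.alarms)
            run, interval = self.plugins[name]
            run()
            heapq.heappush(self.alarms, (due + interval, name))


log: List[str] = []
scheduler = Scheduler()
scheduler.register("processes", lambda: log.append("processes"), interval=300)
scheduler.register("permissions", lambda: log.append("permissions"))  # default
scheduler.start(now=0)
scheduler.run_until(3600)
```

Over one simulated hour the five-minute plugin runs thirteen times (the
initial pass plus twelve alarms) and the hourly one runs twice.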

Obviously, this would add the need to understand better how different
plugins depend on each other (if a plugin does a test which another plugin
needs to know the answer to) but I don't think that's insurmountable.
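
One way to handle that ordering, sketched under the assumption that each
plugin can declare which other plugins' results it needs. The plugin names
are invented, and the standard-library graphlib module (Python 3.9+) stands
in for whatever dependency tracking the agent would really use.

```python
from graphlib import TopologicalSorter

# Hypothetical plugins mapped to the plugins whose answers they rely on.
depends_on = {
    "update_config": set(),
    "check_processes": {"update_config"},
    "check_permissions": {"update_config"},
    "restart_services": {"check_processes"},
}

# static_order() yields every plugin after all of its dependencies and
# raises graphlib.CycleError if two plugins depend on each other.
order = list(TopologicalSorter(depends_on).static_order())
```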

Okay, now you've got tons of information flowing, at all times, and you
always have these classes defined or variables set.

Now, you just provide the ability for another host to ask what those
classes are or what the variables are set to.  Similar to how perl uses
$package::variable, you could do something like $host::variable or
whatever; maybe ::: instead of ::, I don't know.
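
A small sketch of how such a reference might be resolved. The separator,
the function names and the in-memory state table are all assumptions; a real
implementation would ask the remote host over the network instead of reading
a local dictionary.

```python
from typing import Dict, Tuple

SEPARATOR = "::"  # could just as well be ":::"; nothing here depends on it


def split_reference(ref: str, local_host: str) -> Tuple[str, str]:
    """Split 'host::variable' into (host, variable).

    A bare name with no separator refers to the local host.
    """
    host, sep, name = ref.partition(SEPARATOR)
    if not sep:
        return local_host, ref
    return host, name


def lookup(ref: str, local_host: str, state: Dict[str, Dict[str, str]]) -> str:
    # `state` stands in for querying each host for its current values.
    host, name = split_reference(ref, local_host)
    return state[host][name]


state = {
    "web1": {"load": "high"},
    "db1": {"load": "normal"},
}
remote_value = lookup("web1::load", local_host="db1", state=state)
local_value = lookup("load", local_host="db1", state=state)
```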

But you still need a way to figure out who's in charge of reporting if a
host is down.  That should be pretty easy:  whoever cares.  For services
other than infrastructure services (dns, ntp, etc.), most hosts are only
contacted by peers providing the same services and by downstream hosts who
require those services.  At the same time, most hosts again only contact
peers, and upstream servers providing required services.

So, you just have every host care about its peers (or some subset; if a
host has 100 peers, then that's a waste of effort) and upstream servers,
and pass any important events around when something changes.  All you need
is some designated collection hosts, which also must do event
deduplication (it's easier to deduplicate than to guarantee an event only
happens once, I believe), and some placement algorithm to make sure that
there's a collection host in every event pool.
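
Deduplication on a collection host might look something like the sketch
below. It assumes an event is identified by the host it is about plus the
new state, and that reports of the same event arriving within a fixed window
are duplicates; the Collector class, the 60-second window and the host names
are all made up for the example.

```python
from typing import Dict, List, Tuple


class Collector:
    """Accepts reports from many hosts and keeps one copy of each event."""

    def __init__(self, window: int = 60) -> None:
        self.window = window  # seconds during which repeats are dropped
        self.accepted_at: Dict[Tuple[str, str], int] = {}
        self.accepted: List[Tuple[str, str, int]] = []

    def receive(self, reporter: str, subject: str, state: str,
                timestamp: int) -> bool:
        """Return True if the report was new, False if it was a duplicate."""
        key = (subject, state)
        previous = self.accepted_at.get(key)
        # The window is measured from the first accepted report of the event.
        if previous is not None and timestamp - previous <= self.window:
            return False
        self.accepted_at[key] = timestamp
        self.accepted.append((subject, state, timestamp))
        return True


collector = Collector(window=60)
results = [
    collector.receive("web2", "db1", "down", 100),  # first report
    collector.receive("web3", "db1", "down", 105),  # same event, other peer
    collector.receive("web2", "db1", "up", 110),    # different state
    collector.receive("web3", "db1", "down", 200),  # outside the window
]
```

Which peer sent the report is deliberately left out of the key, since the
point is that several peers will notice the same outage.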

And algorithms for determining how many hops an event should be passed
around, and then ways of figuring out which events should stay in the
local pool and which should be passed, and... and...

Yes, this is my pie-in-the-sky vision for where I want my infrastructure
tools, but we're at _least_ five years out on that, unless I can get
someone to hire 2-10 people to work on it full time.

In the meantime, I think that there are some solid, important upgrades we
can make to cfengine that will suffice until that pie-in-the-sky becomes a
reality.

> Maybe setting up a RFC (Request for Comments) where the developer
> donating the feature sends an email describing it (with code if
> possible) and then if there are no opposing comments go ahead and
> implement it in the development branch, which still could be rejected
> before releasing by Mark.

That works well once the basic structure is in place, but the ideas that I
have require some significant changes to the language, and I'm not sure
that they will be allowed.  I'm not going to write a 15,000 line patch all
on my own only to have it rejected because it doesn't fit the spirit of
whatever.  And I don't want to work piecemeal towards what I think is a
better tool; I think cfengine is at a tipping point, where its popularity
and functionality are beginning to collide, and unless it has some
significant reorganization, it's going to get a bad reputation because of
how complicated some operations are.

I'd like to work to fix those problems, but I think it involves
significant code refactoring and significant function redesign.  If we can
figure out first whether that's our goal, then I will be happy.  From previous
conversations with Mark, I suspect that there is less interest in
significant change, and more interest in smaller, less fundamental change.

Do the rest of the people interested in cfengine development feel that
way, or, like me, do most people feel that some significant change is
necessary?

Luke

-- 
A Chemical Limerick:
        A mosquito cried out in pain:
        "A chemist has poisoned my brain!"
        The cause of his sorrow
        was para-dichlorodiphenyltrichloroethane
                -- Adam Bernard



