Hi,
I raised this issue in this post:
http://lists.gnu.org/archive/html/help-cfengine/2004-07/msg00113.html
In order to eliminate the "reversion botch" and keep the environment
consistent, I try to avoid editing files based on patterns in their
current contents. I always tell cfengine to empty the file and insert
the needed lines. If the current file is already identical to the result
of the editfiles action, cfengine doesn't do anything.
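
To make the idea concrete, here is a minimal sketch in Python (not cfengine syntax) of the "assert the whole file" approach. The function name `converge_file` and its arguments are hypothetical, chosen only for illustration; the point is that the desired content is built from scratch and the file is rewritten only when it differs.

```python
def converge_file(path, wanted_lines):
    """Make the file at `path` contain exactly `wanted_lines`.

    Returns True if the file was rewritten, False if it was already correct.
    Hypothetical helper for illustration; this is not how cfengine is implemented.
    """
    # Build the complete desired content without looking at the current file.
    desired = "".join(line + "\n" for line in wanted_lines)
    try:
        with open(path) as f:
            current = f.read()
    except FileNotFoundError:
        current = None  # a missing file is simply "not yet converged"
    if current == desired:
        return False  # already in the desired state: do nothing
    with open(path, "w") as f:
        f.write(desired)
    return True
```

Because the result never depends on what was in the file before, running it twice in a row changes nothing the second time, and any earlier edit is overwritten rather than patched around.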
Zeev
-----Original Message-----
From: help-cfengine-bounces+zeevf=marvell.com@gnu.org
[mailto:help-cfengine-bounces+zeevf=marvell.com@gnu.org] On Behalf Of
Alva Couch
Sent: Saturday, November 19, 2005 3:43 AM
To: help-cfengine@gnu.org
Subject: Re: convergence and undoing changes
Mark Burgess wrote:
> You are correct however in pointing out that users CAN screw this up
> by trying to be too clever, by not thinking convergently.
This is what I am getting at.
> But that is not the normal state of affairs.
My experience is that users are all too cavalier about the way they
modify cfagent.conf. I think a specific discipline -- unknown to many
users -- is the key. We can either document that discipline or
encapsulate it in some kind of transaction engine. I propose to do both.
My examples using editfiles are a matter of public record. But the
problem can even happen when one utilizes purely convergent actions.
Here's a "typical" example of user thinking.
- user asserts contents of a file F. Say it is a service startup
in /etc/xinetd.d and the intent is to customize some service.
- then, some time after F is stable, the user changes the assertion
to revert F to its original state.
- unbeknownst to the user, some set of stations is down
while F is reverting to the original state.
- then, satisfied that the file is reverted, the user takes the
reversion assertion out of the script, considering the work done.
- time passes and the unreverted machines come back up. There is
no reversion to affect them. So they stay with the new version.
- At this point, there are two classes of machines: those with
the original version of F and those with the new version. If the
new version has a security hole, congratulations, you didn't manage
to plug it.
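
The sequence above can be replayed as a small simulation. This is only a sketch in Python, not cfengine: the host names, the `run_policy` helper, and the two state labels are all hypothetical, and a "run" is simplified to "every host that is up takes on the asserted state".

```python
ORIGINAL, CUSTOM = "original", "custom"

def run_policy(hosts, up, assertion):
    """Apply `assertion` (the desired state of F, or None) to every host that is up."""
    if assertion is None:
        return  # nothing is asserted about F, so no host is touched
    for name in up:
        hosts[name] = assertion

hosts = {"a": ORIGINAL, "b": ORIGINAL, "c": ORIGINAL}

# 1. The user asserts the customized F; every host is up and converges.
run_policy(hosts, up={"a", "b", "c"}, assertion=CUSTOM)
# 2. The user asserts the reversion while host "c" happens to be down.
run_policy(hosts, up={"a", "b"}, assertion=ORIGINAL)
# 3. The user removes the reversion; "c" comes back up with nothing to apply.
run_policy(hosts, up={"a", "b", "c"}, assertion=None)

print(hosts)  # {'a': 'original', 'b': 'original', 'c': 'custom'}
```

Every individual step was convergent, yet the network ends up split, because the assertion was withdrawn before host "c" ever saw it.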
The key here is that for reversions to be effective, they must stay
in the configuration until one is absolutely sure that all stations
have applied them. In a very large network, one is likely never
sure, so one can *never* remove the reversions from the config file.
This is the principle of observability:
Once one manages a thing, one must continue to manage that
thing in perpetuity.
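
One way to picture the condition for removing an assertion is to track which stations have actually applied it. The following is a sketch in Python under that assumption; `agent_run` and `safe_to_remove` are hypothetical names, and nothing here is an existing cfengine feature.

```python
def agent_run(states, confirmed, up, desired):
    """One policy run: every host that is up converges and is recorded as confirmed."""
    for name in up:
        states[name] = desired
        confirmed.add(name)

def safe_to_remove(states, confirmed):
    """The assertion may be dropped only once every managed host has applied it."""
    return set(states) <= confirmed

states = {"a": "custom", "b": "custom", "c": "custom"}
confirmed = set()

agent_run(states, confirmed, up={"a", "b"}, desired="original")
print(safe_to_remove(states, confirmed))  # False: "c" has not been seen yet

agent_run(states, confirmed, up={"a", "b", "c"}, desired="original")
print(safe_to_remove(states, confirmed))  # True: every known host has converged
```

In a large network the list of managed hosts is itself uncertain, so the check may never come back true, which is the practical argument for leaving the assertion in place.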
In my experience this kind of "reversion botch" is very common.