Re: [Monotone-devel] RFC: CVS sync design

From:	Christof Petig
Subject:	Re: [Monotone-devel] RFC: CVS sync design
Date:	Wed, 05 Jan 2005 15:40:30 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux ppc; de-AT; rv:1.7.3) Gecko/20041007 Debian/1.7.3-5

Nathaniel Smith wrote:

Actually, let me throw another thought out here too: Is it really
useful to commit whole chains of history to CVS? It would be
significantly simpler from our point of view to just take a single
revision, double-check that it's a descendant of the last sync'ed
revision, and then just commit that snapshot to the CVS repo,
completely ignoring intermediate versions.
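
The descendant check mentioned here can be sketched as a walk over
parent links. This is only an illustration in Python, not monotone
code; the `parents` mapping and the function name are hypothetical.

```python
def is_descendant(parents, candidate, ancestor):
    """Return True if `ancestor` is reachable from `candidate` via parent links.

    `parents` maps a revision id to the list of its parent revision ids
    (hypothetical in-memory stand-in for the revision graph).
    A revision counts as a descendant of itself.
    """
    stack = [candidate]
    seen = set()
    while stack:
        rev = stack.pop()
        if rev == ancestor:
            return True
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents.get(rev, ()))
    return False


# Made-up history: A -> B -> C, plus a side revision D that forks off A.
parents = {"B": ["A"], "C": ["B"], "D": ["A"]}
```

With this graph, C is a descendant of A, while D is not a descendant
of B, so a snapshot of D could not be committed on top of a CVS tip
that corresponds to B.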


If you think of CVS as being still the main repository for a project you
might want to preserve as much information in it as possible. [e.g.
intermediate tree states, especially commit logs etc.]

If somebody else committed while you created the (monotone based) fork
you have to merge the trees again before committing. Since CVS has no
notion of side branches your merged version will _immediately_ follow
the last commit on the main trunk (even in monotone). So unless you are
the only one making changes to the whole tree while working offline in
your monotone database, your changes actually have to land as a single
revision in CVS.

That's by design.

[Committing/Preserving side branches as CVS branches might be a separate
option (you'd need to specify the monotone head (by revision ID) and the
CVS branch name)]

The Monotone revision graph is going to be a superset of the stuff in
CVS anyway, so I don't feel too bad about collapsing some edges.

Also, this is much simpler to implement, completely sidestepping like
the last 3 paragraphs of my last email. (Also some things I didn't
mention... like what do we do with discontinuous branches? I.e.,
branches that have 'gaps', like A -> B -> C where A and C are in some
branch but B isn't... should B be pushed to the CVS server?)


Actually, being able to commit offline while still showing my commit
history to co-workers who use only CVS is one of my motivations for
writing this beast.

Also, it might actually be required for some cases. E.g., a typical
use case for CVS-synching functionality might be that if I'm working
on a project whose upstream uses CVS, then I want to use Monotone
locally to work out my changes, and then when I'm done push them to
the CVS repo. But Monotone and CVS have very different criteria for
when a change should be committed; in Monotone it's perfectly common
to commit a change just locally, as a checkpoint while working. In
CVS, though, commits always affect everyone globally, so you have
many projects where the policy is that if you commit a broken
revision, you are a Bad Bad Person.  Even if you then immediately
commit a fixed revision.  So I will be violating upstream policy if I
simply push my full Monotone graph into CVS!


Causing Monotone to collapse edges even when not necessary (see above)
might be a later option, but it is not my initial motivation. Perhaps
some trickery will do it anyway (commit one last change (e.g. changelog,
NEWS) to the base revision, then merge the two heads).

In a different mail N.S. wrote:

Cool, I think this would be really useful for a lot of people.

:-D

Do not store state information in both trees so that syncing with
several CVS servers is possible.

I don't understand the connection between these statements.  We could
certainly store in Monotone one cert that said "this revision
corresponds to a checkin in CVS repo Foo, whose state was ...", and
another that said "this revision etc. in CVS repo Bar, whose state
was ...". Storing state seems like it could significantly reduce
complexity, be very useful for later spelunking (cf. how subversion's
CVS->SVN code stores things like CVS version numbers as metadata,
just in case they're useful later), and I don't see any immediate
drawbacks...

I will code a remote CVS import first and then try to do an update
(cvs_pull). If it turns out that this is too difficult or inefficient
unless you store marker certs in your monotone db, then I will give in.
[Actually you have to sync initially without this information anyway, so
the code has to be there.]

And to be honest I have some projects which reside on multiple CVS
servers (one is the master of course). Easing this pain seems to be
trivial with a CVS<->monotone<->CVS setup.

Preserve changelog and timestamp of every change.



And author?  Or does CVS not let you set that?  (Monotone will let
you set the author field on commits to arbitrary strings, so _that's_
no problem.)

I can preserve the author in monotone. I can't tell CVS not to use the
local user name for checkins.

syntax: monotone pull [--branch foo]
cvs://localhost/usr/local/cvsroot module[:branch]

I would strongly prefer that this functionality not overload the
meanings of push/pull/sync. Synchronizing with a CVS repository is a
significantly different process than synchronizing with another
Monotone repository. Maybe cvs_pull/cvs_push or something?


Should we mimic CVS by introducing a -d switch?

Unless someone proposes a really convincing alternative I will stay with
my syntax (I still think that its semantics are similar enough to the
original sync, and I already introduced a URL (ssh://...) on the ssh
branch).

Explicit is better than implicit. I think we should just make the
user specify the desired correspondence between Monotone and CVS
branches; the two systems are different enough that there's really no
good way to guess.  (I'd even be fine with requiring them to type
HEAD when they wanted the head branch, rather than defaulting.)


Agreed.
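
To make the "explicit, no defaulting" rule concrete, here is a small
Python sketch that parses the proposed `cvs://... module:branch`
arguments and rejects a missing branch. It is only an illustration: the
function name and the returned tuple are hypothetical, and the syntax
itself is still under discussion in this thread.

```python
from urllib.parse import urlsplit


def parse_cvs_target(url, module_spec):
    """Split 'cvs://host/path' and 'module:branch' into their parts.

    Hypothetical helper; the branch is mandatory, so the trunk has to be
    written as 'module:HEAD' instead of being guessed.
    """
    parts = urlsplit(url)
    if parts.scheme != "cvs":
        raise ValueError("expected a cvs:// URL, got %r" % url)
    module, _, branch = module_spec.partition(":")
    if not module or not branch:
        raise ValueError("expected module:branch (use HEAD for the trunk)")
    return parts.hostname, parts.path, module, branch
```

For example, `parse_cvs_target("cvs://localhost/usr/local/cvsroot",
"module:HEAD")` yields the host, the repository path, the module and
the branch, while a bare `"module"` raises `ValueError`.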

This seems like the time when keeping some state would be really
handy. What about having a cert that says "this revision corresponds
to the following files in CVS repository ___: file1  1.3 file2
1.8 ..." (I guess a problem here is what namespace to use for CVS
repositories. I guess I don't have any useful intuition here, since I
can't even think of a situation where one would want to synchronize
with two different CVS repos...)
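
One possible shape for such a cert value, sketched in Python. The text
layout, the `repository` header line and both function names are
assumptions made for illustration; nothing here is an existing monotone
cert format, and the repository naming question raised above is left
open (the sketch just takes whatever string the caller passes).

```python
def encode_cvs_cert(repo, file_revs):
    """Serialize {path: cvs_revision} for one CVS repository as text."""
    lines = ["repository " + repo]
    for path in sorted(file_revs):
        lines.append("%s %s" % (path, file_revs[path]))
    return "\n".join(lines)


def decode_cvs_cert(text):
    """Inverse of encode_cvs_cert; returns (repo, {path: cvs_revision})."""
    header, *rest = text.split("\n")
    if not header.startswith("repository "):
        raise ValueError("not a CVS sync cert")
    repo = header[len("repository "):]
    file_revs = {}
    for line in rest:
        # CVS revision numbers contain no spaces, so split from the right
        # to tolerate spaces in file names.
        path, rev = line.rsplit(" ", 1)
        file_revs[path] = rev
    return repo, file_revs
```

Because the repository name is part of the value, one revision could
carry several such certs, one per CVS server.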

Then the pull operation becomes:
 1) traverse up from the branch's heads until we find such a cert. (If
    we don't find such a cert, then we start from the beginning.)
 2) having found such a cert, we simply request deltas forward from
    each revision mentioned in the cert until the revisions in the
    current tip of the branch.
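
The two pull steps above could look roughly like this in Python. This
is a sketch over plain dictionaries, not monotone code: `parents`,
`certs` and both function names are hypothetical, and files deleted in
CVS since the sync point are not handled.

```python
from collections import deque


def find_sync_point(parents, certs, heads):
    """Step 1: breadth-first walk up from the heads to the nearest
    revision carrying a CVS cert. Returns (revision, file_revs) or None."""
    queue = deque(heads)
    seen = set(heads)
    while queue:
        rev = queue.popleft()
        if rev in certs:
            return rev, certs[rev]
        for parent in parents.get(rev, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None  # no cert found: start from the beginning


def deltas_to_request(synced, cvs_tip):
    """Step 2: list (path, from_rev, to_rev) for every file whose CVS
    revision differs from the one recorded at the sync point.
    from_rev is None for files unknown at the sync point."""
    synced = synced or {}
    return [(path, synced.get(path), tip_rev)
            for path, tip_rev in sorted(cvs_tip.items())
            if synced.get(path) != tip_rev]
```

When `find_sync_point` returns None, passing `None` as `synced` makes
every file in the CVS tip a full request, which is the "start from the
beginning" case.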

The initial pull would still be different (unless we require the
monotone branch to be empty) and more similar to my proposal (see
above).

The push command will be an alias to sync because to check into a
CVS repository you need to have an up to date copy of it. [As we
surely all know ;-)]



I'd rather not have a 'sync', and instead have 'push' fail if commits
have occurred since the last 'pull'.  This
 - matches the normal CVS semantics for update/commit
 - is much less surprising than having 'push' actually do a 'pull'
And 'sync' isn't useful anyway, because when you do a 'pull' and then
immediately do a 'push', at least one of them will always be a no-op.
(If the 'pull' is a no-op, the 'push' will succeed; if the 'pull'
actually pulls a new revision, then there will be nothing for 'push'
to do, because that revision will have no children to be pushed.)

Agreed, but a sync might issue the necessary pull of recent changes,
since the data is needed anyway to determine the last revision in CVS
for the push. (See my motivation about being able to push the whole
history.) So writing a push without pulling first is more complex
(unless you save the state of the CVS tree, of course).

To, I think, clarify this, and suggest something _slightly_ different,
here's my version of push:
 - find the latest revision that corresponds to a cvs-manifest
 - check to see whether that cvs-manifest is the tip of the branch we
   wish to sync with; if not, error out, telling the user to perform a
   pull and do some merging
 - now pick a child of that revision, commit it to the CVS server, and
   recurse

The only tricky part is choosing the children to commit; this is the
old 'pick a distinguished linear subbranch' problem.  Some strategies:
 a) pick randomly
 b) let the user choose the revision to end up with, and pick a random
    path to get there
 c) recurse only so long as there is a linear path to follow, and then
    stop when we reach the first fork
 d) check ahead to see whether there are any forks, and if there are,
    abort early and tell the user to specify explicitly which revision
    they want to push to the server (this is similar to monotone's
    'update' command).  There must be a unique (linear) path from the
    CVS tip to that revision.
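
A minimal Python sketch of this push procedure with an explicitly
chosen target, in the spirit of strategy (d). All names are
hypothetical. Note one simplification: the sketch treats any merge
revision on the way as ambiguous, which is stricter than "a unique
path exists" strictly requires.

```python
class PushError(Exception):
    """Raised when the push has to be refused."""


def plan_push(parents, synced_rev, synced_files, cvs_tip_files, target):
    """Return the revisions to commit to CVS, oldest first.

    parents        -- {revision: [parent revisions]} (assumed graph view)
    synced_rev     -- latest revision that corresponds to a cvs-manifest
    synced_files   -- {path: cvs_revision} recorded for synced_rev
    cvs_tip_files  -- {path: cvs_revision} currently at the CVS branch tip
    target         -- revision the user wants to end up as the CVS tip
    """
    if synced_files != cvs_tip_files:
        raise PushError("CVS has newer commits; pull and merge first")
    path = []
    rev = target
    # Walk back from the target; single-parent steps guarantee that the
    # path from synced_rev to target is the only one.
    while rev != synced_rev:
        revs = parents.get(rev, [])
        if not revs:
            raise PushError("%s does not descend from %s" % (target, synced_rev))
        if len(revs) > 1:
            raise PushError("%s is a merge; no single linear path" % rev)
        path.append(rev)
        rev = revs[0]
    path.reverse()
    return path
```

The caller would then commit each returned revision to the CVS server
in order, so n revisions go out in one push.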

I tend to choose c) (see above), i.e. the user has to specify which path
to take when it is ambiguous, and not the HEAD to end up with.

It seems like some desirable properties are:
 1) the user doesn't have to do n pushes to send n revisions to the
    server.  (So you want to push whole chunks of the graph at once, at
    least sometimes.)
 2) you want to be able to specify which revision ends up as the CVS
    tip
 3) you want to be able to specify exactly which revisions are
    committed (i.e. both which revision ends up as the CVS tip, and
    which path is taken to get there)

I think in practice (b) is best. The only advantage of (c)/(d) over
(b) is that they force you to specify the exact intermediate
revisions to commit, i.e. they prioritize (3) over (1). In most
cases, though, most people won't care exactly which revisions are
committed, so long as you end up with a branch tip that has all the
changes in it.  I.e., (1) is more important than (3).  So (b) is
better than (c)/(d).

(b) "Picking a random path from A to B" gives you less control and is
more difficult to realize than (c). So I will start with (c).

This still leaves the question of, if there's more than one head and
the user doesn't specify which one to end up with, do we abort and
force the user to pick one, or do we pick one randomly?

I tend to push while possible and then tell the user to specify which
path to take on the command line (using a 'heads'-like display). Iterate
that and you push as much history into CVS as possible without creating
side branches. (This also avoids the problem of specifying multiple side
branches to walk.)
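
In Python, the "push while possible" loop might look like the sketch
below. It is an illustration only: `children`, `commit` and the
function name are hypothetical, and merge revisions (children with a
second, unpushed parent) are not considered.

```python
def push_while_linear(children, start, commit):
    """Commit descendants of `start` as long as the path is unambiguous.

    children -- {revision: [child revisions]} (assumed graph view)
    start    -- revision matching the current CVS tip
    commit   -- callable that commits one revision to CVS (stand-in)

    Returns (last_pushed, candidates). An empty candidates list means a
    head was reached; otherwise the user has to pick one of them.
    """
    rev = start
    while True:
        next_revs = children.get(rev, [])
        if len(next_revs) != 1:
            return rev, next_revs
        rev = next_revs[0]
        commit(rev)
```

After the user has picked one of the reported candidates, that
candidate is committed and the loop is run again from it.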


   Christof
