gnu-arch-users

Re: [Gnu-arch-users] git foo


From: Martin Langhoff
Subject: Re: [Gnu-arch-users] git foo
Date: Fri, 14 Oct 2005 09:14:06 +1300

On 10/14/05, Thomas Lord <address@hidden> wrote:
> Yes, name-independent file ids are critical to rename-handling
> smart merging.  The arch inventory system is a *pretty good*
> framework for that and there seem to be no serious
> other contenders.

As a user, Arch's inventory system barely gave me basic rename
handling and was a huge inconvenience. It didn't handle anything that
happened outside of its view, and it didn't manage many things that
happened within its view, but were too "corner-casey". In short, it
was a burden and barely a tool.

Now, consider these 2 scenarios, neither of them theoretical, as both
have happened to me recently.

New file scenario.

I make a checkout of upstream project foo, which I'm excited about. I
add a script (new feature), hack on it over perhaps 10 commits. Looks
ready, so I post it to the mailing list. In the next few hours, I have
a couple of doh! moments and fix a couple of corner cases in the
script, 3 more commits. Maintainer likes the file I've posted, and
adds it to his repo.

Now I brace myself, I just know that update+merge is going to throw a
conflict. After all, I just posted the file to the ml, without much
metadata... Oops, seems I'm all wrong! GIT merges it cleanly and knows
that there is no conflict, and that there are 3 commits that aren't
upstream. It has seen that I had the new file already, and walked the
file history and figured out the right thing to do. It still knows my
history is different from upstream from the moment I added the early
version of the file, but if I ask it "where did we diverge" after the
merge, it knows that the _trees_ diverged 3 commits ago (or converged
4 commits ago) regardless of history.

And the best part of this is that it does not depend on both sides
using the same SCM. If upstream is using CVS, and I am just using a
cvs2git gateway (which I'm doing for several projects), the merge to
my local branch has the exact same smarts.

The "patches echoed back" scenario

Now that I've told you that I track upstreams with a cvs2git gateway,
you can imagine that I maintain customized projects inhouse, and push
some patches upstream. Sometimes I have cvs access myself, sometimes I
just post them in bugtrackers and hope for the best.

Patches that are applied "as is" upstream are detected cleanly by git
as "already applied". Of course, modified patches conflict, just as
I'd expect. If the patch was modified upstream, I probably need to
back out my patch and merge upstream's version, or at least decide on
the matter.
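
To make the "already applied" behaviour concrete, here is a minimal
sketch of a three-way merge decision, done per file on whole contents.
It is an illustration only, not git's implementation: the function name
merge_file is made up for this example, and a real merge also works at
the level of hunks inside a file. The point is that the decision needs
nothing but the three contents, no patch ids or file ids.

```python
def merge_file(base, ours, theirs):
    """Whole-file three-way merge (hypothetical helper, for illustration).

    Each argument is the file content as bytes, or None if the file
    does not exist on that side. Returns (result, conflict).
    """
    if ours == theirs:
        # Both sides ended up with the same content, e.g. upstream
        # applied my patch as is: nothing left to merge.
        return ours, False
    if ours == base:
        # Only upstream changed the file: take upstream's version.
        return theirs, False
    if theirs == base:
        # Only I changed the file: keep my version.
        return ours, False
    # Both sides changed it, and differently: a human has to decide.
    return None, True


base = b"line one\n"
my_patch = b"line one\nline two\n"
edited_upstream = b"line one\nline 2\n"

print(merge_file(base, my_patch, my_patch))         # patch applied as is
print(merge_file(base, my_patch, edited_upstream))  # patch modified upstream
```

The first call merges cleanly and the second reports a conflict, which
matches the two cases described above.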

My conclusions as a user:
 + the only identity of a patch is its content,
 + the only identity of a file is its content,
 + arbitrary identities (as in tla's inventory) are a pain to
maintain, brittle and of limited use
 + a _lot_ of interesting things happen outside the SCM. Even for a
heavy SCM user like me, any SCM that assumes that every code change
must travel through its pipes is broken, especially in the FOSS space.
A user posts a patch to a ml, and 3 co-maintainers merge it into their
trees untouched, a 4th edits it a bit, and a 5th is on holiday and
doesn't merge anything. When the maintainers cross-merge their repos,
git will recognize what's happened. It's not a corner case, it's your
normal FOSS bazaarish workflow. Enforcing strict SCM control of all
bits of source code is for heavily structured code development houses
with thick operations manuals.
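
As a small illustration of "the only identity of a file is its
content": git names a file's contents (a blob) by hashing a short
header plus the bytes themselves. The sketch below assumes the
traditional SHA-1 object format; blob_id is a name chosen for this
example, not a git API. The file name never enters the hash, so the
same bytes get the same id whether they arrived by commit, by a patch
on a mailing list, or through a CVS gateway.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Id git gives to file contents, assuming SHA-1 object format."""
    # Object header: type, a space, the size in bytes, then a NUL byte.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Two copies of the same script under different names and histories
# (hypothetical paths, used only for this example).
mine = {"path": "scripts/foo.pl", "content": b"print 42;\n"}
upstream = {"path": "contrib/foo.pl", "content": b"print 42;\n"}

# Same bytes, same id: the path plays no part.
print(blob_id(mine["content"]) == blob_id(upstream["content"]))  # True

# The well-known id of the empty blob.
print(blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

For a file on disk, running git hash-object on it should print the
same id as blob_id applied to its bytes, under the same SHA-1
assumption.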

(note: I'm still using arch a lot, and I'm very thankful for Tom's work
in the SCM space; we wouldn't be here without his efforts. OTOH, the
more I see git in real life usage, the more I disagree with
inventories and all-knowing SCMs.)

off to work now,


martin
