Re: Locks on the Bzr repository

emacs-devel

From:	Stephen J. Turnbull
Subject:	Re: Locks on the Bzr repository
Date:	Sun, 22 Aug 2010 02:10:56 +0900

Jan Djärv writes:

 > > Whichever way I look at it, I don't see any upside to using bound
 > > branches, but plenty of downside.  So, I am surprised that
 > > emacs-devs choose this mode of operation.
 > 
 > It was recommended here:
 > http://www.emacswiki.org/emacs/BzrForEmacsDevs. Nobody has stepped
 > up and explained in this detail a better way.

I'm beginning to remember what the discussion was.  (Stefan's right,
life would be a lot easier if Emacs just switched to git.)

The basic problem of distributed development is that there is no
physical mainline (this is *why* conflicts can occur; even with a
centralized VCS, for this purpose we can consider multiple workspaces
as virtual -- unrecorded -- branches).  However, most projects are
organized around a mainline (or a small number of mainlines), and of
course for products with a warranty, the supported versions
implicitly define a mainline in practice.  Darcs completely ignores
the concept of mainline, by viewing a version as a set (not a sequence
or even DAG) of abstract patches, and adding only minimal
dependencies.  git, hg, and bzr view versions as nodes in a graph, the
history DAG.  Traversing the DAG (for example, to produce a log)
requires choices in the presence of branching and merging (if there
are no merges, AFAIK all DVCSes ignore off-current-branch commits in
log and other branch-tracing operations like annotate).

git and hg of course must choose a traversal algorithm (or more
abstractly a way of converting the DAG to a total order), but this is
considered an arbitrary choice (more or less).  Aside from arbitrary
ordering decisions with respect to parallel branches in the DAG, they
only distinguish between merge nodes (i.e., those with multiple parents)
and non-merge nodes.  bzr, OTOH, made a deliberate decision to have a
distinguished mainline, actually, a hierarchy of them, indicated by
the order of parents of each node.
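
To make the role of parent order concrete, here is a minimal Python
sketch.  It is a toy model, not bzr's actual data structures: the
revision names and the `mainline` and `ancestors` helpers are invented
for illustration.  The only assumption taken from the text is that the
first ("leftmost") parent of each node marks the mainline.

```python
# Toy history DAG: each revision maps to an ORDERED list of parents.
# Revision names are hypothetical; parents[0] is the leftmost parent.
history = {
    "A": [],
    "B": ["A"],
    "C": ["B"],        # work on the mainline
    "X": ["A"],        # side branch forked at A
    "Y": ["X"],
    "M": ["C", "Y"],   # merge of the side branch into the mainline
}

def mainline(dag, head):
    """Follow only the first parent from head back to the root."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        parents = dag[rev]
        rev = parents[0] if parents else None
    return chain

def ancestors(dag, head):
    """All revisions reachable from head, ignoring parent order."""
    seen = set()
    stack = [head]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(dag[rev])
    return seen

print(mainline(history, "M"))           # ['M', 'C', 'B', 'A']
print(sorted(ancestors(history, "M")))  # ['A', 'B', 'C', 'M', 'X', 'Y']
```

Both functions see the same graph.  Only `mainline` cares which parent
is listed first, and swapping the parents of M would put X and Y on
the mainline instead of B and C.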

Another important design decision in bzr (similar to hg and darcs) is
that the basic model is workspace == branch == repository.  (git
denied this model from the get-go.  git was always designed to manage
a general DAG and from the very beginning allowed different workspaces
to share an object database.)  The idea is to simplify life for the
user by having one active head per workspace, and the user switches
branches by cd'ing to the workspace with the appropriate version
checked out.  However, having only one active head interacts with the
distinguished mainline in, IMO, an unfortunate way: every new commit
is automatically on the mainline!  That is, the mainline is
*workspace-relative*.[1]

Having each workspace contain a mainline means that a decision has to
be made in the case of a non-trivial merge.  Which branch's unique
commits are on the mainline?[2]  bzr's design decision is that the
*target* of the merge ("merge source into target") is the new
mainline.  This is consistent with the workspace-relative notion of
mainline that bzr was forced to adopt.[3]  In a hierarchical workflow
like the Linux kernel[4] where changes always flow from contributors
"up" a hierarchy of lieutenants (each with their own "local" mainline)
to the master integrator of "the" mainline, the workspace-relative
notion *implemented* by bzr corresponds closely to the *socially
effective* mainline of the project.

Unfortunately, this idealized workflow can only be realized if
integration is fast enough that contributors can resync to "the"
mainline after their "old" work is committed to "the" mainline, but
before committing new work to the local branch.  (Alternatively, they
can commit whenever they like, but use rebase to ensure that their
workspace DAGs are compatible with the mainline DAG.)

What happens in practice is that contributors are impatient to get new
content from "the" mainline (and perhaps from other "more central"
branches).  So they merge "the" mainline back into their local
branches and start work again.  In bzr, this is *fundamentally wrong*.
For Emacs, socially, the Savannah trunk is the mainline, and
contributors think of it that way.  *But bzr doesn't.*  bzr thinks that
this branch labelled "trunk" is just another branch to be merged into
the mainline, *which by design lives in the local branch*.  The social
mainline and the bzr mainline *must* be different!  (Except in the
case of a project run by a benevolent dictator, and even then, they
only coincide for the dictator.)
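
A small Python sketch of why the direction of the merge matters.  This
is a toy model under the assumption stated above (the merge target
becomes the first parent); the revision names T1, T2, L1, M and both
helper functions are hypothetical, not anything bzr exposes.

```python
def first_parent_chain(dag, head):
    """Walk first parents only: the mainline as seen from head."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        parents = dag[rev]
        rev = parents[0] if parents else None
    return chain

def merge(dag, name, target_tip, source_tip):
    """Return a copy of dag with a merge commit added.

    The target's tip is recorded as the first parent, so the target's
    history stays on the mainline.
    """
    merged = dict(dag)
    merged[name] = [target_tip, source_tip]
    return merged

# Trunk has T1 -> T2; the contributor branched at T1 and committed L1.
base = {"T1": [], "T2": ["T1"], "L1": ["T1"]}

# Contributor merges trunk INTO the local branch (local is the target).
local_view = first_parent_chain(merge(base, "M", "L1", "T2"), "M")

# The same merge done in a workspace whose mainline is the trunk.
trunk_view = first_parent_chain(merge(base, "M", "T2", "L1"), "M")

print(local_view)  # ['M', 'L1', 'T1']
print(trunk_view)  # ['M', 'T2', 'T1']
```

In the first case the trunk commit T2 has dropped off the mainline,
even though the content of the two merges is the same.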

So here is the primary role of the bound branch in the workflow
recommended by BzrForEmacsDevs.  It always has the socially-determined
mainline as its bzr mainline.  If you commit something there, it
automatically is committed to the trunk repo as well.  If the trunk
repo has advanced, you can't commit in your bound branch.  You need to
update the workspace and remerge your changes, first.  (The remerge is
automatic unless the update creates conflicts.  Note that you still
need to commit your changes.)
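
The commit rule for a bound branch can be sketched as a toy Python
model.  Everything here is an assumption made for illustration: the
`Master`, `BoundBranch`, and `OutOfDate` names are invented, history
is flattened to a list, and uncommitted working-tree changes (the part
that gets remerged on update) are left out entirely.

```python
class OutOfDate(Exception):
    """Raised when the master has revisions the bound branch lacks."""

class Master:
    def __init__(self):
        self.revisions = ["r1"]

class BoundBranch:
    def __init__(self, master):
        self.master = master
        self.revisions = list(master.revisions)

    def update(self):
        # Bring the local copy up to date with the master.
        self.revisions = list(self.master.revisions)

    def commit(self, rev):
        # Refuse to commit if the master has advanced since the last update.
        if self.revisions != self.master.revisions:
            raise OutOfDate("master has advanced; update first")
        self.master.revisions.append(rev)  # the master gets the commit first
        self.revisions.append(rev)         # then the local branch records it

master = Master()
alice = BoundBranch(master)
bob = BoundBranch(master)

alice.commit("r2")           # goes straight to the master
try:
    bob.commit("r3")         # bob has not seen r2 yet
    refused = False
except OutOfDate:
    refused = True

bob.update()                 # bob catches up with the master
bob.commit("r3")
print(refused, master.revisions)  # True ['r1', 'r2', 'r3']
```

The point of the model is only the ordering: the check against the
master happens before anything is recorded locally, so the master's
history is never rearranged by a stale workspace.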

You can achieve the same effect by pulling from trunk into your
unbound integration branch (the branch from which you push) before
doing the merge-commit-push dance there.  The bound branch does not do
an implicit update before merge or commit, although it does do an
implicit push after commit.  So the BzrForEmacsDevs bound branch
workflow might save you some typing, and since it always tries to
commit to the master branch first, you can't screw up the mainline.[5]

Footnotes: 
[1]  In git, OTOH, a record is kept of the source of a branch, and for
remote branches a tracking branch is established to represent the
mainline of that branch.  "The" project mainline is conventionally
called "master" (and of course in workspaces has a tracking branch).
In practice, this works well with a bit of discipline for most people.

[2]  In a DAG-based VCS, it doesn't really make sense to build a
mainline out of some commits from one branch and some from the other,
because that requires breaking the parent-child relationships.

[3]  It's not quite true that there was no alternative.  An
alternative would be that commits in a branch are off-mainline, and
the user makes an explicit decision as to which of the branches being
merged is the mainline at the time of the merge.  git actually had a
UI like this (all parents had to be explicitly stated in a merge
command), but it was quickly abandoned.  (Now HEAD is implicitly the
leftmost parent in all git merges.)

[4]  As generally understood; in fact it really needs to work quite
differently.

[5]  In the current configuration of the Savannah trunk, you can't
screw up the mainline anyway.  Your push won't be accepted if it would
cause the trunk mainline to be subordinated to your mainline.
However, this is likely to confuse users; the bound branch is easier
to understand.