[Gnu-arch-users] Why we might use subversion instead of arch.

From:	Pierce T . Wetter III
Subject:	[Gnu-arch-users] Why we might use subversion instead of arch.
Date:	Fri, 20 Feb 2004 10:45:39 -0700

Note: The purpose of this email is not to rile you guys up. It's for me to document my findings about arch, so that either you can correct my errors, or so you can improve arch, since I really liked the distributed nature of it.


 Background:

Like many people, we use CVS for our version control, because as someone said once, "CVS sucks, but it sucks less than anything else". However, Subversion is reaching 1.0 status, so I decided it was worth checking out some alternatives. Pretty much, that came down to two choices, svn and tla.


 Our setup:

We have a lot of distributed employees, and also employees who telecommute. Worst case is me in Flagstaff, AZ on dialup talking to Raleigh, NC. Our current CVS repository is about 300MB with all the history.


 Our work process:

There are two main cases: I (Pierce) tend to make lots of small incremental changes, because I do the UI. Mike tends to make lots of large changes, since he works on the backend servers and needs to change both our object model and the servers in one pass. I'll call myself "incremental_guy".

So for me, checkout, edit, update, checkin works great. So I actually am perfectly happy with CVS.

For Mike, he wants to branch, move everyone else's HEAD changes into his code, then check back in. What he does now is just have several checkouts running in parallel all the time, which is actually similar to arch. We'll call Mike "batch_guy".


 Why arch would be cool over subversion:

Since there's no concept of a "central" repository, at best a "blessed" repository, we could do stuff like the following:

Every time we code freeze for deployment, we copy "blessed" to "deployment".

When developers have changes, they merge them into the "deployment" repository if they're bug fixes for the deployment, along with "blessed" so that there is a local copy. This is really the same thing as a deployment branch, but conceptually it seems easier, and it would avoid problems we have where fixes don't quite make it into the deployment build.

If two engineers need to work together, like if "batch_guy" needs to work with "incremental_guy", no one else has to be involved; they can just merge their changes together.

Since we have a system of servers, clients, etc., most developers end up having several machines they have to keep in sync. With arch, your local test server could check out from your personal repository.


 How we would have to set up:

Well, first, every developer would end up needing to have a network accessible "master" archive. Since arch doesn't have any concept of a server process, that means setting up a WebDAV server with multiple subfolders:


   /archrepositories/incremental_guy
   /archrepositories/batch_guy
   /archrepositories/blessed
   /archrepositories/development

For the most part, developers would use the /archrepositories/development repository as "truth". You'd only need your "personal" archive if you needed to work with someone else independently of that archive.


 Now for the bad stuff:

Ok, so I tried experimenting with arch. The first thing I did was check out something from a public arch repository. I got quite a shock. Evidently, every arch repository stores the "base code", then follows that with a series of forward patches. This is quite different from most other version control systems, which store the head version as "truth" and then keep reverse patches going backwards. The net effect of this is that checking out that version required downloading not just the latest code, but all the patches in between.

That was quite a shock. For projects with lots of small changes, it probably is inconsequential, but for me, on a dialup, it would really suck. Now I read some stuff on the wiki about how you can make all that faster by making a new archive (which moves the base), but I shouldn't have to change my work process to make the version control system efficient.
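To put rough numbers on the difference, here is a toy sketch in Python. It is only a model of the two storage layouts as I described them above, not a measurement of arch, CVS, or Subversion; the function names and all the sizes are made up for illustration.

```python
# Toy model: compare bytes fetched for a fresh checkout of the latest revision.
# All sizes are made-up illustrative numbers, not measurements of any real tool.

def forward_patch_checkout(base_size, patch_sizes):
    """Base revision plus every forward patch up to the latest revision."""
    return base_size + sum(patch_sizes)

def head_plus_reverse_checkout(head_size):
    """Only the stored head; reverse patches matter just for old revisions."""
    return head_size

base_kb = 5000             # hypothetical size of the base revision, in KB
patches_kb = [20] * 400    # hypothetical: 400 commits of about 20 KB each
head_kb = 5500             # hypothetical size of the latest tree, in KB

print(forward_patch_checkout(base_kb, patches_kb))   # 13000
print(head_plus_reverse_checkout(head_kb))           # 5500
```

In this model the forward-patch cost grows with the number of commits since the base, while the head-first cost stays at the size of the current tree. That growth is what hurts on a dialup line.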

The next thing I noticed was that while CVS and Subversion let you structure your projects and sub projects via the filesystem, arch really tries to grab the whole filesystem as one unit. You can override this a bit, but it involves setting up some config files. Config files that are kind of poorly documented (based on the fact that I couldn't make heads or tails of the explanation). This makes a lot of sense for open source projects focused on a single executable, but makes much less sense for us. I suspect most people deal with this by just having lots of arch repositories:


   /archrepositories/blessed/tool
   /archrepositories/blessed/library
   /archrepositories/blessed/application

 But that would be a nightmare for us.

The next thing I found was that it was SLOW. tla is kind of brute force, and all that diff-ing, tar-ing, and compressing can take quite a while.

So at this point, while the distributed repository stuff was cool, I had to conclude that arch works best for open-source development where you don't submit code so much as you submit patch files, and you need to merge patches from multiple places. From that point of view, arch is great. From ours, ugh.


 How I would improve arch:

Fundamentally, I think that arch should store HEAD, with reverse patches, rather than START with forward patches.


   The rsync protocol would make more sense than WebDAV or FTP.

Improve the documentation. Especially needed is a section with some arch concepts, so that you don't have to pick up everything by osmosis.

While tla is ok as a low-level tool, I've observed that everyone keeps trying to replace it with a driving script. That's a good instinct. For one thing, I think that:


  user--archive--task

  is harder to read than:

 tla make-archive --id address@hidden  --name archive

 tla archive-setup --project hello-world --branch mainline --version 0.1

It would be a trivial change to tla to support passing archive names as individual parameters, but I think it would flatten the learning curve of arch. Especially since I think that if you break up the names, you can realize that it would be pretty easy to come up with standard defaults for most of these, such that you only have to type:


 tla archive-setup --project hello-world

 because branch defaults to "mainline", and version defaults to 1.0.

Or perhaps the project name could even be taken from the current working directory, so all you would need is:


 tla archive-setup

 Similarly:

  tla get --project hello-world  hello-world-Alice

Would try to get hello-world--mainline--HEAD, where HEAD is calculated such that 1.50 is known to be further along than 1.49.
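The HEAD calculation could be sketched roughly like this in Python. This assumes version strings are dot-separated integers, which is my assumption and not something tla enforces; `version_key` and `latest_version` are made-up names.

```python
def version_key(version):
    """Turn '1.50' into (1, 50) so components compare as integers."""
    return tuple(int(part) for part in version.split("."))

def latest_version(versions):
    """Return the highest version, comparing components numerically."""
    return max(versions, key=version_key)

versions = ["1.9", "1.49", "1.50", "1.5"]
print(latest_version(versions))   # 1.50
print(max(versions))              # 1.9 (plain string comparison picks the wrong one)
```

Comparing the components as numbers is the important part: a plain string sort would put 1.9 after 1.50.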


  Anyways, basically, I'm trying to make the following two points:

blah--blah--blah may be convenient to type, but it's hard to understand, especially because depending on the context, sometimes the first position is the user id, sometimes it's the project, etc. It would make a lot of sense to make the components explicit (and update the tutorial), because it would flatten the learning curve. Tla could still accept the blah--blah--blah format as a shortcut.
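As a sketch of what "explicit components plus a shortcut" could look like, here is a small Python model. It is not tla code; `ProjectName` and `parse_dashed` are hypothetical names, and the defaults are the ones suggested earlier in this message (branch "mainline", version 1.0).

```python
from dataclasses import dataclass

@dataclass
class ProjectName:
    project: str
    branch: str = "mainline"   # default proposed in this message
    version: str = "1.0"       # default proposed in this message

    def dashed(self):
        """Render the compact project--branch--version form."""
        return f"{self.project}--{self.branch}--{self.version}"

def parse_dashed(name):
    """Split 'project--branch--version', filling defaults for missing parts."""
    parts = name.split("--")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"cannot parse name: {name!r}")
    return ProjectName(*parts)

print(parse_dashed("hello-world--mainline--0.1"))
print(parse_dashed("hello-world").dashed())   # hello-world--mainline--1.0
```

The same structure could be filled from explicit --project, --branch, and --version options, so both spellings would end up as one internal name.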

tla has some naming conventions in practice, but none of them are defaults in the code. By installing those naming conventions as defaults, you can also flatten the learning curve. You can also support additional features for those defaults. For instance, one of the annoying things for me about learning tla was that it's made of lots of low-level operations, so I have to translate my high-level "what I'm doing" into a whole set of tla commands. Something like:


 tla branchstart --task "fix_for_bug" --master master_repository

--- this starts a branch off of a remote repository, with branch name fix_for_bug, version 1.0.

 tla branchupdate
      --- grabs HEAD changes from master
 tla commit --local
      --- commits changes to branch locally
 tla commit
      --- uploads changes to remote master
 tla branchdone
      --- merges changes back to mainline in the remote master

Would be much easier to understand. In fact, in general, I'd like to see all the low-level commands in tla supplanted by high-level commands based on the use cases.


 Something I'd also like to see that I implied above:

   --local commits to the local repository.

--remote commits to both the local and remote repository. While tla doesn't currently have any concept of a "master" repository, I think it makes sense that the high-level commands would support this concept: you have local archives you can commit to all the time, with a remote archive you commit to less often.
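Here is a toy Python model of the semantics I have in mind. None of this is tla behavior: `Archive` and `Branch` are made-up classes, and sending all pending local changes along with a remote commit is my own assumption about how --remote would work.

```python
class Archive:
    """A toy archive: just an ordered list of change descriptions."""
    def __init__(self, name):
        self.name = name
        self.changes = []

class Branch:
    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pushed = 0   # how many local changes the remote already has

    def commit(self, change, remote=False):
        """Always record locally; with remote=True also send pending changes."""
        self.local.changes.append(change)
        if remote:
            pending = self.local.changes[self.pushed:]
            self.remote.changes.extend(pending)
            self.pushed = len(self.local.changes)

branch = Branch(Archive("incremental_guy"), Archive("blessed"))
branch.commit("tweak button layout")             # like --local
branch.commit("fix label typo")                  # like --local
branch.commit("finish dialog", remote=True)      # like --remote
print(len(branch.local.changes), len(branch.remote.changes))   # 3 3
```

The local archive sees every commit right away, and the remote archive only catches up when a remote commit is asked for, which is the behavior a dialup user wants.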

Comments appreciated. I'm getting this list in digest mode, so if your comment is "urgent", email me directly.


pierce
