gnu-arch-users

Re: [Gnu-arch-users] What are version numbers?


From: Tom Lord
Subject: Re: [Gnu-arch-users] What are version numbers?
Date: Fri, 12 Sep 2003 08:51:16 -0700 (PDT)


Just as a point of interest: there is another subtle yet, I think,
significant way in which the structured namespace pays off:
disjointness and approximate disjointness with other namespaces.

For example, the set of archive names and the set of version names are
disjoint: the two sets have empty intersection.  Consequently, if the
argument to a command (for example) can be one or the other -- the
syntax lets you tell which has been provided.  (Similarly, it's a
vestigial _weakness_ of the design that the set of category names and
the set of package-branch names (e.g. "category--branch") are _not_
disjoint.)

Any unix filename can be written so that it doesn't pun as a version
or archive name.    

Most distributions (e.g. *.tar.{gz,bzip}) have names which cannot be
mistaken for archives or versions.

In an email message, if I write some--string--3.1 you know at once
that that is the name of an arch version.
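
The disjointness can be sketched in a few lines of Python.  This is
only an illustration, not tla's actual grammar: the regular
expressions and the `classify' helper are assumptions made for the
example, including the assumption that an archive name always contains
an "@" (an email-style address followed by "--" and a suffix) while a
version name never does.

```python
import re

# Simplified, assumed shapes (not tla's real parser):
#   archive:  <email-style-address>--<suffix>   e.g. jane@example.com--2003
#   version:  <category>--<branch>--<id>        e.g. some--string--3.1
NAME = r"[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*"
VERSION_ID = r"[0-9]+(?:\.[0-9]+)*"

ARCHIVE_RE = re.compile(r"^[^@\s/]+@[^@\s/]+--[^\s/]+$")
VERSION_RE = re.compile(rf"^{NAME}--{NAME}--{VERSION_ID}$")


def classify(arg):
    """Say which namespace a command-line argument belongs to."""
    # The two patterns cannot both match: one requires an "@",
    # the other forbids it, so the order of the tests is irrelevant.
    if VERSION_RE.match(arg):
        return "version"
    if ARCHIVE_RE.match(arg):
        return "archive"
    return "other"
```

Under these assumptions, `classify("some--string--3.1")' yields
"version", an archive-shaped name yields "archive", and a filename
written as "./some--string--3.1" falls through to "other", which is
the sense in which a filename can always be written so that it does
not pun.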

It has been pointed out that buildcfg takes advantage of the ordering
of versions and structure of names.   That's true.

So do the commands: 

        push-mirror
        abrowse (and branches, versions)
        archive-snapshot
        logs
        merges
        (ls-)pristines

and probably others.

In my experience working with a significant number of archives, the
category--branch--version structure has paid off very nicely, allowing
me to navigate unfamiliar archives with ease, restrict the scope of
push-mirror easily, and so forth.

Along those same lines, I think that the structure works out
pleasantly in browsers and the only replacements I can imagine (e.g.,
a series of per-project regexps that describe how free-form names are
structured) seem excessively complex to me.

And, again, one of my goals for arch is to make it a good tool for
"programming in the large" -- for tasks such as managing a complete
GNU/Linux distribution using far less labor than is currently
required.  Arch isn't the entire solution to that, obviously -- but I
think that it's a good contribution.  Uniform, structured naming of
development lines and revisions is critical to such a task.  Far from
seeing the arch namespace de-structured, I look forward to its spread
and integration into other process tools, such as bug trackers,
testing tools, update managers, patch queue managers, system auditing
tools, and so forth.  And yes, I know, "you'll never get all those
projects to agree on that" but I also know that that's only true until
it isn't and then it isn't true anymore.  New conventions have spread
in the past and new conventions will spread in the future.  (Can one
_force_ new conventions to spread?  Of course not -- it's a crap shoot
-- but one can at least try to load the dice.)

The idea of supporting options (such as category--version--branch)
would add complexity to the code, for very little practical gain.

I find this supposed argument for free-form names silly:

    > # create a project
    > tla archive-setup super-project--devel--0.9

    > Now I can create a project within the same category but totally unrelated:
    > tla archive-setup super-project--devel--1.0

    > The only relation betwen them is the name I choose. Arch does
    > nothing with the category to force the projects to be
    > related. It's you who, with the names you choose, create that
    > relation. Nevertheless, you are forced to create
    > cat--branch--version for no gain,.


If these two projects are "totally unrelated" and if the version
number field of a name has no value, then why in the world did you
give them names that differ only in the version number field?

Perhaps by "totally unrelated" you mean that they have no common
ancestry.   Well, indeed!  There are _two_ graphs of revisions: one
extensional and one intensional.   The extensional graph (as displayed
by `ancestry', for example) shows you how each revision is constructed
by a path from a base revision through changes and tags.   The
intensional graph (as displayed by abrowse) is there to capture
information that could not possibly be discovered from the ancestry:
it's there to record the intension of arch users about how a big heap
of revisions is organized into _logical_ development lines and how a
big heap of logical development lines is organized into projects.

Let's make that concrete.    ACME corp may start and release a 
`super-project 1.0'.    They create the arch version:

        super-project--devel--1.0

with an imported base-0 and a series of simple changesets, eventually
releasing their 1.0 project.   Then they make:

        super-project--devel--2.0

which starts with a tag of 1.0 and adds changesets; eventually they
release version 2.0.

So far, the extensional and intensional categorization are essentially
the same.

But now version 2.0 flops -- way too many bugs.   For 3.0, ACME corp
has acquired MEGA corp and intends to scrap their old code base, but
turn one of MEGA's products-in-development into Super Project 3.0.
They create

        super-project--devel--3.0

whose base-0 is a fresh import of MEGA corp's code.

Now the two graphs have diverged.  Extensionally, 2.0 and 3.0 are
unrelated.  Intensionally, the 3.0 development line follows the 2.0
line -- there is a continuity of intension.
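
The divergence of the two graphs in the ACME example can be sketched
as follows.  The `ANCESTRY' table and both helper functions are
hypothetical, written only to illustrate the distinction; they do not
reflect how arch stores or computes either graph.

```python
# Extensional graph: what each version's base-0 was actually built from.
ANCESTRY = {
    "super-project--devel--1.0": None,                         # fresh import
    "super-project--devel--2.0": "super-project--devel--1.0",  # tag of 1.0
    "super-project--devel--3.0": None,                         # import of MEGA's code
}


def version_key(name):
    """Split category--branch--id into a sortable key."""
    category, branch, vid = name.split("--")
    return (category, branch, tuple(int(p) for p in vid.split(".")))


def intensional_predecessor(name, names):
    """Previous version in the same category--branch, judged by name alone."""
    category, branch, vid = version_key(name)
    earlier = [
        n for n in names
        if version_key(n)[:2] == (category, branch) and version_key(n)[2] < vid
    ]
    return max(earlier, key=version_key) if earlier else None
```

For 2.0 the two answers agree (both point at 1.0).  For 3.0 the
ancestry lookup gives nothing, while the name-based lookup still gives
2.0.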

The intensionality shows up in arch-related interfaces all the time --
not in big dramatic ways, but just in quiet little ways that add up.
`abrowse' output is usefully sorted, for example.  `push-mirror' can
be restricted to just one of the three logical development lines, to
name another example.
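
As a rough sketch of why a structured name makes that kind of
restriction cheap: a limit can be compared component by component
against each name.  The `in_scope' function below is a hypothetical
helper written for this illustration, not push-mirror's actual
implementation.

```python
def in_scope(name, limit):
    """True if `name` falls under `limit`.

    `limit` may be a category, a category--branch, or a full
    category--branch--version; comparison is per component, so a
    category that merely starts with the same characters is excluded.
    """
    name_parts = name.split("--")
    limit_parts = limit.split("--")
    return name_parts[:len(limit_parts)] == limit_parts
```

With free-form names, the equivalent test would need a per-project
rule for where one component ends and the next begins.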

One can argue, and not be absolutely wrong, that the intensional graph
is "policy".  It is not overwhelmingly useful for _some_ projects that
want to use revision control -- that's true.

But let's not forget three things:

(a) For non-trivial, longer-lived software projects -- the structuring
    of the namespace in arch accurately reflects what are _by_far_ the
    most common patterns of how work is organized (and have been for
    decades, at least).

(b) For projects that _really_ don't need branch labels and version
    ids, it's a simple matter to not use them.

(c) We're about one quarter year away from the second anniversary of
    the release of arch.  A subset of each wave of new users finds the
    namespace a fun topic to complain about and design kibitz.
    Mostly, it tends to fade away.  Most speculation that this issue
    is really a _serious_ obstacle for new users is not borne out
    empirically.

    There has been an important exception to those "most"s: support
    for N-component version ids.  It used to be that all version ids
    had to have two components, "MAJOR.MINOR" (e.g. 3.1).  Now they
    can have any positive number of components (e.g., 3, 3.1.4.1,
    etc.).  So I don't mean to shut down the topic or discourage
    kibitzing entirely -- just, perhaps, to improve its quality a
    bit.
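
    To illustrate how N-component version ids can still be ordered,
    here is a minimal sketch that compares them numerically, component
    by component.  The function name and the choice to treat a shorter
    id as sorting before any longer id it is a prefix of are
    assumptions of this example, not a statement of how tla orders
    versions.

```python
def parse_version_id(vid):
    """Turn "3.1.4.1" into (3, 1, 4, 1); reject empty or non-numeric parts."""
    parts = vid.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a version id: {vid!r}")
    return tuple(int(p) for p in parts)


def sort_version_ids(vids):
    """Order ids numerically per component, so 10.0 follows 3.1."""
    return sorted(vids, key=parse_version_id)
```

    A plain string sort would put "10.0" before "2.9"; comparing the
    parsed tuples avoids that.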


-t



