Re: [Gnu-arch-users] Some issues
From: Colin Walters
Subject: Re: [Gnu-arch-users] Some issues
Date: Wed, 09 Jun 2004 21:29:48 -0400
Now, a more thorough reply:
On Wed, 2004-06-09 at 10:03, Florian Weimer wrote:
> * The changeset format is defined relative to GNU patch and GNU
> tar. These data formats are still somewhat in flux.
Can you back this up? I have never heard of any problems.
> The changeset format does not handle binaries efficiently,
The changeset format supports them as efficiently as possible.
Changesets are intended to be used like patches are - i.e. you can send
and retrieve them as self-contained entities.
But sure, delta-compression is something that could be done with a smart
server, as has been discussed in the past. There's no reason this would
have to break backwards compatibility.
> and certain text files (e.g. XML files not created by a text
> editor and formatted for readability).
It wouldn't be hard to imagine extending the changeset format to include
a delta from a higher-level tool that knows about the file format, *in
addition* to the regular GNU diff. That way if there is a conflict, the
user's tla could optionally call out to an external program which would
make use of this information. Otherwise, they just get the plain diffs.
> * In essence, an archive consists of concatenated changesets,
> which are directly exposed in a file-based interface. This
> makes it very complex to address issues with the changeset
> format itself, and the archive interpretation might change
> when new versions of patch and tar are installed.
This is just a reworded version of your first point, which has
already been addressed.
> * Arch does not implement a distributed system. For example, its
> archive replication does not transparently handle write
> operations.
Is this just a really obtuse way of saying "cacherevs for older
revisions aren't automatically mirrored"? That's an easy-to-fix bug,
and I think it has already been fixed.
> * There is no integrated mechanism to atomically commit related
> changesets to two branches (even if these branches are
> contained in the same archive).
I can't imagine a use for this at present, but you could write a little
script to do it using lock-revision, and add a --unlocked argument to
commit which makes it assume the revision has already been locked.
Should be about 30 minutes worth of work at most.
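A rough sketch of the lock-then-commit idea, not actual tla code. The `Branch` class is an in-memory stand-in invented for illustration, and `commit_locked` plays the role of the proposed (not yet existing) `commit --unlocked`. All it shows is the ordering: take both locks first, release whatever was taken if either lock fails, and only then commit to both branches.

```python
class LockError(Exception):
    """Raised when a lock cannot be taken or is not held by the caller."""


class Branch:
    """Hypothetical in-memory stand-in for a branch whose next revision can be locked."""

    def __init__(self, name):
        self.name = name
        self.locked_by = None
        self.revisions = []

    def lock(self, owner):
        if self.locked_by is not None:
            raise LockError(f"{self.name} is already locked by {self.locked_by}")
        self.locked_by = owner

    def unlock(self, owner):
        # Only the holder may release the lock.
        if self.locked_by == owner:
            self.locked_by = None

    def commit_locked(self, owner, changeset):
        # Stand-in for a commit that assumes the revision is already locked.
        if self.locked_by != owner:
            raise LockError(f"{self.name} is not locked by {owner}")
        self.revisions.append(changeset)
        self.locked_by = None  # committing consumes the lock


def commit_to_both(owner, first, second, changeset_a, changeset_b):
    """Lock both branches, then commit to both; back out if a lock fails."""
    held = []
    try:
        for branch in (first, second):
            branch.lock(owner)
            held.append(branch)
    except LockError:
        # Release whatever was acquired so nothing is left half-locked.
        for branch in held:
            branch.unlock(owner)
        return False
    first.commit_locked(owner, changeset_a)
    second.commit_locked(owner, changeset_b)
    return True
```

In this sketch neither commit starts unless both locks were obtained, which is the property the script would be after.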
> * Categories, branches, and versions are not orthogonal at all
> and add unnecessary complexity. Future features cannot
> differentiate between them because they are used very
> inconsistently in existing archives.
This is way too vague. Do you have a concrete problem?
> * The idea to automatically subject files to revision control,
> based on regular expressions, is very hard to deal with for
> users. While being an interesting experiment, it does not lead
> to increased usability.
You don't have to use it - just use tla add, and ignore the warnings
about untagged files matching the source regexp. Or just delete the
warning in your copy of tla. Really, this is just a trivial UI issue.
> * GNU arch does not support a centralized development model
> which lacks a single, designated committer.
This has been thoroughly debunked.
> Branch creation is not versioned.
Is this a problem?
> * Branches cannot be deleted.
I don't think this should be possible.
>
> Please note that while these issues are likely too fundamental to be
> fixed in GNU arch without breaking backwards compatibility,
Actually *none* of the issues you have raised have been unsolvable
without breaking backwards compatibility, as discussion has shown.
> Implementation Issues
Most of these are just bugs, as you know.
> * The access methods for remote archives are subject to a lot of
> round trips. Therefore, archive replication using tla itself
> is very slow.
I believe pipelining is already implemented for SFTP, someone just has
to do it for HTTP.
> * The archive format optimizes for access to early versions, not
> most recent ones as one would expect. (Once the archive format
> is no longer exposed directly, this becomes an implementation
> issue, not a design issue.)
This should be solved with the backbuilder, along with a little cron job
to cache revisions.
> * The caches which compensate the previously mentioned issues
> are not expired by tla. (This includes revision libraries and,
> apparently, pristine copies stored inside a checked-out copy
> of a revision.)
Easily implemented via cron.
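For illustration, the kind of script such a cron job could run might look like the sketch below. It is an assumption-laden example, not part of tla: the function name and the idea that each cached revision is one top-level entry under a single directory are made up here, and the age test uses modification time only.

```python
import os
import shutil
import time


def expire_cached_revisions(cache_root, max_age_days, now=None):
    """Remove top-level entries under cache_root older than max_age_days.

    Age is judged by modification time. Returns the sorted names removed.
    The directory layout is hypothetical, not tla's actual library layout.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for name in sorted(os.listdir(cache_root)):
        path = os.path.join(cache_root, name)
        if os.path.getmtime(path) >= cutoff:
            continue  # recent enough, keep it
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        removed.append(name)
    return removed
```

A crontab entry would then just call this once a day with whatever directory and age limit suit the machine.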
> * Changesets are tar files. They cannot be posted easily to a
> mailing list for approval and commit; metadata tends to get
> lost.
Umm...you can post tar files to mailing lists. People do it all the
time.
> * In practice, tla requires four inodes per file in a
> checked-out project tree: one for the file, one for the file
> ID, and a pristine copy of both. This gratuitous use of
> inodes can cause problems.
What problems?
> * A checked-out revision of a branch contains at least one inode
> for each revision that was ever committed in the history of
> the branch. Long-running branches also result in huge
> directories with lots of entries.
Nope - you can delete patch logs.
> * The inventory code can create inconsistent results. For
> example, explicit tagging only overrides classification based
> on regular expression in some (but not all) parts of tla.
Just a bug, if it still exists.
> * The inventory constructor, project tree checker, and changeset
> creation code are not fully synchronized. For example, it is
> possible to commit a changeset with an inconsistent inventory,
> which is also inconsistent as a result.
Just another bug.
> * Branch creation is very cheap (a few inodes in the archive),
> but a long-running branch to which changes in a mainline
> branch are periodically merged replicates all changes on
> mainline. This means that branch maintenance costs are
> controlled by the amount of development on the branch and the
> development on the mainline, and branches are no longer very
> cheap in total. (This is an implementation issue because
> unlike other systems, merge tracking does not depend on the
> way changesets are combined in the archive. This is actually a
> very strong point of GNU arch.)
I think this would be possible to solve by using something like
"interdiff" to only store the differences relative to another changeset.
>
> * The GNU arch developers believe that it's easy for all
> developers participating in a project to publish a repository.
I don't know how arch could possibly make it easier. What do you
propose instead?
> * Genuine support for centralized development is required, but
> GNU arch is unlikely to provide it.
You keep repeating this. It is completely false.
> * The tendency to trade decreased code complexity for increased
> running time and more disk space was fine when tla got
> started, but today, it results in performance that does not
> compare favorably with optimized competitors. In addition,
> disk seek times have not improved at a significant rate, and
> the huge amount of stat operations performed by tla will
> remain a bottleneck even when developers move to larger
> machines.
Sure. The inventory code could be optimized.
> The developers seem to underestimate the need for a robust
> user interface with clear error messages
A number of these are already fixed, waiting to be merged.
> and transaction semantics (i.e. a command either fails and
> changes nothing, or it completes successfully).
tla should unlock a revision it locked when an error occurs, yes.
> tla input and output formats are currently deliberately
> incompatible with the rest of the GNU system.
Yeah, the pika encoding seems like crack to me too.
> * Redesign the changeset format, probably based on VCDIFF (RFC
> 3284). Unlike unified diffs (which are currently used by tla),
> VCDIFF deltas are one-way and not reversible when just the
> delta itself is known.
But then you propose more crack :)
> (this is not so much of a problem, tla uses changesets only in
> forward direction most of the time).
Even if tla itself didn't use changesets backwards, users will want to.
And with the backbuilder, tla will do it very often.
>
> * Provide a human-readable changeset format with complete
> metadata. This format is intended for exchange of patches over
> mailing lists and should include unified diffs.
As has been discussed a lot in the past, it would be nice.
> * Do not expose the archive format, but use a changeset server
> which implements access control (and pipelining, to cut down
> effects of network latency).
I don't think a server should be required, but it would be nice to have
as an option.
> * Project trees should not abuse the file system as a database.
> If a database is required, use a real one (such as BDB or
> SQLite), or CSV files containing multiple records, but not one
> file per record.
I think this would be nice to have too.
> * Use a file cache (with LRU logic) instead of revision
> libraries.
Why would you want that? Most of the time you're going to be comparing
complete revisions. I suppose it might be useful for a file-oriented
web-based arch browser though. Certainly a file cache doesn't replace
revision libraries.
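To make the comparison concrete, here is a minimal sketch of what a per-file LRU cache amounts to. Everything in it is illustrative: the class name, the byte budget, and the idea of keying on a (revision, path) pair are assumptions, not anything tla provides.

```python
from collections import OrderedDict


class LRUFileCache:
    """Keeps file contents in memory up to max_bytes, evicting least recently used."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key):
        data = self._entries.get(key)
        if data is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        if key in self._entries:
            self.used -= len(self._entries.pop(key))
        self._entries[key] = data
        self.used += len(data)
        # Evict from the least recently used end until within budget.
        while self.used > self.max_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used -= len(evicted)
```

Note that it works one file at a time, so comparing two complete revisions means one lookup per file, whereas a revision library hands you the whole tree at once.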