gnu-arch-users

Re: [Gnu-arch-users] Re: give us a hand with arch


From: Andrea Arcangeli
Subject: Re: [Gnu-arch-users] Re: give us a hand with arch
Date: Sat, 27 Sep 2003 01:50:58 +0200
User-agent: Mutt/1.4.1i

On Fri, Sep 26, 2003 at 10:14:46AM +0100, Paul Hedderly wrote:
> On Fri, Sep 26, 2003 at 02:25:02AM +0200, Andrea Arcangeli wrote:
> > 
> > > Explicit tagging is by far the best choice.
> 
> Noooo use tagline - as it defaults to explicit when there aren't
> taglines.

Sorry, but I think tagline is the worst mode. I would rather have a
merge fail than see a file called hello.c carry a comment saying me.c
that could stay there forever just because ages ago the file was called
me.c. How ugly is that when you go and read the code? Also, I'm unsure
what happens if you move the file across directories.

The point is to merge the patches regardless of what's inside the files.
They could be partly binary files (and I think patch works on them),
comments may not be allowed in an /etc/ conffile, whatever; we may even
get a namespace collision on the magic arch-tag name.

Polluting all the new files sounds very very bad.

I understand this is the preferred mode suggested in the docs, but I
simply think it's not good for any real big project where stuff will get
renamed. In a small project you basically almost don't care about
renames or name space collisions.

> > I also would like a way to *enforce* it, I would like that the commit
> > would ignore everything without a tag (maybe it just work that way, so
> 
> Edit the {arch}/=tagging-method file. There are regexes there that help
> arch define what is treasured source code, what is junk (and ignored),
> what is unknown etc. Very easy to make it do what you want it to.

You're still thinking of a small project, methinks.

With the kernel there are tons of exceptions. Somebody wrote a dontdiff
(writing the dontdiff is the very same problem as defining what is junk),
but it has to be maintained and it keeps breaking all the time. See my
last version below; I had to stop using it because it lets garbage
through every once in a while, which forces me to check every diff
anyway. The list below likely won't work for 2.5, for example. The
kernel is a big moving target.

*~
*.a
*.o
*.orig
*.rej
*.ver
*.bin
*_MODULES
*cscope*
.*
53c8xx_d.h
CVS
System.map*
asm
autoconf.h
bbootsect
bbootsect.s
bootsect
bootsect.s
bsetup
bsetup.s
btfix.s
btfixupprep
build
bvmlinux
bvmlinux.out
bzImage
classlist.h
comp*.log
compile.h
config
conmakehash
consolemap_deftbl.c
devlist.h
dummy_sym.c
filelist
gen-devlist
gen-kdb_cmds.c
gentbl
kconfig.tk
ksym.c
ksym.h
lxdialog
make_times_h
map
map.out
mkdep
modversions.h
promcon_tbl.c
setup
setup.s
sim710_d.h
sm_tbl*
split-include
tags
times.h
tkparse
version.h
vmlinux
vmlinux.out
zImage
fore200e_pca_fw.c
fore200e_mkfirm
zImage
net/802/cl2llc.c
net/802/pseudo/pseudocode.h
net/802/pseudo/actionnm.h
net/802/transit/pdutr.h
net/802/transit/timertr.h
net/khttpd/times.h
drivers/char/consolemap_deftbl.c
drivers/char/defkeymap.c
drivers/scsi/53c8xx_d.h
drivers/scsi/53c8xx_u.h
drivers/scsi/53c7xx_d.h
drivers/scsi/53c7xx_u.h
drivers/scsi/sim710_d.h
drivers/scsi/sim710_u.h
drivers/sound/maui_boot.h
drivers/sound/msndperm.c
drivers/sound/msndinit.c
drivers/sound/pndsperm.c
drivers/sound/pndspini.c
drivers/sound/pss_boot.h
drivers/sound/pss_boot.h
drivers/sound/trix_boot.h
drivers/sound/trix_boot.h
drivers/atm/fore200e_pca_fw.c
drivers/atm/fore200e_sba_fw.c
arch/alpha/boot/ksize.h
arch/mips/tools/offset.h
arch/mips/baget/dummy.c
arch/mips/baget/balo.h
arch/mips/orion/initrd.c
arch/ppc/kernel/ppc_defs.h
arch/m68k/kernel/m68k_defs.h
arch/arm/include/asm-arm/mach-types.h
arch/arm/lib/constants.h
arch/ia64/tools/offsets.h
arch/ia64/tools/offsets.h
arch/mips64/tools/offset.h

This has no way to scale. The above has to be part of arch too in order
to work distributed; it can't be local knowledge, because it changes
remotely too.
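
To make the maintenance problem concrete, here is a minimal sketch in
Python of how such a pattern list classifies paths. It is not how arch
or diff implement it; is_junk and PATTERNS are hypothetical names, and
the matching rule (a pattern with a '/' is compared to the whole
relative path, any other pattern to each path component) is an
assumption made for illustration.

```python
import fnmatch

# A few entries taken from the list above; the real list is much longer.
PATTERNS = ["*.o", "*.orig", ".*", "vmlinux", "drivers/scsi/53c8xx_d.h"]


def is_junk(path, patterns=PATTERNS):
    """Return True if a '/'-separated relative path matches any pattern.

    Assumed rule: patterns containing '/' match the whole path,
    all other patterns match any single path component.
    """
    parts = path.split("/")
    for pattern in patterns:
        if "/" in pattern:
            if fnmatch.fnmatchcase(path, pattern):
                return True
        elif any(fnmatch.fnmatchcase(part, pattern) for part in parts):
            return True
    return False
```

With this sketch, kernel/sched.o and drivers/scsi/53c8xx_d.h are junk,
but a newly generated header that nobody listed yet (say a hypothetical
drivers/scsi/newdrv_d.h) is treated as source. That is exactly the case
a local pattern list cannot catch.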

The only thing that is supposed to know about this is 'make distclean'.
The above absolutely must not be checked into the tree. So if you are
ok to do:

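        cp -al arch-tree arch-tree.old   # hard-linked copy of the dirty tree
        cd arch-tree
        make distclean                   # remove everything the build generated
        tla commit
        cd ..                            # leave the tree before removing it
        rm -r arch-tree
        mv arch-tree.old arch-tree       # put the dirty tree back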

then yeah, I will be able to work around it with current arch, but
please don't ask me to script the above with an alias ;). And the above
is no more reliable than distclean: every now and then during
development somebody forgets to add a file to the distclean list, and
having a revision control system is useful exactly to catch those cases:
to avoid shipping the garbage in a tarball too and to track the
garbage reliably during development.

This is a fundamental feature I expect from a revision control system.

Let's forget the tagging. What I need is a reliable checkin and a
reliable way to identify the garbage; it can't be a local-knowledge
regexp.
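
A rough sketch of the strict behaviour asked for here, in Python. It
does not call tla and is not an existing arch feature: list_tree,
untagged_files and strict_commit are hypothetical names, and how the
set of explicitly tagged files is obtained is deliberately left out.
The only rule it encodes is that a commit is refused as soon as the
tree contains a file nobody explicitly added.

```python
import os


def list_tree(root, skip_dirs=("{arch}",)):
    """Return every file under root as a sorted list of relative paths."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def untagged_files(tree_files, tagged_files):
    """Files present in the tree that were never explicitly added."""
    return sorted(set(tree_files) - set(tagged_files))


def strict_commit(tree_files, tagged_files):
    """Refuse to commit while any untagged file is lying around."""
    stray = untagged_files(tree_files, tagged_files)
    if stray:
        raise RuntimeError(
            "refusing to commit; untagged files: " + ", ".join(stray))
    return sorted(tagged_files)
```

No pattern list is involved: anything not explicitly added is garbage
by definition, so there is nothing local to maintain.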

And once I have this 'strict' behaviour in the commit and inventory
commands, I don't need to care about the tagging anymore, since
explicit tagging will come naturally or the tree will simply not check
in. The id can simply be the md5sum of the file, xored with the
gettimeofday microseconds (called once for each character in the
md5sum), xored with the first X bytes from /dev/urandom, xored with the
md5sum of my_id. That can be slow; it only happens during a
`tla add/rm/mkdir/whatever vfs op` and during the initial import
anyway. This way the probability of collisions with files added in
other trees will be almost zero, and we basically guarantee perfect
merging while at the same time being strict and not risking merging
garbage.
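
The id scheme above can be sketched in Python as follows. This is an
illustration under stated assumptions, not arch code: make_file_id is a
hypothetical name, os.urandom stands in for reading /dev/urandom, "X"
is taken to be the length of the md5 digest, and the clock is read once
per digest byte rather than once per hex character.

```python
import hashlib
import os
import time


def make_file_id(content: bytes, my_id: str) -> str:
    """Build a 128-bit file id as a 32-character hex string."""
    file_md5 = hashlib.md5(content).digest()            # 16 bytes
    owner_md5 = hashlib.md5(my_id.encode("utf-8")).digest()
    random_bytes = os.urandom(len(file_md5))            # stand-in for /dev/urandom
    out = bytearray()
    for f, o, r in zip(file_md5, owner_md5, random_bytes):
        # One clock reading per byte; keep the low 8 bits of the microseconds.
        micros = (time.time_ns() // 1000) & 0xFF
        out.append(f ^ o ^ r ^ micros)
    return out.hex()
```

Since xoring with independent, uniformly random bytes already gives a
uniformly random result, the random term alone carries the collision
resistance; the other inputs are kept only to mirror the description.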

This behaviour sounds an order of magnitude better than the current tag
methods and the current, way too permissive, checkin, and I believe this
feature is definitely needed in order to deploy arch for huge trees
with tons of people working on them, like the l-k.

> > explicit tagging there is no way to know what is supposed to go in or
> > not, so the behaviour I experienced is acceptable with names, but with
> > explicit tagging turned on, then people are supposed to do
> > renames/additions/removals with metadata updates, and as such we can be
> > strict. I like being strict. With a tree like the kernel with that many
> 
> Even better - if we start adding taglines (unique, unchanging tags) such
> as:
> 
> /*
>  * arch-tag: DO_NOT_CHANGE_b85434bac70c3f186e66c110805ec317
>  */
> 

How ugly can that be? It's a totally broken idea to mix metadata with
the data. Files are there to be nice to read. We must not pollute our
eyes with metadata; that's why all the metadata is in {arch}, so that
an ls too gets the least possible pollution. I also dislike the emacs
indentation metadata at the end of a file.

Sorry, but I think the tagline is the very worst method. I'd sure prefer
names to tagline ;).

It can work well in a small project where you all use arch, but with
linux and a multitude of users using all sorts of different tools the
data has to be clean, and the checkins have to be strict IMHO.

> > One last issue, is there a way to give a symbolic tag to files? The way
> > I understood it, there isn't and  I've to create a second branch and tag
> > to the previous branch. Is that correct?
> 
> If I understand you... then you're correct. You're looking through CVS
> glasses. In the arch world branches are cheap and easy, so use them.

yep I totally agree on this ;) Actually it's not just a cvs problem,
it's a problem for all others, I think including bitkeeper too. 

I'm guessing that creating a branch in bitkeeper must be very expensive
compared to arch, because one of the few things I know for sure about
bitkeeper is that its internal format is the obsolete per-file method,
SCCS. Again, I believe Larry made a huge mistake in basing a new
product, which requires brand new features that act on a whole tree, on
an obsolete per-file format not meant to manage a tree as a whole.

Working with patches is a much superior approach.

SCCS, like RCS, was never built to deal with more than one file.
However, while CVS lives with it, I guess Larry made it work on the tree
as a whole, but it must be suffering badly from the old legacy per-file
architecture, compared to arch.

This is one more reason cloning bitkeeper (as IMHO _not_ at all wisely
suggested once by RMS) would be a very _bad_ idea, not just for the
total wasted time in reverse engineering the protocol that could be
invested in doing something much more productive, but now also because I
feel like its core design in how it commits the data is obsolete.

> > In cvs people tends to use tags for important events like releases,
> > and losing them during the conversion would be bad. Now the night hack
> 
> Branch.
> 
> > didn't care about tags at all, I wonder if there's a way to retain the
> > tags. I guess the simplest way would be to be able to tell the converter
> 
> Branch... :O)

;)

> The Category and Branch are just for you to organise your archives
> nicely. Even the Version isn't really used much - except to get the
> latest version of Cat--Branch if you don't specify a version.

agreed.

Andrea - If you prefer relying on open source software, check these links:
            rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.[45]/
            http://www.cobite.com/cvsps/
            svn://svn.kernel.org/linux-2.[46]/trunk



