emacs-devel

Re: Goals for repo conversion day


From: Eric S. Raymond
Subject: Re: Goals for repo conversion day
Date: Sat, 25 Jan 2014 16:01:32 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

Eli Zaretskii <address@hidden>:
> > *I* object to this.  On the grounds that I've been through this dance
> > many times before, and know that such out-of-band representations
> > generally cost more hassle and deliver less than people expect when
> > they think them up.
> 
> With all due respect, this is not necessarily good enough.  You have
> come to the project offering help, but no one gave you the right to
> make unilateral decisions about these issues.  It won't be you who
> will need to use this data in the years to come.  And whatever the
> other projects which you converted in the past, I doubt that any of
> them had as long and complex history as Emacs.

Your doubt is justified.  None of my previous conversions have been 
quite this complex.  The only conversion I have heard of that might have
been hairier was that of Blender, which was done by other people 
using my tools.

But as the size and complexity of the repo go up, so does the value
of in-band references actually working.  Emacs is an exceptionally *bad*
case for relying solely on an external reference map, not an exceptionally
good one.

> I'd appreciate if you posted the final list of the references, when
> you are finished with it, so we could have some QA.

Here is the current list. It is not final because I expect to resolve
at least a few more of these, and it is still possible more fossil
references could turn up in odd places.

ChangeLog:
        revno 108687 -> 2012-06-22T21:17:address@hidden
        revno:105007 -> 2011-07-07T04:21:address@hidden
        r112148 -> 2013-03-26T22:08:address@hidden
        revno:108936 -> 2012-07-07T10:34:address@hidden
        revision 106831 -> 2012-01-10T08:27:address@hidden
        revision 1.59
        CVS-1.61
        1.61 in CVS
        revno:106608 -> 2011-12-04T17:13:address@hidden
        revno 100789 -> 2010-07-12T05:26:address@hidden
        rev. 110325 -> 2012-10-01T18:10:address@hidden
        r115470 -> 2013-12-11T19:01:address@hidden
        of 2012-12-20 (r111276) -> 2012-12-20T11:15:address@hidden
        2013-12-11 (r115470) -> 2013-12-11T19:01:address@hidden
        revno:114543 -> 2013-10-07T01:28:address@hidden
        revno:113793 -> 2013-08-11T00:07:address@hidden
        revno:113117 -> 2013-06-21T12:24:address@hidden
        r114834 -> 2013-10-29T02:50:address@hidden
        revno:113431 -> 2013-07-16T11:41:address@hidden
        revno:113147 -> 2013-06-23T20:29:address@hidden
        revno 101897 -> 2010-10-10T14:43:address@hidden
        revno 101876 -> 2010-10-09T18:31:address@hidden
        revno 100306 -> 2010-05-15T21:21:address@hidden
        revno 108687 -> 2012-06-22T21:17:address@hidden
        revision 114614 (commit of 2013-10-10) -> 2013-10-10T19:15:address@hidden
        revno:113431 -> 2013-07-16T11:41:address@hidden
        revno 101949 -> 2010-10-13T14:50:address@hidden
        revno:103013 -> 2011-01-28T22:12:address@hidden
        rev 102609 -> 2010-12-08T08:09:address@hidden
        revno 101688 -> 2010-09-30T02:53:address@hidden
        revno 101459 -> 2010-09-17T13:30:address@hidden
        revnos 101381 -> 2010-09-08T14:42:address@hidden
        101422 -> 2010-09-13T15:17:address@hidden
        rev 100010 -> 2010-04-23T16:26:address@hidden
        revno:109911 -> 2012-09-07T04:15:address@hidden
        109621 -> 2012-08-15T03:33:address@hidden
        revno:88805 -> 2008-06-21T01:38:address@hidden
        revno:88864 -> 2008-06-22T13:57:address@hidden
        revno:89810 -> 2008-07-31T05:33:address@hidden
        revision 106664 -> 2011-12-11T14:49:address@hidden
        revno:105285 -> 2011-07-19T15:01:address@hidden
        revno:104787 (2011-06-30) -> 2011-06-30T01:09:address@hidden
        revno:104988 (2011-07-06) -> 2011-07-06T15:49:address@hidden
        revno:101730 (2010-10-02) -> 2010-10-02T13:21:address@hidden
        revno:103877 (2011-04-09) -> 2011-04-09T20:28:address@hidden
        revno:99634.2.463 (2010-10-09) -> 2010-10-09T04:09:address@hidden
        revno:101913 -> 2010-10-11T23:57:address@hidden
        revno 95090 dated 2009-03-06 -> 2009-03-06T07:51:address@hidden
        revno 101757 -> 2010-10-03T13:59:address@hidden
        revno 82799 (2007-11-30) -> 2007-11-30T13:57:address@hidden
        2010-07-29 (revno 100939) -> 2010-07-29T16:49:address@hidden
        revno 100928 -> 2010-07-29T03:25:address@hidden
        revnos 100982 -> 2010-08-05T23:15:address@hidden
        100984 -> 2010-08-05T23:34:address@hidden
        revno 99854.1.6 -> 2010-04-17T12:33:address@hidden
        revno 99950 -> 2010-04-20T13:31:address@hidden
        revno:100708 -> 2010-07-04T07:50:address@hidden
        revno:110851 -> 2012-11-09T04:10:address@hidden
        revision 1.1 -> the initial version
        cvs-1.12.1
        Revision 1.694 -> 2004-05-20T23:29:address@hidden
        revno 108687 -> 2012-06-22T21:17:address@hidden
        revno:108521 -> 2012-06-08T08:44:address@hidden
        revno:108341 -> 2012-05-22T16:20:address@hidden
        2011-08-30 (revision 105619) -> 2011-08-30T17:32:address@hidden
        2011-08-30 (revision 105619) -> 2011-08-30T17:32:address@hidden
        revision 84777 on 2008-02-22 -> 2008-02-22T17:42:address@hidden
        revno:102982 (2011-01-26) -> 2011-01-26T20:02:address@hidden
        revision 104625 -> 2011-06-18T18:49:address@hidden
        revision 104134 -> 2011-05-06T07:13:address@hidden
        revno:20537 (1998-01-01) -> 1998-01-01T02:27:address@hidden
        revno:87605 (2008-05-14) -> 2008-05-14T01:40:address@hidden
        revno:50135 (2003-03-16) -> 2003-03-16T20:45:address@hidden
        revno:87605 (2008-05-14) -> 2008-05-14T01:40:address@hidden
        revno:34925 (2000-12-29) -> 2000-12-29T14:24:address@hidden
        revno:20537 (1998-01-01) -> 1998-01-01T02:27:address@hidden
        revno:25013 (1999-07-21) -> 1999-07-21T21:43:address@hidden
        revno:43563.1.17 (2002-03-01) -> 2002-03-01T01:17:address@hidden
        revno:84043 (2008-02-1) -> 2008-02-01T16:01:address@hidden
        revno:25356 (1999-08-21) -> 1999-08-21T19:30:address@hidden
        revno:20870 (1998-02-08) -> 1998-02-08T21:33:address@hidden
        revno:36704 (2001-03-09) -> 2001-03-09T18:41:address@hidden
        revno:32591 (2000-10-17) -> 2000-10-17T16:08:address@hidden
        revno:25013 (1999-07-21) -> 1999-07-21T21:43:address@hidden
        revno:43563.1.32 (2002-03-01) -> 2002-03-01T01:17:address@hidden
        revno:14998 (1996-04-12) -> 1996-04-12T06:01:address@hidden
        revno:86854 (2008-04-19) -> 2008-04-19T19:30:address@hidden
        revno:20569 (1998-01-02) -> 1998-01-02T21:29:address@hidden
        revno 103623 -> 2011-03-11T07:24:address@hidden
        revision 1.32 of saveplace.el -> saveplace.el at 2005-05-29T08:36:address@hidden
        revision 1.30 of saveplace.el -> saveplace.el at 2005-04-10T23:32:address@hidden
        version 1.100 -> 2007-12-06T19:56:address@hidden
        erc.el 1.39 -> 2007-12-01T03:41:address@hidden
        revision 1.104, made on 2000-05-21 -> 2000-05-21T17:04:address@hidden
        2007-07-18 (revision 1.51)
        revision 1.90 (commitid mWoPbju3pgNotDps) -> 2007-07-13T18:16:address@hidden
        revision 1.117 -> 2008-10-29T17:42:address@hidden
        1.85
        1.878
        1.113
        1.244
        1.34
        1.233
        rev 1.82 -> 1994-08-03T07:39:address@hidden
        1.70 (Jan 5 changes) -> 1994-01-03T07:21:address@hidden
        r99212 -> 2009-12-29T07:22:address@hidden
        rev. 110325 -> 2012-10-01T18:10:address@hidden
        revno r112320 -> 2013-04-18T00:12:address@hidden

Change comments:
        bzrs 111300 -> 2012-12-22T19:57:address@hidden
        111840 -> 2013-02-21T02:42:address@hidden
        revision 111647 -> 2013-02-01T07:23:address@hidden
        revno:11026 -> 1995-03-15T21:55:address@hidden
        revno:88864 -> 2008-06-22T13:57:address@hidden
        revno:88805 -> 2008-06-21T01:38:address@hidden
        revno:89810 -> 2008-07-31T05:33:address@hidden
        revision 10835 -> 1995-02-25T20:57:address@hidden
        revision 106726 -> 2011-12-23T14:51:address@hidden
        revision 87208 -> 2008-05-02T07:12:address@hidden
        revision 84777 on 2008-02-22 -> 2008-02-22T17:42:address@hidden
        revno:99634.2.463 (2010-10-09) -> 2010-10-09T04:09:address@hidden
        revno:101913 (2010-10-12). -> 2010-10-11T23:57:address@hidden
        revno:20537 (1998-01-01) -> 1998-01-01T02:27:address@hidden
        revno:87605 (2008-05-14) -> 2008-05-14T01:40:address@hidden
        revno:87605 (2008-05-14) -> 2008-05-14T01:40:address@hidden
        revno:34925 (2000-12-29) -> 2000-12-29T14:24:address@hidden
        revno:20537 (1998-01-01) -> 1998-01-01T02:27:address@hidden
        revno:25013 (1999-07-21) -> 1999-07-21T21:43:address@hidden
        revno:43563.1.16 (2002-03-01) -> 2002-03-01T01:16:address@hidden
        revno:84043 (2008-02-1) -> 2008-02-01T16:01:address@hidden
        revno:20870 (1998-02-08) -> 1998-02-08T21:33:address@hidden
        revno:36704 (2001-03-09) -> 2001-03-09T18:41:address@hidden
        revno:32591 (2000-10-17) -> 2000-10-17T16:08:address@hidden
        revno:25356 (1999-08-21) -> 1999-08-21T19:30:address@hidden
        revno:14998 (1996-04-12) -> 1996-04-12T06:01:address@hidden
        revno:86854 (2008-04-19) -> 2008-04-19T19:30:address@hidden
        revno:20569 (1998-01-02) -> 1998-01-02T21:29:address@hidden
        r100577 -> 2010-06-10T12:56:address@hidden
        CVS rev 1.49, 2001-09-12
        CVS rev 1.47, 2003/01/27
        CVS r1.35
        revno 95090 dated 2009-03-06 -> 2009-03-06T07:51:address@hidden
        2005-02-15 (revno 60055) -> 2005-02-15T23:19:address@hidden
        r111320 -> 2012-12-24T15:56:address@hidden
        revno 99854.1.6 -> 2010-04-17T12:33:address@hidden
        revno 99950 -> 2010-04-20T13:31:address@hidden
        revision 99649 -> 2010-03-12T16:34:address@hidden
        rev 99649 -> 2010-03-12T16:34:address@hidden
        rev 99553 -> 2010-02-24T22:07:address@hidden
        revno 99212 -> 2009-12-29T07:22:address@hidden
        revision 94343 -> 2009-01-30T13:06:address@hidden
        revision 1.32 -> 2005-05-29T08:36:address@hidden
        revision 1.30 -> 2005-04-10T23:32:address@hidden
        version 1.100 -> 2007-12-06T19:56:address@hidden
        r1.135 -> 2009-10-10T21:48:address@hidden
        rev 1.114
        1.878
        revision 1.117 -> 2008-10-29T17:42:address@hidden
        rev 1.14395
        revision 1.56
        3.85
        1.17
        revision 1.69
        revision 1.1 -> initial revision
        rev 1.5
        revisions 1.40
        1.41
        1.39 -> 2007-12-01T03:41:address@hidden
        revision 1.104
        revision 1.51
        revision 1.90 (commitid mWoPbju3pgNotDps) -> 2007-07-13T18:16:address@hidden
        revision 1.1509
        revision 7.8
        CVS v1.12.8 and 1.12.9
        cvs-1.12.1
        1.103
        HEAD (1.72)
        v1.275
        1.58
        v1.5046
        v1.5039
        rev 1.82 -> 1994-08-03T07:39:address@hidden
        rev. 1.761
        revision 1.3831
        1.3832
        revision 1.12
        revision 1.13
        revision 1.14
        revision 1.15

The ChangeLog references are not attributed to individual files
because they moved as the files rotated.
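
For anyone who wants to QA the list mechanically, here is a minimal
sketch of recognizing the Bazaar-style spellings that appear in it
(revno 108687, revno:105007, r112148, rev. 110325, revision 106831,
revno:99634.2.463) and swapping in a replacement.  The names BZR_REF and
rewrite_refs, and the caller-supplied lookup table, are hypothetical;
this is an illustration, not the script used for the conversion.  CVS
revisions such as 1.59 are deliberately left alone.

```python
import re

# Bazaar-style references: a keyword (or bare "r") followed by a revno of
# at least three digits, optionally with a dotted branch suffix.
BZR_REF = re.compile(
    r"\b(?:revno[: ]|revision |rev\.? |r)(\d{3,}(?:\.\d+\.\d+)?)\b"
)

def rewrite_refs(text, stamp_for):
    """Replace each recognized reference that has an entry in stamp_for.

    stamp_for maps a revno string (e.g. "105007") to its replacement.
    References with no entry are left exactly as they were.
    """
    def swap(match):
        stamp = stamp_for.get(match.group(1))
        return match.group(0) if stamp is None else stamp
    return BZR_REF.sub(swap, text)
```

Unmatched or unmapped references pass through untouched, so whatever
remains can be collected into a list like the one above for human review.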

Some of the remaining CVS references cannot be resolved within the Emacs
history; they actually point to other projects.  One particularly fertile
source of these, which I think accounts for this group

        1.85
        1.878
        1.113
        1.244
        1.34
        1.233

in ChangeLogs, is the CVS history of the erc files before they were merged
into Emacs.

> The problem is not the size of the repository alone.  The problem is
> that different portions of a single changeset were committed many
> revisions apart.  And I don't even understand (and you didn't explain)
> how will you handle the situation I described above, where a single
> commit checked in ChangeLog changes for several unrelated commits in
> the same directory.  Which commit clique will you assign the ChangeLog
> commit to?  The devil is in the details, but you haven't provided any
> details about your plans in this matter.  Would you please do that?

I see we are using the term "changeset" slightly differently, and this has
produced some confusion.

The uncoalesced changesets I am looking for are not defined by "all
share the same ChangeLog entry" (though usually that is the case).
You are quite right that attempting to coalesce all of those would
produce perverse results in cases of several unrelated commits.

Fortunately, most of the unresolved cliques are not like this.  The
usual case, in this conversion as in others I've seen (such as groff),
is that an unresolved clique consists of one or several closely
related changes and one ChangeLog modification, without intervening
commits by others.  This is what I think of as a changeset.

Normally tools such as parsecvs collect these into single changesets.  
But these converters have a maximum coalescence window.  If such a span
of commits took place over a longer period of time than the window, it
won't be coalesced. 

The problem is that the default time windows on these converters are
set small in order to avoid false-positive matches.  Experience has
taught me that this is a mostly imaginary problem; the window would
have been better set to infinity in almost every case I have seen.
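
To make the effect of the window concrete, here is a minimal sketch.
It assumes a clique is a run of adjacent commits by the same author,
each within the window of the one before it; Commit and coalesce are
hypothetical names, and this is not the actual algorithm in parsecvs or
reposurgeon.

```python
from dataclasses import dataclass, field

@dataclass
class Commit:
    author: str
    timestamp: float               # seconds since the epoch
    files: list = field(default_factory=list)

def coalesce(commits, window):
    """Merge adjacent same-author commits no more than `window` seconds apart.

    A commit by anyone else ends the current clique, so a run is never
    merged across intervening commits by others.
    """
    merged = []
    for c in sorted(commits, key=lambda c: c.timestamp):
        last = merged[-1] if merged else None
        if (last is not None
                and last.author == c.author
                and c.timestamp - last.timestamp <= window):
            last.files.extend(c.files)
            last.timestamp = c.timestamp   # measure the next gap from here
        else:
            merged.append(Commit(c.author, c.timestamp, list(c.files)))
    return merged
```

With a short window, a ChangeLog commit made fifteen minutes after the
code change stays separate; calling coalesce(commits, float("inf"))
joins it to its clique while still stopping at commits by other people.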

The result of a too-small commit window is that some genuine changesets
(not the edge case you are pointing at) do not get coalesced. In your
edge case, the least bad thing to do is accept that the ChangeLog entry
must remain its own changeset; sometimes you can get partial 
coalescence in the file changes.

When there is CVS in the history, a standard part of my cleanup is
basically to run a coalescence pass with a very long window.
Semi-automating this operation, so it (a) doesn't have to be done
manually, but (b) is easily checked by skilled human judgment, was
one of the purposes for which I originally wrote reposurgeon.

Fortunately the bad cases aren't actually very common.

> > > > 5. Unconverted .bzrignores (and possibly .cvsignores) in the history.
> > > 
> > > Why is that a problem?
> > 
> > See "seamless history browsing".
> 
> Sorry, I don't understand.  Please elaborate: what is the relation
> between these ignore files and history browsing?

In a properly done conversion, file ignores don't abruptly stop working
because you browsed back past the point of conversion and what should
be .gitignore files are now .bzrignores or .cvsignores.
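
As a rough sketch of what translating one such file involves (the
function name translate_ignore is hypothetical, and this is a simplified
illustration rather than what reposurgeon does): .cvsignore entries are
whitespace-separated and apply only to their own directory, so each is
anchored with a leading slash; .bzrignore lines mostly carry over, except
that "./" marks a root-only pattern and "RE:" regexps have no git
equivalent and are flagged for a human.  Special cases beyond these are
not handled.

```python
import posixpath

def translate_ignore(path, text):
    """Return (new_path, new_text) for a legacy ignore file.

    Anything that is not a .cvsignore or .bzrignore is returned unchanged.
    """
    directory, name = posixpath.split(path)
    if name == ".cvsignore":
        # Per-directory, non-recursive patterns: anchor each one.
        lines = ["/" + pattern for pattern in text.split()]
    elif name == ".bzrignore":
        lines = []
        for line in text.splitlines():
            if line.startswith("RE:"):
                # No direct .gitignore equivalent; leave a marker for review.
                lines.append("# untranslated bzr regex: " + line)
            elif line.startswith("./"):
                lines.append(line[1:])     # "./x" (root only) becomes "/x"
            else:
                lines.append(line)
    else:
        return path, text
    return posixpath.join(directory, ".gitignore"), "\n".join(lines) + "\n"
```

Applying this to every revision in which such a file appears is what
keeps ignores working when you check out a pre-conversion commit.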

> > The way this is working is that I am building a reposurgeon script that
> > expresses a sequence of edits to Andreas's mirror. On conversion day 
> > we will apply that script once, after which everyone can re-clone and
> > go on as before.
> 
> Sorry, I don't see how this changes anything.  You are still going to
> make deep changes to the existing mirror.

Yes, for arguable values of "deep". As Paul Eggert (I think) said, I'm
after a result that is stainless steel rather than earthenware. With
ugly cracks in it.

> > > Noble goals all of them, but I'm skeptical as to whether they can be
> > > achieved in practice.  What's worse, we won't know whether some issues
> > > remained until much later.
> > 
> > I know they can be achieved in practice because I have achieved them before,
> > many times.  Most recently in the conversion of the groff history, but
> > you could check with the maintainers of NUT or Hercules or robotfindskitten
> > or Roundup as well. Or the Blender Foundation - blender is a big reposurgeon
> > conversion done by someone else.
> 
> Sorry, been there done that.  The CVS to bzr conversion also seemed
> flawless until much later.

There are several differences this time.  One of the most important is that
the state of the art has advanced.  My tools do things that would have been
impossible or impractical before they existed.  I have auditing capabilities
you would probably have to work a bit to even imagine.

As a relatively trivial example - if Stefan or some other person with
policy authority makes the call, I could reliably split elpa out into
its own repo with one short command in the reposurgeon DSL.
 
> > If we find any problems afterwards, I have the tools to fix them. Part of
> > my commitment is to do that.
> 
> I don't think any of us can in good faith give such promises.

The span of my contributions to Emacs is measured in decades.  I do not 
think you need to fear that I will vanish before this job is done.
-- 
                <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


