Re: [Monotone-devel] README library list and log command fixes


From: Nathaniel Smith
Subject: Re: [Monotone-devel] README library list and log command fixes
Date: Sun, 19 Oct 2003 19:48:26 -0700
User-agent: Mutt/1.5.4i

On Sun, Oct 19, 2003 at 05:41:44PM -0400, graydon hoare wrote:
> Nathaniel Smith <address@hidden> writes:
> 
> > I did notice an interesting quirk just now, fetching from Matt's
> > depot; I fetched just from it, without doing a full fetch.  And
> > because I hadn't fetched from www.off.net for a few days, I didn't
> > have 5c7614b9a281dade7383a1af9cc550367f806157 in my depot.  Matt's
> > changes were based off of 5c7614b9a281dade7383a1af9cc550367f806157, so
> > I ended up with a fragmented revision tree.
> 
> by "a fragmented revision tree", do you mean:
> 
>  1. you received file or manifest deltas you literally could not apply?
>  
>  or
> 
>  2. you received a manifest with an ancestry cert pointing to a parent
>     you did not yet have (and which subsequently resolved itself when
>     you next fetched from www.off.net)?
> 
> the former would be a bug. if so please file it. monotone is supposed
> to see that there is no manifest on matt's depot, so it should post
> full versions. the latter is something I don't think we can reasonably
> defend against. if you don't have some bit of history, you don't have
> it.
> 
> anyways, if it's just this latter case #2, I don't think it'll *harm*
> you in any way other than making merges you do, between my head and
> matt's, degrade from 3-way to 2-way. and then, again, only until you
> fetch from www.off.net to "fill in the blanks". 

It is just the latter case (though I guess it could be the former case
too; I didn't try fetching into an empty repository).

In general, we can't rely on fetching from www.off.net to fill in the
blanks; he could just as well have promoted his version from a private
repository or something.

> the only other option I can think of is to have "starting a fresh
> depot" involve posting all the way back to the beginning of history as
> you know it. is that a good idea? I don't particularly like it -- it
> means that someone trying to set up a depot with just a couple changes
> might need to dump many megs of redundant packets on anyone pulling
> from them -- but perhaps it is.

Yeah, that's pretty obnoxious.  And unlike the case below, where
there's an earlier version in the same depot that we can stop at, we
really do have to make every cert possible if we want to be sure to
get ancestry.  blagga.

Though, question: what does happen with the solution below if I have

 <stuff> -> A1 --> B1 -> B2
              \         /
               `-> A2 -'

and I commit B2 to my depot "B"?  We do a cert against B1, obviously,
and a cert against A2, obviously... but do we then follow A2's
ancestry up until we reach another revision in B?  What if none of
<stuff> is in B at all?  It seems like we can easily end up "posting
all the way back to the beginning of history as [we] know it" after
all.
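
To make that concrete, here's a rough sketch of the walk I'm
imagining: toy Python, with made-up revision names and a hypothetical
depot_has() check, not monotone's actual queueing code.  Start at
B2's parents, queue a cert for every edge crossed, and keep climbing
any branch whose parent the target depot doesn't already have.

  # Sketch only: hypothetical data, not monotone's queueing code.
  # Ancestry from the picture above: <stuff> -> A1 -> {B1, A2} -> B2.
  parents = {
      "B2": ["B1", "A2"],
      "B1": ["A1"],
      "A2": ["A1"],
      "A1": ["stuff"],
      "stuff": [],
  }

  def certs_to_queue(new_rev, depot_has):
      """Queue an ancestry cert for every edge crossed while walking up
      from new_rev, stopping a branch once we hit a revision the target
      depot already has."""
      queued, frontier, seen = [], [new_rev], set()
      while frontier:
          rev = frontier.pop()
          if rev in seen:
              continue
          seen.add(rev)
          for parent in parents[rev]:
              queued.append((parent, rev))   # ancestry cert parent -> rev
              if not depot_has(parent):      # keep climbing this branch
                  frontier.append(parent)
      return queued

  # Depot "B" already has B1 but nothing else:
  print(certs_to_queue("B2", depot_has=lambda r: r == "B1"))
  # -> [('B1', 'B2'), ('A2', 'B2'), ('A1', 'A2'), ('stuff', 'A1')]
  # i.e. the A2 side really does run back to the beginning of history.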

> > This is related to the problems we solved earlier by using multiple
> > redundant ancestry certs (though now that I think about it, I'm not
> > entirely sure the solution always works when there are more than two
> > depots and someone is fetching from some subset: say you have A1 -> B1
> > -> C1 -> A2, and I am only fetching from depots A and B, won't I end
> > up seeing A1 -> A2, A1 -> B1?). 
> 
> I'm inclined to treat parenthetical concerns like this from you very
> seriously now :)

Heh.

> having thought this over a bit, I think you're once again spot on: the
> queue_edge_for_target_ancestor function ought to queue not just the
> edge (deltas + cert) from ancestor -> new, but also all the ancestry
> certs along all the paths from ancestor -> new. in most cases this
> will be exactly the same as the one ancestry cert, no harm done. but
> in pathological cases it'll keep a 3rd party monitoring an incomplete
> set of depots from seeing a fork where there isn't one.

Just to check -- the LCA algorithm knows to ignore these redundant
edges, right?  I don't have time to come up with an example now, but
it seems likely they could wreak some havoc with the way they shorten
paths.  Or can that not happen?
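
For what it's worth, here's a toy demonstration (Python, invented
graph, emphatically not monotone's LCA code) of why a redundant cert
that merely shortcuts an existing path is harmless to a common-ancestor
computation based on ancestor *sets*: the extra edge adds no new
ancestors.  Whether the real algorithm, with whatever it does about
path lengths, is equally indifferent is exactly the question.

  # Sketch only: toy ancestry graph, not monotone's LCA implementation.
  # A1 -> B1 -> C1 -> A2, plus a redundant "shortcut" cert A1 -> A2.
  def ancestors(rev, parents):
      """All ancestors of rev (excluding rev itself)."""
      result, stack = set(), list(parents.get(rev, []))
      while stack:
          r = stack.pop()
          if r not in result:
              result.add(r)
              stack.extend(parents.get(r, []))
      return result

  def common_ancestors(a, b, parents):
      return ancestors(a, parents) & ancestors(b, parents)

  without_shortcut = {"A2": ["C1"], "C1": ["B1"], "B1": ["A1"]}
  with_shortcut    = {"A2": ["C1", "A1"], "C1": ["B1"], "B1": ["A1"]}

  # Suppose some other head X also descends from B1:
  for graph in (without_shortcut, with_shortcut):
      graph = dict(graph, X=["B1"])
      print(sorted(common_ancestors("A2", "X", graph)))
  # Both print ['A1', 'B1']: the shortcut cert changes nothing here.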

Should we be worried about the transitive trust issues here?  If I
automatically and uncritically create all these certs based on other
people's ancestry certs, is that bad?  (In particular, if someone does
insert a malicious ancestry cert, could we end up creating a lot of
other ancestry certs that are equally bad, but signed by good keys,
before the malicious cert is detected and removed?)

A possible solution to all these problems would be to, instead of
trying to come up with clever ways to re-cert things, simply pass
around ancestry certs promiscuously; upload full ancestry cert graphs
and accept that the actual contents of any given revision may not be
available.  (To bound the amount of space used in clean depots, you
could use heuristics like "make sure that all ancestry up to 500
revisions deep is available"; an ancestry cert is ~300 bytes, so this
would give a maximum overhead of ~150k, which isn't _too_ obnoxious.)
Tracking all the ancestry available in a given depot in order to
figure this out might be pretty annoying, though.
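
Back-of-the-envelope check of that estimate (pure illustration,
assuming a linear history and the ~300-bytes-per-ancestry-cert figure
above):

  # Sketch only: rough cost of a "keep ancestry N revisions deep" rule.
  CERT_BYTES = 300   # assumed size of one ancestry cert

  def ancestry_within_depth(head, parents, max_depth):
      """Collect the ancestry edges reachable within max_depth steps."""
      edges, frontier = set(), {head}
      for _ in range(max_depth):
          next_frontier = set()
          for rev in frontier:
              for p in parents.get(rev, []):
                  edges.add((p, rev))
                  next_frontier.add(p)
          frontier = next_frontier
      return edges

  # Worst case for a linear history: one cert per step.
  linear = {f"r{i}": [f"r{i-1}"] for i in range(1, 501)}
  edges = ancestry_within_depth("r500", linear, max_depth=500)
  print(len(edges), "certs,", len(edges) * CERT_BYTES, "bytes")
  # -> 500 certs, 150000 bytes, i.e. the ~150k mentioned above.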

This solution also raises trust issues, because without any
re-certing, you have to trust each and every key or parts of the
ancestry graph become unavailable; this is the other advantage of the
current re-certing mechanism, that you can get a coherent view of
history by looking at only a single depot _and only trusting people
with commit access to that depot_.  (Though this doesn't help much if
your depot is a world-writeable NNTP group, which I gather is supposed
to be supported.)  There are a few ways to work around this; a simple
one would be to have bots or people that do re-certing by hand (e.g.,
there might be a address@hidden cert that automatically stamped off on
any address@hidden cert, after checking that foo was on its
keyring).  Another solution, possibly useful anyway for other things,
would be to support "levels of trust" for keys, so that I could say
"trust all these keys for ancestry information, but don't update to a
version unless it's been signed off on by a key in this strict
subset".

> (though, it's not as critical as the criss-cross merge issue; I think
>  in this case the 3rd party could probably 3-way merge the problem
>  away. it's still annoying that they might have to do so..)

Or else they could just figure out what has happened and add a cert to
fix things up.  Maybe I'm underestimating the degree to which this is
a reasonable thing to expect.

brainstormingly yrs,
-- Nathaniel

-- 
"Of course, the entire effort is to put oneself
 Outside the ordinary range
 Of what are called statistics."
  -- Stephen Spender



