
Re: [Monotone-devel] branch naming conventions


From: Nathaniel Smith
Subject: Re: [Monotone-devel] branch naming conventions
Date: Sun, 30 Oct 2005 03:38:18 -0800
User-agent: Mutt/1.5.9i

On Sun, Oct 30, 2005 at 11:14:21AM +0100, Zbynek Winkler wrote:
> When everything has hash-based unique id why should branches be any 
> different? Every database would maintain mapping between the unique ids 
> and some human readable form (which could be anything - globally unique 
> or not). That would allow easy (local) renaming of branches while not 
> forcing a naming convention.
> 
> Everything could stay the same, except that where the branch name is
> now there would be the guid. Should the user want to use the guid,
> he/she could. But a database-local mapping (not versioned) could be
> created to assign names to the branches. The ui code would map the
> names to the guids and back.
> 
> What do you think?

This sounds like something similar to a Pet Names proposal:
  http://www.erights.org/elib/capability/pnml.html

The problem with pet names systems is that they work great if you have
control of the communications media people will be using, because then
you can seamlessly translate between different people's views of the
data, and everything stays nice and secure.  But in real life, we're
building tools that are supposed to slot into a rich, pre-existing
social ecology of collaboration, and there's no way we can control the
media people use there.  So I think there's some (insufficiently
recognized!) onus on us as designers to try and build systems that
will be secure _assuming_ our transport medium is, well, people
chatting over IRC or cubicle walls and on the phone and pagers and
leaving each other post-its and all the crazy opportunistic stuff that
real people come up with to talk.  People are good at talking, and
they're going to keep using certain patterns of communication whatever
we do; let's leverage that instead of fighting it.

The most obvious example of this is using hashes to identify
revisions; they're not _much_ harder to deal with than other schemes,
day to day, and you get this magical thing -- once you're used to
those hashes as being the names for revisions, then casually
mentioning revisions over email or IRC or whatever, just like you'd do
naturally, suddenly gives you strong end-to-end security guarantees.
(It's also, as a side-effect, guaranteed to always and simply _work_,
which is not the case with schemes that use local revision numbers...)
We think that's kinda neat...
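
A minimal sketch of why hash names carry their own check, in Python.
The use of SHA-1 over the raw bytes and the names `revision_id` and
`matches` are assumptions for illustration, not monotone's actual code:

```python
import hashlib


def revision_id(content: bytes) -> str:
    """Name a revision by the hex digest of its content (illustrative)."""
    return hashlib.sha1(content).hexdigest()


def matches(mentioned_id: str, received: bytes) -> bool:
    """Check that received bytes are the revision someone mentioned.

    The id may have travelled over email, IRC or a post-it; whoever
    holds it can verify the content without trusting the transport.
    """
    return revision_id(received) == mentioned_id
```

Whoever receives the id through any channel can recompute the hash on
whatever content they later fetch, so a tampered copy is detected no
matter which path it arrived by.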

Branch and key names are similar, but a little trickier, because they
actually need to be human-memorable.  We do something slightly
different for each:
  -- branch names are straight-out global; branch certs just contain a
     branch name as a string, and if two certs happen to have the same
     bytes in their string, then those two certs both put their
     revisions into the same branch.  This is a bit simplistic and
     bothers people a lot, since you can't rename branches, but it
     does meet the goals above.
  -- key names are a little more subtle, because the name alone is not
     enough to make a key; a key is both a name and high-entropy
     crypto gunk together.  Here, we hash the key name into the key
     fingerprint, and the key fingerprint is used to record which
     key signed a given cert.  Also, each db refuses to hold multiple
     keys with the same name (and monotone will get annoyed at you if
     you try to sync with a server that has a different key with the
     same name).  The consequence of this is that in each community
     ("community" being defined as, "certs and pubkeys trickle between
     members somehow", and usually means "they all use the same
     project netsync server", though really it's a more general
     concept), everyone has to have the same name for keys, and this
     is silently ensured every time you sync.

     This is also somewhat problematic (it hasn't come up much yet,
     but it probably will as monotone usage grows), because it means
     that if, say, the "address@hidden" key goes bad, like it gets
     compromised or I lose the privkey or something, then I can't
     replace it under the same name; I have to make up a new name.
     address@hidden is one option, but not terribly satisfying,
     esp. since I don't know how well + addresses are supported
     generally...
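
The key-name rules above can be sketched roughly as follows, in
Python.  The exact bytes monotone hashes are not spelled out here, so
the `name + ":" + key` layout, the use of SHA-1, and the `KeyStore`
class are all hypothetical stand-ins:

```python
import hashlib


def fingerprint(name: str, pubkey: bytes) -> str:
    """Hash the key name together with the key material (assumed layout)."""
    h = hashlib.sha1()
    h.update(name.encode("utf-8"))
    h.update(b":")
    h.update(pubkey)
    return h.hexdigest()


class KeyStore:
    """A db that refuses to hold two different keys under one name."""

    def __init__(self):
        self._by_name = {}

    def add(self, name: str, pubkey: bytes) -> str:
        existing = self._by_name.get(name)
        if existing is not None and existing != pubkey:
            # Same name, different key: reject rather than guess.
            raise ValueError(f"a different key named {name!r} already exists")
        self._by_name[name] = pubkey
        return fingerprint(name, pubkey)
```

Because the name goes into the fingerprint, the same key material
under a different name yields a different fingerprint, and the store
rejects a second key that tries to reuse an existing name.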

I know of another option that seems like it would at least stand a
chance of working; this was proposed by Derek Scherger a while ago.
The idea is to actually name branches (or, I suppose, keys) something
uninformative, like 20 bytes of pure entropy (or, for keys, a hash of
the keypair).  Then, each db keeps a mapping ID <-> READABLE NAME.
The tricky bit is that at netsync time, we check to make sure that our
mapping is consistent with our peer's mapping; consistent means, we
take all their (ID, NAME) pairs and all our (ID, NAME) pairs, union
them, and check to see if the resulting set has any duplicate IDs with
different NAMEs, or duplicate NAMEs with different IDs.  If so, monotone
aborts the netsync and tells you to fix up one side or the other
manually.  (This is very similar to the current 'epoch' support, in
implementation.  It could also, incidentally, replace epochs, by
re-implementing what we currently call "one branch with two different
epochs" as "two different branches with the same human-readable
name".)
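
A minimal sketch of that consistency check, in Python.  The function
names and the tuple shapes are made up for illustration; nothing here
is taken from monotone's netsync code:

```python
def find_conflicts(ours, theirs):
    """Union two sets of (ID, NAME) pairs and report inconsistencies.

    Returns a list of tuples:
      ("id", ID, name_a, name_b)    one ID carrying two different NAMEs
      ("name", NAME, id_a, id_b)    one NAME pointing at two different IDs
    """
    name_for_id = {}
    id_for_name = {}
    conflicts = []
    # sorted() only makes the report order deterministic.
    for ident, name in sorted(set(ours) | set(theirs)):
        seen_name = name_for_id.setdefault(ident, name)
        if seen_name != name:
            conflicts.append(("id", ident, seen_name, name))
        seen_id = id_for_name.setdefault(name, ident)
        if seen_id != ident:
            conflicts.append(("name", name, seen_id, ident))
    return conflicts


def check_before_sync(ours, theirs):
    """Abort (by raising) if the two mappings cannot be merged cleanly."""
    conflicts = find_conflicts(ours, theirs)
    if conflicts:
        raise ValueError(f"mappings disagree, fix one side manually: {conflicts}")
```

Pairs that only one side knows about are not conflicts; they simply
become part of the merged mapping.  Only a disagreement about an ID or
a NAME that both sides use stops the sync.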

Basically, the idea is that anyone can rename branches, but each
possible naming creates a little communications ghetto, where you can
only talk (using monotone) to other people who use the same names.
Since people you talk to within monotone are generally the same people
you talk to outside of monotone (over IRC or whatever), this seems
like it would work reasonably well -- especially since changing the
name of something that was already on the server would require enough
social coordination (mailing list announcements of what was going on,
etc.) that everyone should be sensitized to the issue and unlikely to
use the names ambiguously.

This has the advantage over schemes that actually track branch
renaming as an event that it's, well, not insanely complicated for
both the implementor and the user trying to figure out what's going
on.  OTOH, requiring separate manual intervention on the part of every
single developer might well put such a high social stigma on doing
this that it never happens in practice and thus is useless.  (Probably
more likely for keys than for branches; different solutions may be
appropriate for different situations...)

Another possible option would be to store the mapping in the
hypothetical trust database, as described in
   http://frances.vorpus.org/~njs/mt-permission.html
The trade-offs of this aren't clear to me; and there's a reason that
that trust system is still hypothetical :-).

Hrm.  This reminds me that the dumb server stuff actually has no
thought whatsoever put into epochs... that's quite bad.  It's not
immediately obvious to me how to handle these kinds of tricks in a
non-netsync transport medium.  I guess we could have epoch packets
that 'monotone read' records and checks, exiting with an error if the
epoch conflicts; then, with all of a 'monotone read' being in a
transaction, this would prevent the other stuff read in at the same
time from actually getting written.  And a similar trick would work
for the name tuples above.
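
A rough sketch of the transaction idea, in Python with the standard
sqlite3 module.  The packet tuples, table layout and function names
are invented for illustration; they are not monotone's packet format
or schema:

```python
import sqlite3


class EpochConflict(Exception):
    pass


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE epochs (branch TEXT PRIMARY KEY, epoch TEXT NOT NULL)")
    conn.execute("CREATE TABLE data (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
    conn.commit()
    return conn


def read_packets(conn, packets):
    """Apply packets atomically; any epoch conflict discards the whole read.

    Each packet is ("epoch", branch, epoch) or ("data", key, value).
    """
    with conn:  # commits on success, rolls back if an exception escapes
        for kind, a, b in packets:
            if kind == "epoch":
                row = conn.execute(
                    "SELECT epoch FROM epochs WHERE branch = ?", (a,)
                ).fetchone()
                if row is None:
                    # First time we see this branch: record its epoch.
                    conn.execute(
                        "INSERT INTO epochs (branch, epoch) VALUES (?, ?)", (a, b)
                    )
                elif row[0] != b:
                    raise EpochConflict(f"epoch mismatch on branch {a!r}")
            else:
                conn.execute("INSERT INTO data (key, value) VALUES (?, ?)", (a, b))
```

Since the conflict is raised inside the transaction, data packets that
came earlier in the same read are rolled back along with everything
else, whatever order the packets arrived in.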

Sorry, that was, err, probably more than you were expecting to have to
wade through in response to your idea :-).  Does this stuff make
sense?  What do you think?

-- Nathaniel

-- 
- Don't let your informants burn anything.
- Don't grow old.
- Be good grad students.
  -- advice of Murray B. Emeneau on the occasion of his 100th birthday

This email may be read aloud.



