monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: Support for binary files, scalability and Windo


From: Zbynek Winkler
Subject: Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Date: Tue, 20 Jan 2004 00:31:45 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031208

graydon hoare wrote:

as background (another email asked "how this stuff works"): the idea with a hashtree (a.k.a. merkle tree) is that you and I both store our entire set of "objects" (say, blocks or files or whatever) in an N-ary tree where each leaf is positioned in the tree at some unique spot (in our case the tree represents the entire space of SHA1, each branch peeling off some number of bits, and the leaf's position at the bottom of the tree is determined by content hash). each interior node's slots are the hashes of the subnodes beneath it. the hash algorithm used for interior nodes isn't necessarily related to the one pointing out to leaves, but we might as well use SHA1 again.

Let me see if I understand this right. First we have a regular 256-ary tree structure 20 levels deep (so SHA1 is 20 bytes, right?). There is really nothing unexpected about this tree (just a regular search tree). Now in addition to this we take each node in the tree and calculate a hash of its contents and place the hash to the parent (we would do this bottom up). When we have a new SHA1 that we want to add to the tree we use the tree structure to find a place for this new leaf (going down from the root) and add it to the appropriate node at the bottom. Now we have changed the node so we have to update its hash in its parent (and so on up the tree). Did I get it right? If so, it seems very elegant to me. Why haven't you done it this way in the first place? ;-)

anyways, when you and I want to sync, I send you my root node, you can see all the slots in it which are non-equal to *your* values for those slots: those are 1/N subtrees which have "a difference" in them, somewhere. so for each of those which is different, you send me *your* subtree node, and then I look at them and pick out the 1/(N*2) subtrees which contain the differences. we bounce back and forth at worst log_N(X) times where X is the size of the space we're spanning (I'm thinking a 256-ary tree over SHA1: 20 levels, so at worst 10 round trips) and exchange at worst O(D*K) bytes overhead locating the K different objects. then we just transmit the objects missing from each party's collection, having isolated them. there are some cheap hacks used to make the average case costs collapse to the load of the tree rather than its depth (so, more likely say 3 or 4 round trips), but even in the worst case it's very efficient, and it's easy to pipeline.

What is "D" in the bytes overhead?

note that this is a very general algorithm for synchronizing two collections of arbitrary things. the "tree" here has nothing to do with a filesystem tree or the xdelta/suffix-tree storage representation for a file. those are orthogonal.

Yes, so it appears. Judging from the description the algo handles deletes from the collection as well so it should work. I believe that deletes are needed for the table 'files' where only the newest versions of files are stored, aren't they? Or your term 'collection' does not equal to 'table'?

Zbynek

--
http://zw.matfyz.cz/     http://robotika.cz/
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic





reply via email to

[Prev in Thread] Current Thread [Next in Thread]