Re: [Monotone-devel] Re: Support for binary files, scalability and Windo

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Re: Support for binary files, scalability and Windo

From:	Zbynek Winkler
Subject:	Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Date:	Tue, 20 Jan 2004 00:31:45 +0100
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031208

graydon hoare wrote:

as background (another email asked "how this stuff works"): the ideawith a hashtree (a.k.a. merkle tree) is that you and I both store ourentire set of "objects" (say, blocks or files or whatever) in anN-ary tree where each leaf is positioned in the tree at some uniquespot (in our case the tree represents the entire space of SHA1, eachbranch peeling off some number of bits, and the leaf's position at thebottom of the tree is determined by content hash). each interiornode's slots are the hashes of the subnodes beneath it. the hashalgorithm used for interior nodes isn't necessarily related to the onepointing out to leaves, but we might as well use SHA1 again.

Let me see if I understand this right. First we have a regular 256-arytree structure 20 levels deep (so SHA1 is 20 bytes, right?). There isreally nothing unexpected about this tree (just a regular search tree).Now in addition to this we take each node in the tree and calculate ahash of its contents and place the hash to the parent (we would do thisbottom up). When we have a new SHA1 that we want to add to the tree weuse the tree structure to find a place for this new leaf (going downfrom the root) and add it to the appropriate node at the bottom. Now wehave changed the node so we have to update its hash in its parent (andso on up the tree). Did I get it right? If so, it seems very elegant tome. Why haven't you done it this way in the first place? ;-)

anyways, when you and I want to sync, I send you my root node, you cansee all the slots in it which are non-equal to *your* values for thoseslots: those are 1/N subtrees which have "a difference" in them,somewhere. so for each of those which is different, you send me *your*subtree node, and then I look at them and pick out the 1/(N*2)subtrees which contain the differences. we bounce back and forth atworst log_N(X) times where X is the size of the space we're spanning(I'm thinking a 256-ary tree over SHA1: 20 levels, so at worst 10round trips) and exchange at worst O(D*K) bytes overhead locating theK different objects. then we just transmit the objects missing fromeach party's collection, having isolated them. there are some cheaphacks used to make the average case costs collapse to the load of thetree rather than its depth (so, more likely say 3 or 4 round trips),but even in the worst case it's very efficient, and it's easy topipeline.


What is "D" in the bytes overhead?

note that this is a very general algorithm for synchronizing twocollections of arbitrary things. the "tree" here has nothing to dowith a filesystem tree or the xdelta/suffix-tree storagerepresentation for a file. those are orthogonal.

Yes, so it appears. Judging from the description the algo handlesdeletes from the collection as well so it should work. I believe thatdeletes are needed for the table 'files' where only the newest versionsof files are stored, aren't they? Or your term 'collection' does notequal to 'table'?


Zbynek

--
http://zw.matfyz.cz/     http://robotika.cz/
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] Support for binary files, scalability and Windows port, Asger Ottar Alstrup, 2004/01/12
- [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/12
  - [Monotone-devel] Re: Support for binary files, scalability and Windows port, Asger Kunuk Ottar Alstrup, 2004/01/15
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Zbynek Winkler, 2004/01/15
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/16
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Ori Berger, 2004/01/16
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/17
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Nathaniel Smith, 2004/01/17
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/19
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Zbynek Winkler <=
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Ori Berger, 2004/01/18
    - Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port, Zack Weinberg, 2004/01/18
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/19
    - [Monotone-devel] RE: Support for binary files, scalability and Windows port, Asger Kunuk Alstrup, 2004/01/18
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, Peter Simons, 2004/01/18
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/19
    - [Monotone-devel] RE: Support for binary files, scalability and Windows port, Asger Kunuk Ottar Alstrup, 2004/01/19
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/19
    - [Monotone-devel] RE: Support for binary files, scalability and Windows port, Asger Kunuk Ottar Alstrup, 2004/01/20
    - [Monotone-devel] Re: Support for binary files, scalability and Windows port, graydon hoare, 2004/01/20

Prev by Date: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Next by Date: [Monotone-devel] RE: Support for binary files, scalability and Windows port
Previous by thread: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Next by thread: Re: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Index(es):
- Date
- Thread