From: Zbynek Winkler
Subject: Re: [Monotone-devel] announcing preliminary "dumb server" support for monotone
Date: Fri, 14 Oct 2005 16:37:04 +0200
User-agent: Debian Thunderbird 1.0.2 (X11/20050602)

Nathaniel Smith wrote:

It always exports the whole database.  This could be made smarter, I
guess, but it certainly didn't seem worth the effort for the first
pass.  One of many possible improvements for someone to make, if they
want :-).

:-) Anyway, does it have to export the database at all? Maybe it could build the MerkleDir directly from the database...? But then maybe we would be reimplementing netsync in Python... ;)

I was actually considering the possibility of having some
merkle-related automate commands -- something like
 $ monotone automate merkle_hash certs 0035 net.venge.monotone*
which would give the merkle hash of the 0035-prefixed certs that would
be included in a netsync of net.venge.monotone*.  Calculating merkle
trees is a little expensive, [snip]

Yeah, I've noticed that too. I suppose the trees are rebuilt for each push/pull because different branches can be included every time, right? Would it make sense to precompute the merkle trees for each branch and store them in the db? They could be updated lazily: when adding new things to a branch, delete the hash indexes that would need to be updated; on the next sync, rebuild all the missing hash indexes...
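
The lazy update could look roughly like the sketch below. It is only an illustration: the class name, the two-character prefix buckets, and the way the hashes are combined are assumptions of mine, not monotone's actual merkle tree layout.

```python
import hashlib


class LazyPrefixHashCache:
    """Hypothetical per-branch cache of bucket hashes, invalidated lazily."""

    def __init__(self, prefix_len=2):
        self.prefix_len = prefix_len
        self.items = {}   # branch -> {prefix -> set of item ids}
        self.hashes = {}  # (branch, prefix) -> cached hex digest

    def add(self, branch, item_id):
        prefix = item_id[:self.prefix_len]
        self.items.setdefault(branch, {}).setdefault(prefix, set()).add(item_id)
        # Only drop the stale bucket hash; nothing is recomputed here.
        self.hashes.pop((branch, prefix), None)

    def bucket_hash(self, branch, prefix):
        key = (branch, prefix)
        if key not in self.hashes:
            # Rebuild a missing hash on demand, i.e. at the next sync.
            ids = sorted(self.items.get(branch, {}).get(prefix, ()))
            digest = hashlib.sha1("\n".join(ids).encode("ascii")).hexdigest()
            self.hashes[key] = digest
        return self.hashes[key]

    def root_hash(self, branch):
        # Combine the bucket hashes; untouched buckets come from the cache.
        prefixes = sorted(self.items.get(branch, {}))
        lines = [p + " " + self.bucket_hash(branch, p) for p in prefixes]
        return hashlib.sha1("\n".join(lines).encode("ascii")).hexdigest()
```

Adding an item only invalidates the one bucket it falls into, so a sync after a small commit would recompute a single bucket hash plus the root.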

so again you'd want some sort of in-process caching plus "automate stdio", but this is doable and it
would make it easy to write funky syncers, like say writing a client
script and a cgi to allow syncing a monotone db directly against a
viewmtn install, say.

I would like that :)

However, this doesn't actually solve any of the problems that
merkle_dir.py solves, since all its heavy lifting has to do with the
other end -- how do you deal with a simple remote filesystem to make
it possible to efficiently push and pull.  It isn't trivial to
integrate this with something like the above 'automate merkle_hash'
idea, because then you'd have to make sure you could efficiently
calculate hashes _in the same way monotone does_, and that takes a bit
of thought.

Oh, however however, you're actually quite right -- you could do
something very conceptually simple in just monotone-dumb, which is
implement a class that acts like a MerkleDir, but that is constructed
in memory directly off a monotone database.  Basically, you'd iterate
over the db like do_export does, but instead of actually fetching
stuff, you could just keep track of which ids exist, build HASHES
files in memory, and use them to sync.  Then you pull stuff out of the
db as necessary, when it turns out you want to send it to the remote
side.  That'd be neat.

Yes, that would. I might give it a shot... How hard would it be to implement something like the above (including the precomputed per-branch merkle trees) outside of monotone first, let's say as a Python wrapper? When adding stuff to a branch it would invalidate part of the stored cache and recompute it on the next sync. One merkle dir would store only one branch...?
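
To make the idea concrete, here is a rough sketch of an index that only knows which ids exist and fetches data on demand. Everything in it is hypothetical: `InMemoryMerkleIndex`, `missing_on_remote`, and the `fetch` callback are names I made up, and the bucket hashing is not the real HASHES file format used by merkle_dir.py.

```python
import hashlib


class InMemoryMerkleIndex:
    """Hypothetical stand-in for a MerkleDir that holds ids, not data."""

    def __init__(self, ids, fetch):
        self.fetch = fetch  # callable: id -> bytes, invoked only on demand
        self.buckets = {}   # two-character prefix -> set of ids
        for item_id in ids:
            self.buckets.setdefault(item_id[:2], set()).add(item_id)

    def hashes(self):
        """Return prefix -> digest of the sorted ids in that bucket."""
        result = {}
        for prefix, ids in self.buckets.items():
            text = "\n".join(sorted(ids)).encode("ascii")
            result[prefix] = hashlib.sha1(text).hexdigest()
        return result

    def ids_in(self, prefix):
        return set(self.buckets.get(prefix, ()))


def missing_on_remote(local, remote):
    """Yield (id, data) for items the remote lacks, fetching lazily."""
    remote_hashes = remote.hashes()
    for prefix, digest in sorted(local.hashes().items()):
        if remote_hashes.get(prefix) == digest:
            continue  # identical bucket, nothing to compare
        for item_id in sorted(local.ids_in(prefix) - remote.ids_in(prefix)):
            yield item_id, local.fetch(item_id)
```

Building the index touches no item data at all; `fetch` is called only for the ids that actually have to be sent.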

The only real problem I see is that at the moment monotone doesn't
expose any sort of "handle" on certs that can be used as a unique id
(even though internally it does keep per-cert hashes).  Every item in
the merkle_dir needs to have a unique id; so to generate this, ATM
dumb.py just hashes the text of the cert packet.  This works, but it
causes a problem for the above scheme, because if we discover that
we need the chunk whose id is 0123, and that refers to a cert, we
can't simply ask monotone for the cert 0123 -- we have to keep some
sort of reverse lookup table.  The same problem doesn't arise for
revisions, because dumb.py just uses the revision id, which we _can_
ask for on the fly.
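
The reverse lookup table mentioned above might be sketched like this. The use of SHA-1, the function names, and the shape of the input mapping are assumptions for illustration; the only thing taken from the description is that a cert's id is a hash of its packet text.

```python
import hashlib


def cert_chunk_id(packet_text):
    """Derive a chunk id by hashing the cert packet text (SHA-1 assumed)."""
    return hashlib.sha1(packet_text.encode("utf-8")).hexdigest()


def build_cert_lookup(certs_by_revision):
    """Map chunk id -> (revision id, index into that revision's certs).

    certs_by_revision is a hypothetical mapping from a revision id to the
    list of cert packet texts attached to that revision.
    """
    lookup = {}
    for rev_id, packets in certs_by_revision.items():
        for index, packet in enumerate(packets):
            lookup[cert_chunk_id(packet)] = (rev_id, index)
    return lookup
```

With such a table, a request for chunk 0123 can be turned back into "the n-th cert on revision R", which is something that can be asked for on the fly.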

I think this is a good argument to add cert ids to monotone; it's
something that's been in the back of my mind as a good idea for a long
time anyway.  This is a good use case for them.

Yes, I agree.

Zbynek

--
http://zw.matfyz.cz/     http://robotika.cz/
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic




