Re: [Monotone-devel] announcing preliminary "dumb server" support for mo
From: Zbynek Winkler
Subject: Re: [Monotone-devel] announcing preliminary "dumb server" support for monotone
Date: Thu, 13 Oct 2005 00:15:50 +0200
User-agent: Debian Thunderbird 1.0.2 (X11/20050602)
Nathaniel Smith wrote:
On Wed, Oct 12, 2005 at 09:34:26PM +0200, Zbynek Winkler wrote:
Hmm. I didn't quite get the picture until I tried it ;). I didn't have
the patience to wait for the local do_export to finish on the monotone
database... But the speed seems to be (unfortunately) comparable to the
verification of the incoming changesets when doing a regular pull. BTW:
No, different problem entirely -- do_export is currently quadratic in
the history length, mostly because it uses a separate invocation of
monotone to request each manifest delta, and since monotone still does
unbounded delta chaining, it takes linear time to retrieve an
arbitrary manifest. (This also applies to files, but files tend to
have much shallower histories than the tree as a whole, so it doesn't
matter there as much.)
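
To make the arithmetic behind "quadratic" concrete, here is a toy model in Python. It is only a sketch: ChainedStore and export_all are made-up names, the chain direction (forward from a full base version) is an assumption chosen for simplicity, and none of this is monotone's actual storage code. It just counts how many deltas get applied when every version is rebuilt from scratch.

```python
class ChainedStore:
    """Toy store: version 0 in full, every later version as a delta
    against its predecessor (direction and layout are assumptions)."""

    def __init__(self, versions):
        self.base = set(versions[0])
        # Each delta is (added, removed) relative to the previous version.
        self.deltas = [
            (set(new) - set(old), set(old) - set(new))
            for old, new in zip(versions, versions[1:])
        ]
        self.applied = 0  # running count of delta applications

    def get(self, index):
        # No cache: always walk the chain from the base version.
        current = set(self.base)
        for added, removed in self.deltas[:index]:
            current = (current - removed) | added
            self.applied += 1
        return current


def export_all(store, count):
    """Fetch every version independently, as a fresh process per request would."""
    for index in range(count):
        store.get(index)
    return store.applied
```

With n versions, rebuilding version i costs i delta applications, so fetching all of them costs 0 + 1 + ... + (n-1) = n(n-1)/2 in this model; for n = 100 that is 4950.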
I am not that familiar with the monotone codebase, so please forgive the
question, but why does this cause do_export to be quadratic? Does it
actually retrieve arbitrary manifests? Shouldn't a request for a
manifest delta take constant time if that is how it is stored in the db?
I've gone over merkle_dir.py and I believe it provides an append-only data
structure (a file called DATA) for arbitrary chunks identified by id
(mostly the hash of the chunk). I see some logic in do_export that
checks whether old_something is already in the merkle_dir; if not, the
whole thing is put in, otherwise a delta between old_something and
new_something is requested. Is this true? Doesn't this requested delta
correspond directly to the delta stored in the monotone db? [BTW: For
some reason I thought monotone does reverse deltas...?] What I do not
understand is how on earth we can have a 'thing' that we do not have the
corresponding 'old_thing' for. And where does the linear search come
from when getting the deltas?
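
The reading of merkle_dir.py and do_export described above can be sketched roughly as follows. This is not the real code: AppendOnlyStore, export_version, make_delta and the "F"/"D" payload prefixes are all hypothetical names invented for illustration, the delta format is a toy, and an in-memory buffer stands in for the DATA file.

```python
import hashlib
import io


def chunk_id(data):
    # Identify a chunk by the SHA-1 hash of its content.
    return hashlib.sha1(data).hexdigest()


def make_delta(old, new):
    # Toy delta: length of the shared prefix, then the differing tail of `new`.
    prefix = 0
    while prefix < min(len(old), len(new)) and old[prefix] == new[prefix]:
        prefix += 1
    return str(prefix).encode() + b":" + new[prefix:]


class AppendOnlyStore:
    """Chunks are only ever appended; an index maps id -> (offset, length)."""

    def __init__(self):
        self.data = io.BytesIO()  # stands in for the DATA file
        self.index = {}

    def __contains__(self, key):
        return key in self.index

    def add(self, key, chunk):
        if key in self.index:
            return  # already present, nothing is rewritten
        offset = self.data.seek(0, io.SEEK_END)
        self.data.write(chunk)
        self.index[key] = (offset, len(chunk))

    def get(self, key):
        offset, length = self.index[key]
        self.data.seek(offset)
        return self.data.read(length)


def export_version(store, old, new):
    """Store `new` as a delta if `old` was already exported, else in full."""
    new_id = chunk_id(new)
    if old is not None and chunk_id(old) in store:
        store.add(new_id, b"D" + chunk_id(old).encode() + make_delta(old, new))
        return "delta"
    store.add(new_id, b"F" + new)
    return "full"
```

In this sketch the "whole thing" path is taken for the very first version and whenever the predecessor has not been exported yet.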
One thing that would help a lot would be to move the packet commands
to automate (where they probably should be anyway), and then teaching
monotone.py to use 'automate stdio'. That way we're using a
persistent monotone process, and the db layer's internal caching
should be able to turn this back into a linear operation.
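
As a rough illustration of why a long-lived process with a cache helps, here is a toy model in Python. It is a sketch under assumptions: CachingStore is a made-up name, the chain runs forward from a full base version, and the cache simply remembers the most recently rebuilt version. It does not reflect how monotone's db layer actually caches.

```python
class CachingStore:
    """Toy delta-chain store that remembers the last version it rebuilt."""

    def __init__(self, versions):
        self.base = set(versions[0])
        # Each delta is (added, removed) relative to the previous version.
        self.deltas = [
            (set(new) - set(old), set(old) - set(new))
            for old, new in zip(versions, versions[1:])
        ]
        self.last_index = 0
        self.last_value = set(self.base)
        self.applied = 0  # running count of delta applications

    def get(self, index):
        if index >= self.last_index:
            # Resume from the cached version instead of the base.
            start, current = self.last_index, set(self.last_value)
        else:
            start, current = 0, set(self.base)
        for added, removed in self.deltas[start:index]:
            current = (current - removed) | added
            self.applied += 1
        self.last_index, self.last_value = index, set(current)
        return current
```

Fetching n versions in order from one such object applies only n-1 deltas in total, because each request continues from where the previous one stopped. A fresh object per request would start from the base every time.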
Hmm. I have yet to find a way (or a machine) to compile monotone in a
reasonable amount of time :(
What is the limiting factor of the verification step? Does it do some
sorting?
No, it's doing a bunch of really torturous checks of different sorts
of data inconsistencies each revision might have. "Torturous" because
our data structures were not well chosen (because when we were first
inventing this stuff, we didn't know as much as we do now). The
rosters code replaces this stuff entirely, and shouldn't suffer from
the same problems. (Instead, it will suffer from new, different
problems! Hopefully less severe, though :-).)
Then I am looking forward to the rosters! :-)
OTOH, it supports monotone's full sync semantics (multiple people can
push to the same "repo", you can have backup "repos" and sync with
them indiscriminately, etc.), and should be reasonably efficient (it
uses merkle tries to do low-overhead set synchronization). Won't be
as fast as netsync, or as flexible (whoever puts up the repo gets to
choose what branches are included, you don't get to pick on the fly
like for netsync), but might be handy for some people...
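
For readers unfamiliar with the idea, set synchronization over a hash trie can be sketched like this in Python. It is a toy, not merkle_dir.py: the names subtree_hash, missing_ids and LEAF_SIZE are invented, ids are assumed to be fixed-length lowercase hex strings, and both sets live in one process, whereas a real client would fetch only the hashes of the remote buckets it needs to compare.

```python
import hashlib

LEAF_SIZE = 4  # assumed threshold below which ids are compared directly


def subtree_hash(ids):
    # Hash of a bucket: digest over its sorted ids.
    digest = hashlib.sha1()
    for item in sorted(ids):
        digest.update(item.encode())
        digest.update(b"\n")
    return digest.hexdigest()


def missing_ids(local, remote, prefix=""):
    """Return ids present in `remote` but not in `local`,
    descending only into trie buckets whose hashes differ."""
    local_here = {i for i in local if i.startswith(prefix)}
    remote_here = {i for i in remote if i.startswith(prefix)}
    if subtree_hash(local_here) == subtree_hash(remote_here):
        return set()  # identical buckets: nothing to transfer
    if len(remote_here) <= LEAF_SIZE:
        return remote_here - local_here  # small bucket: compare ids directly
    result = set()
    for digit in "0123456789abcdef":
        result |= missing_ids(local_here, remote_here, prefix + digit)
    return result
```

Buckets whose hashes match are skipped without looking at their contents, which is where the low overhead comes from when the two sides differ in only a few items.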
How do you pick what branches are included? Doesn't it always export the
whole database? I was surprised to find out that it does not
differentiate between files, changesets etc.
It always exports the whole database. This could be made smarter, I
guess, but it certainly didn't seem worth the effort for the first
pass. One of many possible improvements for someone to make, if they
want :-).
:-) Anyway, does it have to export the database at all? Maybe it could
build the MerkleDir directly from the database...? But then maybe we
would be reimplementing netsync in python... ;)
Zbynek
--
http://zw.matfyz.cz/ http://robotika.cz/
Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic