gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] [BUG] FEATURE PLANS: "perfect" summary deltas


From: John Meinel
Subject: Re: [Gnu-arch-users] [BUG] FEATURE PLANS: "perfect" summary deltas
Date: Fri, 09 Jul 2004 09:56:30 -0500
User-agent: Mozilla Thunderbird 0.7 (Windows/20040616)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matthew Dempsky wrote:

| Tom Lord <address@hidden> writes:
|
| As to the builder algorithm, why not simply use Dijkstra's shortest
| path algorithm using patch size as edge weight and an edge anywhere a
| patch or patch summary exists?  (Also a starting node representing an
| empty directory, so zero-weight edges between it and revlib'd or
| pristine revisions and cacherev-weight edges between it and
| cacherevs'd revisions.)  It would be general enough to still meet
| Aaron's hypothetical smart server, while simple enough to fulfill
| Tom's no-more-complicated-than-necessary goal.

It's nice to see that the summary deltas would work well. I think you
picked a good algorithm for computing the shortest path, my concern,
though is that you may not want to get complete knowledge.

At least to my understanding, because of the round-trip behavior of the
http access, you have to do a lot of slow queries to get a full
inventory of what patch files you are interested in. If the sizes aren't
huge, it might be faster to have a less intelligent algorithm that
doesn't search so hard.

I don't really know http, but since you need the .listing files, I'm
guessing it's likely that you can't just say: "Give me the name and size
of every file below this directory". Sort of like:
find "path/" -type f -print0 | xargs -0 ls -sk

If you are using .listing files can you even get the file size? I know
it isn't in the .listing, but maybe there is an http stat. My guess is
you actually have to initiate a download before you get the http header
"Content-Length:"

If you can't get a size, then I would say you could approximate it with
patches are 1/8, summaries are 1/4, and cacherev's are 1, or something
like that. It's not perfect, but it might be a good approximation. We
could fine-tune these parameters with actual numbers anyway.

If you can get a size, but it is a little expensive, what if you used
(what I think of as) graph search instead of Dijkstra? The basic
difference being in graph search you don't touch a node until it is
potentially more favorable than another path. The Dijkstra I know looks
at every node, and just evaluates them with local operations.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Cygwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFA7rIeJdeBCYSNAAMRAgWlAKC/CQZJYeCVjtYaU4xah92TKYlIGQCfVe95
KnOffmp8gS6nMKkQwXUJUic=
=jhZb
-----END PGP SIGNATURE-----




reply via email to

[Prev in Thread] Current Thread [Next in Thread]