monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Call for testing: nvm.experiment.no-split-path


From: Zack Weinberg
Subject: Re: [Monotone-devel] Call for testing: nvm.experiment.no-split-path
Date: Wed, 27 Jun 2007 17:03:49 -0700

On 6/22/07, Derek Scherger <address@hidden> wrote:
On my core2 duo box regenerate_caches is about 4 seconds slower with
your changes. Not sure why it would be faster on my pentium-m laptop.

I still like your change though and it doesn't seem to be causing any
major slowdowns.

I've made a few more changes and got all the speed back; I'm going to
go ahead and merge it.

I'm no so sure that immutable_string is helping either. For example I
did see immutable_string::get showing up high in a few oprofiles a while
ago. I was fiddling with an UNLIKELY for the null case but didn't get
far enough with that to decide if it was good or not. It's not showing
up in gprof profiles on the core2 box at the moment though.

I'm actually more bothered by the symtab stuff than the
immutable_string stuff.  One of the things I did to get all of the
speed back, above, was make path_component not be a vocab item in any
way; get_node creates lots and lots of short-lived path_components
while walking down the directory nodes from the roster root, and
making them live in a hash table is just a waste of time.  This cut
*seven billion clock cycles* off the regenerate_caches runtime (as
measured by callgrind).

We're still copying each path_component, sadly.  The ideal thing would
be to use a pointer+length into the original file_path to index
->children, but I have no idea how to do that with C++ maps.

The profiles I have do seem to be pretty consistent in showing the
dir_map lookups at or near the top at around 8-10% of the total time. I
think Tim changed this to a "hybrid_map" but it seems like there still
may be room for improvement there.

The hybrid_map is for roster_t::nodes, not for dir_t::children.
Having O(1) lookup in dir_t::children would definitely help, but I'm
not sure a hybrid_map is the right way to do it (to be honest though,
I don't know what the right way *is* ... maybe, like, steal the
directory htrees from ext3fs).

There are other, higher-level changes we might should do first,
though.  For instance, for many situations (my regenerate_caches test
is definitely abnormal) we pair *every* get_node with a has_node,
effectively doing the lookup twice, so that we can N() instead of I()
if it's not found.  It would be better if we could write, say,

 node_t n;
 N(maybe_get_node(path, n),
   F("file '%s' is not known to the workspace manifest, try '%s add'")
   % path % ui.prog_name);
 // ... do stuff with n

And I'm looking into more drastic changes to deal with the ridiculous
amount of time we spend in the dynamic_cast<> implementation (nearly
seven percent of total runtime for regenerate_caches).

zw




reply via email to

[Prev in Thread] Current Thread [Next in Thread]