From: Ken Raeburn
Subject: Re: [Emacs-diffs] master 064701d 2/3: Increase the obarray size.
Date: Sat, 31 Dec 2016 15:54:58 -0500

> On Dec 31, 2016, at 09:18, Stefan Monnier <address@hidden> wrote:
> 
>>    In a typical GNU/Linux/X11 build, we wind up with over 15k symbols by
>>    the time we've started.  The old obarray size ensured an average chain
>>    length of 10 or more.
>>    * src/lread.c (OBARRAY_SIZE): Increase to 15121.
> 
> Have you checked the performance impact?  E.g. compared to a length of 8K?

I got a slight improvement over the old setting when loading dumped.elc, not a 
huge one, but I didn’t compare against a value in the 8k neighborhood.

I’m not sure what a better test might be.  I suppose I could create a test
case with a list of strings matching the interned symbols found in dumped.elc
in content (which depends on the configuration), in number of repetitions
(which depends on the #N# hack I recently pushed to the scratch/raeburn-startup
branch), and in ordering (which depends on the obarray size via mapatoms, of
course!), and then time *just* interning those strings into a new obarray
preloaded with the symbols created in the Emacs C code, without the rest of
the reader code.  But I’m not sure that symbol set is an optimal test;
certainly, if we don’t go the big-elc route, the timing for that specific set
of intern operations won’t be something worth optimizing for.  And such a test
strips out the cache contention and other effects of reading bytes, checking
for Unicode input, allocating cons cells, and so on, for better or worse.
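
For what it’s worth, a rough Emacs Lisp sketch of such a micro-benchmark might
look like the following.  It is only an approximation of the test described
above: it interns the names of whatever symbols exist in the running Emacs
rather than the exact dumped.elc set (with its repetitions and ordering), and
the sizes passed in are just examples.

(require 'benchmark)

;; Sketch: time interning one batch of symbol names into a fresh
;; obarray of the given SIZE.  The symbol set is just whatever is
;; interned in the running Emacs, not the exact dumped.elc set.
(defun my-obarray-intern-time (size)
  (let ((names '()))
    (mapatoms (lambda (sym) (push (symbol-name sym) names)))
    (let ((ob (make-vector size 0)))
      ;; benchmark-run returns (ELAPSED GC-COUNT GC-ELAPSED).
      (benchmark-run 1
        (dolist (name names)
          (intern name ob))))))

;; Compare, say, the old ~1.5k size, an 8k-ish size, and the new one:
;; (mapcar #'my-obarray-intern-time '(1511 8191 15121))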

> I do believe 15K is better than 1.5K, but an average length of 1 sounds
> like the hash array might end up being larger than the optimal size.

It’s 1 if you look at all the slots, but about 37% of the slots should remain 
empty (it’s a standard balls-in-bins question in probability), putting all the 
symbols (and all the symbol lookups) in the other 63%, so the slots actually 
used have an average length of about 1.6.
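
In back-of-the-envelope form (treating each slot as independent, so the exact
figures depend on how evenly the hash actually spreads the names):

;; With load factor L = symbols/slots, about e^-L of the slots stay
;; empty, so the nonempty slots see an average chain of L / (1 - e^-L).
(defun my-chain-stats (symbols slots)
  (let ((l (/ (float symbols) slots)))
    (list :empty-fraction (exp (- l))
          :mean-nonempty-chain (/ l (- 1 (exp (- l)))))))

;; (my-chain-stats 15000 15121) => empty ~0.37, nonempty chain ~1.6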

Of course, this ignores uneven distribution of the hash function over the 
actual, real-world set of symbol names, and since the symbols aren’t equally 
likely in the input, chain length is only a rough approximation for actual 
run-time cost.  We might be much more affected by where “nil” winds up in its 
symbol chain than by where “describe-mode” winds up in its chain.  Also, in the 
macOS Nextstep build, it’s more like 21k symbols to start (and ~1/4 empty 
slots, and nonempty-chain length 1.9 I think).
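(Running the same rough estimate from the sketch above on those numbers,
(my-chain-stats 21000 15121) gives an empty fraction of ~0.25 and a
nonempty-chain length of ~1.85, which is in the same ballpark.)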

So, yeah, the initial utilization level is perhaps a bit low.  In my testing 
with the Linux “perf” tools, it appeared that examining symbols — actually, 
reading the string size fields from memory — was where the CPU tended to stall 
and waste most of its cycles, so I was aiming to keep the chains short, at a 
bit of a cost in memory size and cache contention in the obarray itself.  I’m 
happy to run some more tests and look for a better size, if you think it’s 
important.

Really, I think an obarray should be an opaque object that can automatically
resize itself (hash table?) or reorganize itself (tree?), rather than
pretending to be a fixed-size array with some symbols visible and some hiding
invisibly inside it.  But it doesn’t seem crucial enough to performance to
actually do anything about right now.

Ken

