emacs-devel

Re: Release plans


From: Thomas Lord
Subject: Re: Release plans
Date: Tue, 02 Sep 2008 13:43:24 -0700
User-agent: Thunderbird 1.5.0.5 (X11/20060808)

Stephen J. Turnbull wrote:
> Thomas Lord writes:
>
>  > of markers, but because this will be a self-adjusting tree
>  > all those operations will be O(1) in the expected case where
>  > changes display pretty good locality.
>
> I would advise you not to expect that case.

That's two of you now (saying that).  OK, then, I'll assume the
worst case (which should still be "good enough").


> Experience in XEmacs
> shows that some applications (the VM MUA in particular, but also the
> old implementation of font-lock, dunno about jit-lock) like to run up
> and down the buffer a lot.  AFAICS it's really important to have
> O(log N) worst-case behavior.
> Sorry I can't be much more precise about why this happens, I just know
> that our algorithms that deal with extents (which we use to support
> overlay-like behavior and text properties) are tuned for good locality
> and lose badly in large buffers; they show up as time hogs in profiling.


I looked at (an old version of?) the XEmacs internals manual.

It says extents are implemented as a pair of gap buffers.  It also talks
about the concept of cached stacks of extents -- e.g., a cache
of the list of extents over a given point, from which the extents over
a nearby point can be quickly computed.

I don't know if that's current.
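Here is a minimal sketch of that cached-extent-stack idea as I read it from the manual.  The type and function names are mine, not XEmacs's, and a real implementation would update the cache incrementally when moving to a nearby point rather than rescanning:

```c
#include <stddef.h>

/* Hypothetical extent: a [start, end) range over buffer positions.
   This mirrors my reading of the XEmacs manual, not its actual types. */
typedef struct { long start, end; } extent;

/* Cache of the extents covering one position.  A warm cache answers
   repeated queries at the same point for free; this sketch rescans
   the full extent array on any other point. */
typedef struct {
    long pos;                /* position the cache describes; -1 = cold */
    size_t count;            /* number of extents covering pos */
    const extent *hits[16];  /* the covering extents themselves */
} extent_cache;

/* Return how many extents cover pos, filling the cache as a side effect. */
size_t extents_at(const extent *all, size_t n, long pos, extent_cache *c)
{
    if (c->pos == pos)
        return c->count;     /* warm hit: repeated query at this point */
    c->pos = pos;
    c->count = 0;
    for (size_t i = 0; i < n; i++)
        if (all[i].start <= pos && pos < all[i].end && c->count < 16)
            c->hits[c->count++] = &all[i];
    return c->count;
}
```

The point of the cache is that extents over position p and over p+1 usually differ by at most a few entries, so an incremental adjustment is much cheaper than a rescan -- but only if queries really do stay nearby.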

The gap buffer and extent-stack cache would fit your
description of "optimized for locality" and would exhibit
the (performance) failure mode you describe under
access patterns like the ones you describe ("[running] up and down
the buffer").  In that access pattern, you're frequently
moving lots of data across the gap, and if you're jumping around
at all during the traversal you're constantly getting
cache misses or stale hits on the extent-stack cache.
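To make the gap-motion cost concrete, here is a minimal gap-buffer sketch (my own toy code, not Emacs's or XEmacs's): insertion at the gap is a single byte write, but moving the gap memmoves every byte between the old and new location, which is exactly what hurts when you run up and down the buffer:

```c
#include <string.h>
#include <stddef.h>

/* Minimal gap buffer: text lives in buf[], with a hole (the gap)
   between gap_start and gap_end.  The "moved" counter tallies bytes
   shifted, purely to illustrate the cost of far jumps. */
typedef struct {
    char buf[256];
    size_t gap_start, gap_end;   /* [gap_start, gap_end) is the gap */
    size_t moved;                /* total bytes shifted so far */
} gapbuf;

void gap_move(gapbuf *g, size_t pos)   /* pos is in text coordinates */
{
    if (pos < g->gap_start) {          /* shift text right, gap left */
        size_t n = g->gap_start - pos;
        memmove(g->buf + g->gap_end - n, g->buf + pos, n);
        g->gap_start -= n;
        g->gap_end -= n;
        g->moved += n;
    } else if (pos > g->gap_start) {   /* shift text left, gap right */
        size_t n = pos - g->gap_start;
        memmove(g->buf + g->gap_start, g->buf + g->gap_end, n);
        g->gap_start += n;
        g->gap_end += n;
        g->moved += n;
    }
}

void gap_insert(gapbuf *g, size_t pos, char ch)
{
    gap_move(g, pos);
    g->buf[g->gap_start++] = ch;       /* O(1) once the gap is here */
}
```

Sequential insertions at the gap never move anything; one jump to the far end of an N-byte buffer moves O(N) bytes.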

The splay tree of gap buffers should do considerably better
with those access patterns.  That's the theory, anyway:

The new data structure has an operation equivalent
to "move the gap" (so, e.g., insertions just write to
a linear piece of memory); however, gap motion
costs O(log N) in the buffer size in terms of tree operations
and moves only an O(1) amount of text.  Marker
updates, getting the position of a marker, and getting
back the list of text properties over a given point are all
O(log N) -- O(1) with good locality.
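A sketch of the core lookup, under my own assumptions about the structure (the node layout and names are hypothetical): each tree node holds one chunk of text plus its subtree's total length, so a buffer position resolves to a (node, offset) pair in O(depth).  The splay rotations that give the amortized O(log N) bound and the O(1)-under-locality behavior are omitted here for brevity:

```c
#include <stddef.h>

/* One node in a "splay tree of gap buffers": a chunk of text plus
   the total text length of its subtree.  The chunk's own gap buffer
   is elided; only the positional descent is shown. */
typedef struct chunk {
    struct chunk *left, *right;
    size_t len;        /* bytes of text held in this chunk itself */
    size_t subtree;    /* len + subtree lengths of both children */
} chunk;

/* Descend to the chunk containing buffer position pos; *off receives
   the offset within that chunk.  After this, "moving the gap" only
   moves bytes within the found chunk: O(log N) descent, O(chunk)
   bytes of motion.  A splaying version would also rotate the found
   chunk to the root so nearby accesses become O(1) amortized. */
chunk *chunk_at(chunk *t, size_t pos, size_t *off)
{
    for (;;) {
        size_t left_len = t->left ? t->left->subtree : 0;
        if (pos < left_len) {
            t = t->left;                       /* position is to our left */
        } else if (pos < left_len + t->len) {
            *off = pos - left_len;             /* position is in this chunk */
            return t;
        } else {
            pos -= left_len + t->len;          /* skip left subtree + self */
            t = t->right;
        }
    }
}
```

Keying by subtree length rather than by absolute position is what makes marker maintenance cheap: an insertion only adjusts `len`/`subtree` counts on the root path instead of renumbering every position to the right.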

When I say "locality" I'm using the term more loosely than
it is used when talking about ordinary gap buffers.
With an ordinary gap buffer, the optimized case is a
lot of action near one point in the buffer, then an
infrequent move far away, then lots more closely
placed action.  When I say "locality" I mean to
include (a) cases where the action at a given time
forms a dense set, even if that set is scattered, and (b)
linear access patterns.  By "dense set" I mean that
programs can be simultaneously modifying, say,
5 widely separated areas of the buffer in a multiplexed
way -- every modification is probably close to a recent
modification (so the set is dense), but every modification
is also probably far from the *most* recent modification
(so the set is scattered).  By "linear access" I mean cases
where code is scanning forwards or backwards.
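A toy cost model (mine, not measured Emacs behavior) shows why the multiplexed pattern is the worst case for a single gap: each edit pays |gap - target| bytes of movement, so cycling between far-apart hot spots costs O(separation) per edit even though every edit is close to *a* recent edit:

```c
#include <stdlib.h>
#include <stddef.h>

/* Sum of bytes a single gap buffer would memmove while servicing a
   sequence of edit positions: each edit drags the gap from wherever
   the previous edit left it.  Purely illustrative. */
long gap_cost(const long *edits, size_t n)
{
    long gap = 0, cost = 0;
    for (size_t i = 0; i < n; i++) {
        cost += labs(edits[i] - gap);   /* bytes shifted across the gap */
        gap = edits[i];
    }
    return cost;
}
```

Alternating edits between two points a megabyte apart costs a megabyte of memmove per edit; the same number of edits confined to one neighborhood costs almost nothing.  A structure that keeps each hot spot in its own chunk pays only per-chunk costs instead.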



> It's possible it's something internal to the implementation of
> extents, too, but I think a word to the wise is appropriate here.

If extents are still the double gap buffer plus extent-stack
cache, then I can see where you'd have problems like those
you describe.  In theory, this new thing does better in
those cases.

Thanks,
-t





