emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Overlays as an AA-tree


From: Stefan Monnier
Subject: Re: Overlays as an AA-tree
Date: Wed, 21 Sep 2016 10:52:23 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

> What I've been looking at is replacing the current implementation of
> buffer overlays with a self balanced tree. I chose an AA-tree as that
> seemed simple enough for me to grasp :).

Could you describe how you use this tree?  An AA-tree is indexed by
a single "number", whereas overlays have two independent "numbers".
So how to use this AA-tree to implement an *interval* tree?

This should be documented in the code.  E.g. the "struct Lisp_Overlay"
should define very clearly the meaning of char_start, char_end, left,
and right (i.e. which overlays go to the left, and which go to the
right).

While looking for an answer in the text I found:

    /* This function determines where overlays are inserted in the tree.
       FIXME: I didn't think too hard about this...
    */

which makes me suspect your design might not be quite right.
Have you read https://en.wikipedia.org/wiki/Interval_tree ?

[ BTW, our convention is to put the "*/" at the end of the last line
  rather than alone on the next line.  ]

This said, reading overlay_tree_insert_internal, I have the impression
that you're implementing what wikipedia describes as an "Augmented tree"
where the basic tree is your AA-tree, indexed by the left position of
the overlays, with the `max` field being the "augmented" data, which
sounds just fine.
Is that right?

> Anyway, I basically have two questions for you experts:
> 1) Is it worth continuing down this path?

I don't see why not.

> 2) If so, what's the best way to go about something like this?  Replacing
> the whole thing at once seems very hard, but I don't really know how to
> go about replacing it little by little.

The only places you absolutely need to replace are all the places that
need to modify the tree.  There shouldn't be that many of them
(basically, make-overlay, move-overlay, delete-overlay, plus text
insertion/deletion).
Then the rest can be modified little by little.

Some tricky parts:
- because of the insertion_type, two overlays may start with end1<end2
  and finish with end1>end2 without any call to move-overlay.
- the overlay tree needs to be "weak" (i.e. we'll need to add special
  code in the GC).

> I'm attaching the diff. It is an unholy mess of ifdefs, but the meat of
> it is in overlays.[ch] and buffer.c. It is a pretty big diff, and it's
> based on master from a few months ago, so I'm not even sure it applies,

I wouldn't worry about merging (better yet: merge from master right away
and then keep doing that on a regular basis, which should be easy since
I don't think we've touched (nor will touch) this code very much).

> Well, the Lisp-visible APIs weren't really the problem. The problem was
> more in the 'internal' handling of overlays in buffer.c and in xdisp.c,
> and also the fact that I had to reimplement a lot of the logic about how
> markers are moved when the buffer changes. Speaking of which, is the
> byte position stored in a marker of any significance in an overlay?
> Otherwise I could at least get rid of those.

AFAIK, the byte-position of markers is used, but the byte-position of
overlays isn't, so you should be able to get rid of them.

Richard said:
> Since overlays have to be discarded whenever text is copied,
> the need to go through the text properties and discard those
> that are really overlays could make things substantially slower.

My original idea was to keep overlays inside the intervals tree, but
that'd be done by adding some fields to the interval struct, so you
wouldn't need to do any extra work to "discard" the overlays when
copying an interval tree: just don't copy the overlay-related fields
while copying the interval tree.


        Stefan




reply via email to

[Prev in Thread] Current Thread [Next in Thread]