Re: generic buffer parsing cache data
From: Paul Pogonyshev
Subject: Re: generic buffer parsing cache data
Date: Sun, 1 Jul 2007 16:41:58 +0300
User-agent: KMail/1.7.2
martin rudalics wrote:
> > I propose to add something generic. For instance, Python mode needs to
> > know indentation level of blocks. It seems that `syntax-ppss` doesn't
> > return it at all. And adding everything that might ever be needed by
> > some XYZ mode seems counter-productive and complicates an already complex
> > function and its return value.
> >
> > I just mean that major modes can have needs beyond that suited by
> > `syntax-ppss`. And as far as I can see, they can either parse half of
> > the buffer each time they need something, or invent some ad-hoc custom
> > code for caching such data.
>
> Like `c-state-cache'. Well, `syntax-ppss' can only do whatever
> `parse-partial-sexp' does. Occasionally, that's not even sufficient for
> the Elisp case (look how `lisp-font-lock-syntactic-face-function'
> strives for detecting doc-strings). I'd appreciate if you came up with
> something more "generic" (if you just could give a clear description of
> that term).
For instance, something like this:
  Function: put-cache-data key data &optional pos
    Store cache DATA under the given KEY in the current buffer at
    position POS (if not specified, at point).
  Function: get-cache-data key &optional pos
    Return the cache data associated with the given KEY in the current
    buffer at position POS (if not specified, at point).  If no data
    with that KEY is stored at that position, or if it has been
    invalidated, return nil.
Internally, the Emacs core (at the C level) automatically invalidates cache
data from X onwards when buffer text from X to Y (Y >= X) changes in any
way. Whether cache data is actively removed from internal storage or just
marked invalid is an implementation detail and irrelevant at the Elisp
level.
It is unclear whether changes in text properties should lead to cache
invalidation. Probably not, at least by default.
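The put/get/invalidate semantics described above can be modelled in a few lines. This is only a toy Python model, not Emacs code: the class `BufferCache` and the method `text_changed` are hypothetical names, positions are plain integers, and the model ignores the fact that real buffer positions shift when text is inserted or deleted.

```python
class BufferCache:
    """Toy model of the proposed per-buffer cache (hypothetical, not Emacs code)."""

    def __init__(self):
        self._entries = {}  # maps (key, pos) -> data

    def put(self, key, data, pos):
        """Model of put-cache-data: store DATA under KEY at POS."""
        self._entries[(key, pos)] = data

    def get(self, key, pos):
        """Model of get-cache-data: return the data, or None if absent or invalidated."""
        return self._entries.get((key, pos))

    def text_changed(self, start):
        """Model of the automatic invalidation: drop everything cached at START or later."""
        self._entries = {
            (key, pos): data
            for (key, pos), data in self._entries.items()
            if pos < start
        }


cache = BufferCache()
cache.put("indent", 0, pos=10)
cache.put("indent", 4, pos=50)
cache.text_changed(30)          # text changed from position 30 onwards
print(cache.get("indent", 10))  # 0, still valid (before the change)
print(cache.get("indent", 50))  # None, invalidated
```

As in the proposal, a stored value of None (nil) cannot be told apart from a missing or invalidated entry.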
It also makes sense to define some `anchors'. These would partition a
buffer into parts, so that changes in one part don't invalidate cache
data in other parts. For instance, in Python mode anchors would be set
wherever a toplevel block is defined, since parsing stops on reaching a
toplevel block anyway. However, this can be added later. For instance,
it is not clear when and how to remove anchors. (E.g., in Python mode, if
a toplevel block is indented to another level, it should stop being an
anchor.)
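One possible reading of the anchor idea, sketched as a toy Python model: a change at some position invalidates cached data only up to the next anchor after it. The names `AnchoredCache`, `add_anchor` and `text_changed` are hypothetical, anchor removal is left out because the text above leaves it open, and position shifts after edits are ignored.

```python
import bisect


class AnchoredCache:
    """Toy model of a cache partitioned by anchors (hypothetical, not Emacs code)."""

    def __init__(self):
        self._entries = {}  # maps (key, pos) -> data
        self._anchors = []  # sorted anchor positions

    def add_anchor(self, pos):
        if pos not in self._anchors:
            bisect.insort(self._anchors, pos)

    def put(self, key, data, pos):
        self._entries[(key, pos)] = data

    def get(self, key, pos):
        return self._entries.get((key, pos))

    def text_changed(self, start):
        """Invalidate from START up to, but not including, the next anchor after START."""
        i = bisect.bisect_right(self._anchors, start)
        end = self._anchors[i] if i < len(self._anchors) else float("inf")
        self._entries = {
            (key, pos): data
            for (key, pos), data in self._entries.items()
            if not (start <= pos < end)
        }


cache = AnchoredCache()
cache.add_anchor(0)     # first toplevel block
cache.add_anchor(100)   # second toplevel block
cache.put("indent", 0, pos=20)
cache.put("indent", 4, pos=60)
cache.put("indent", 0, pos=120)
cache.text_changed(50)           # edit inside the first toplevel block
print(cache.get("indent", 20))   # 0, before the change
print(cache.get("indent", 60))   # None, same part and after the change
print(cache.get("indent", 120))  # 0, protected by the anchor at 100
```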
A major mode must store cache data at some logical position, so that it
can later find the data again. Maybe it also makes sense to add
  Function: find-cache-data key &optional pos
    Find and return cache data with the given KEY at POS (or at point)
    or _before it_.  Return nil if there is no (valid) cached data with
    that KEY at POS or anywhere before it.
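The "at POS or before it" lookup amounts to a predecessor search over the positions stored for a key. A minimal Python sketch, assuming positions are kept in a sorted list per key; `KeyedCache` is a hypothetical name, and invalidation is omitted to keep the example short:

```python
import bisect


class KeyedCache:
    """Toy model of find-cache-data (hypothetical, not Emacs code)."""

    def __init__(self):
        self._positions = {}  # maps key -> sorted list of positions
        self._data = {}       # maps (key, pos) -> data

    def put(self, key, data, pos):
        if (key, pos) not in self._data:
            bisect.insort(self._positions.setdefault(key, []), pos)
        self._data[(key, pos)] = data

    def find(self, key, pos):
        """Return data stored under KEY at POS or at the nearest earlier position."""
        positions = self._positions.get(key, [])
        i = bisect.bisect_right(positions, pos)  # number of stored positions <= pos
        if i == 0:
            return None
        return self._data[(key, positions[i - 1])]


cache = KeyedCache()
cache.put("indent", "at 10", pos=10)
cache.put("indent", "at 40", pos=40)
print(cache.find("indent", 45))  # at 40
print(cache.find("indent", 40))  # at 40
print(cache.find("indent", 5))   # None
```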
However, I don't see any obvious ways of using it. As far as I can see,
modes should access cache data like this (in pseudocode):
  mode-get-cache-data:
    data = (get-cache-data mode-key)
    if data is nil:
      data = (mode-compute-cache-data)
      (put-cache-data mode-key data)
    return data

  mode-compute-cache-data:
    save-excursion:
      travel-to-higher-level-cache-point
      higher-level-data = (mode-get-cache-data)
      data = (mode-compute-data-from-higher-level higher-level-data)
    return data
Here `higher-level' is not the same as `previous'. For instance, in
Python mode it makes sense to compute indentation from the block this one
is nested in, not just from the previous block:
  class X:
      class Y:                 # <-- higher-level block for the current block
          class Z:
              def bla ():      # <-- previous block (with cached data)
                  pass
          def __init__(self):  # <-- current block
              pass
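The access pattern and the Python example can be combined into a small runnable model. This is only a sketch under stated assumptions: a plain dict stands in for put-cache-data/get-cache-data, the "higher-level" block is taken to be the nearest earlier line with smaller indentation, and the cached datum is simply the nesting depth. All names (`enclosing_block`, `nesting_depth`, `computed`) are hypothetical.

```python
SOURCE = """\
class X:
    class Y:
        class Z:
            def bla():
                pass
        def __init__(self):
            pass
"""

LINES = SOURCE.splitlines()
cache = {}     # line index -> nesting depth; stands in for the proposed cache
computed = []  # line indices whose depth was actually computed (cache misses)


def indent_of(i):
    line = LINES[i]
    return len(line) - len(line.lstrip())


def enclosing_block(i):
    """Return the index of the nearest earlier line with smaller indentation, or None."""
    for j in range(i - 1, -1, -1):
        if indent_of(j) < indent_of(i):
            return j
    return None


def nesting_depth(i):
    """Return the depth of line I, computing it from the higher-level block on a miss."""
    if i in cache:
        return cache[i]
    computed.append(i)
    parent = enclosing_block(i)  # "travel to higher-level cache point"
    depth = 0 if parent is None else nesting_depth(parent) + 1
    cache[i] = depth
    return depth


print(nesting_depth(3))  # 3: def bla, nested in X, Y and Z
print(nesting_depth(5))  # 2: def __init__, computed from class Y, not from bla
print(computed)          # [3, 2, 1, 0, 5]
```

The second call computes only line 5: its higher-level block (class Y, line 1) was already cached while handling the first call, and the data cached for the previous block (def bla) is never consulted.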
Paul