emacs-devel

Re: Bug #25608 and the comment-cache branch


From: Dmitry Gutov
Subject: Re: Bug #25608 and the comment-cache branch
Date: Wed, 22 Feb 2017 04:25:53 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.0

On 14.02.2017 18:38, Stefan Monnier wrote:

> Like all the sexp movement functions, `forward-comment` is allowed to
> assume that the starting position is outside of comments/strings, so it
> doesn't need to consult the cache to see if it's within a string.

I see, thanks. And I think that means that, ideally, it would work without the caller having to adjust the syntax visibility bounds, or the like, as long as the syntax table is correct and the beginning (or the end) of the currently navigated comment is within view.

> In the case we do scan forward (e.g. the case where we end up using
> parse-partial-sexp (or syntax-ppss in my patch)), we actually manually
> re-introduce that behavior: if the forward parse says that the
> end-comment-marker is inside a string (or inside another comment), we
> re-parse from the beginning of that string (or comment) to try and see
> if that end-comment-marker could be considered to close a comment nested
> within the string (or the other comment).

That indeed sounds complex.

> Calling syntax-ppss every time back_comment is invoked would probably
> result in bad performance currently: when parsing backward
> (e.g. backward-sexp), the syntax-ppss-last optimization is ineffective,
> so we'd fall back on syntax-ppss-cache which ends up scanning on
> average syntax-ppss-max-span/2 (i.e. 10K) chars.  When \n is a comment
> ender (i.e. in most programming language modes), it would imply
> a forward scan of 10K for every line.

You're probably right, but I wonder what the benchmarks would say.

(parse-partial-sexp 1 10000) takes 0.0005 seconds here, so it would still take fairly intensive usage for this to show up on the user's radar.
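A rough way to reproduce that measurement is sketched below. The helper name `my/time-forward-parse` is hypothetical (it is not part of Emacs), and the figure it returns depends on the machine, the buffer contents and the major mode, so treat it as an estimate only. As a back-of-envelope check, at 0.0005 seconds per 10K scan, stepping backward over an assumed 1000 newline-terminated comments would cost roughly 1000 × 0.0005 = 0.5 seconds in total.

```elisp
(require 'benchmark)

;; Hypothetical helper for illustration; not an Emacs function.
(defun my/time-forward-parse ()
  "Return the average time in seconds of one forward parse.
The parse covers the first 10000 positions of the current buffer
(or the whole buffer if it is shorter), averaged over 100 runs."
  (let* ((end (min (point-max) 10000))
         ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
         (result (benchmark-run 100
                   (parse-partial-sexp (point-min) end))))
    (/ (car result) 100)))
```

Evaluating `(my/time-forward-parse)` in a large source buffer gives a per-scan figure that can be compared against the number above.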

Previously, we started from the beginning of the current defun, as delineated by an open paren in the first column, right?

I've seen function definitions longer than 10000 chars.

> IOW, for such an approach to work, we'd have to rework syntax-ppss to be
> faster when scanning backward (e.g. reduce syntax-ppss-max-span, which
> would have other repercussions).

Perhaps we could use the "generic comment bounds" syntax-table property to delineate such difficult comments. If that idea sounds similar to comment-cache, that is no accident.
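As a minimal sketch of what marking such a comment could look like, the code below puts the generic comment delimiter syntax class ("!") on the first and last characters of a region, then asks the parser whether a position in between is inside a comment. The function names and the sample text are hypothetical, and this only illustrates the text-property mechanism, not how comment-cache or any proposed patch actually works.

```elisp
;; Hypothetical helpers for illustration; not Emacs functions.
(defun my/mark-generic-comment (beg end)
  "Mark the text between BEG and END as one generic comment.
Puts the generic comment delimiter syntax class on the first and
last characters of the region."
  (with-silent-modifications
    (put-text-property beg (1+ beg)
                       'syntax-table (string-to-syntax "!"))
    (put-text-property (1- end) end
                       'syntax-table (string-to-syntax "!"))))

(defun my/generic-comment-demo ()
  "Return non-nil if a position inside the marked region is in a comment."
  (with-temp-buffer
    ;; The parser only honors `syntax-table' properties when this is set.
    (setq-local parse-sexp-lookup-properties t)
    (insert "foo <<tricky \"text>> bar")
    (goto-char (point-min))
    (search-forward "<<")
    (let ((beg (match-beginning 0)))
      (search-forward ">>")
      (my/mark-generic-comment beg (point))
      ;; Element 4 of the parse state is non-nil inside a comment.
      (nth 4 (parse-partial-sexp (point-min) (+ beg 5))))))
```

Note that the unbalanced double quote inside the marked region does not confuse the parse, which is the property that makes this approach attractive for "difficult" comments.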

But we should try to limit the incompatibility with mixed modes by only caching the beginnings of comments which contain strings, nested comments, etc. Better suggestions welcome (use a tree data structure instead of in-buffer text properties?).

I've only recently come to the realization that our usage of the syntax-table text property has the same general incompatibility with mixed-mode buffers as comment-cache does. The only reason it doesn't show as much is that we use it relatively rarely. But we couldn't, for instance, apply a "generic string" syntax to some literal in a subregion that is inside a "generic string" belonging to the primary major mode. Not sure what to do about that.


