From: | Daniel Colascione |
Subject: | Re: PL support (was: Drop the Copyright Assignment requirement for Emacs) |
Date: | Sat, 09 May 2020 08:50:45 -0700 |
User-agent: | AquaMail/1.24.0-1585 (build: 102400006) |
On May 9, 2020 8:35:25 AM Eli Zaretskii <address@hidden> wrote:
> > From: João Távora <address@hidden>
> > Date: Sat, 9 May 2020 16:25:36 +0100
> > Cc: Eli Zaretskii <address@hidden>, Stefan Monnier <address@hidden>, emacs-devel <address@hidden>
> >
> > I think Eli has indicated that LSP support in the core is desirable at some point
>
> Not only desirable: long overdue. Emacs must learn to use the latest technologies of supporting programming languages based on real parsing, because the time when it could be done with regular expressions and similar techniques has come and gone. We cannot enable significant new IDE-like features if we don't acquire these technologies. Please someone start working on this ASAP. We sorely need that, just look at the recent discussions on Reddit that underline these deficiencies in Emacs.
It's a hard problem. A mode based on a real parser must be fast, incremental, and robust against transient errors that arise in the normal course of editing. We'd also want the ability to parse complex languages (far beyond LALR) and incorporate out-of-band information in order to resolve semantic ambiguities --- e.g. the C++ template problem. On top of all that, the parser would need to be highly malleable to make it easy to adjust to slight differences in dialect as well as deal with multiple languages in a buffer.
I've given the subject a bit of thought. One line of investigation consisted of doing GLR parses of each buffer, producing parse forests, and disambiguating the parse forests using semantic rules. The nice thing about GLR parsers is that they accept arbitrary context-free grammars, and context-free grammars are closed under composition (e.g. union and concatenation), so you can build arbitrary multi-modes. But this approach is, well, complicated. I'm not sure whether anyone's done it, even in research. (But I haven't searched the literature lately.) Robustness in the face of errors could be helped by something like http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.6885&rep=rep1&type=pdf
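To make the "parse forest plus semantic rules" idea concrete, here is a rough Python sketch of the C++ template problem. It is not a GLR parser: the hypothetical parse_forest function only imitates the output a GLR parse might produce for the one statement shape `X < Y > Z;`, and disambiguate stands in for a semantic rule that consults out-of-band information (here, an assumed set of known template names). All names are made up for illustration.

```python
import re
from typing import List, Set, Tuple

Tree = Tuple  # nested tuples, e.g. ("decl", ("template-id", "a", "b"), "c")

def parse_forest(src: str) -> List[Tree]:
    """Return every reading of `X < Y > Z;` (a stand-in for a real GLR parse)."""
    m = re.fullmatch(r"\s*(\w+)\s*<\s*(\w+)\s*>\s*(\w+)\s*;\s*", src)
    if m is None:
        return []
    x, y, z = m.groups()
    return [
        ("decl", ("template-id", x, y), z),   # declaration: x<y> z;
        ("expr", (">", ("<", x, y), z)),      # expression: (x < y) > z;
    ]

def disambiguate(forest: List[Tree], template_names: Set[str]) -> List[Tree]:
    """Semantic rule: keep a tree only if it agrees with the symbol table."""
    kept = []
    for tree in forest:
        if tree[0] == "decl":
            head = tree[1][1]          # x in ("template-id", x, y)
            ok = head in template_names
        else:
            head = tree[1][1][1]       # x in (">", ("<", x, y), z)
            ok = head not in template_names
        if ok:
            kept.append(tree)
    return kept

forest = parse_forest("a < b > c;")
as_decl = disambiguate(forest, {"a"})   # `a` is known to be a template
as_expr = disambiguate(forest, set())   # `a` is an ordinary variable
```

With `a` registered as a template, only the declaration tree survives; with an empty symbol table, only the comparison expression does. A real implementation would share subtrees between alternatives instead of listing whole trees.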
One nice thing about using formal grammars is that you can analyze and transform them, e.g., using the autobracket transform from the paper linked above, or even deriving semantic autocompletion automatically.
A much simpler approach I've also had in mind is providing a C-assisted, incremental parser-combinator facility to Lisp --- something like the venerable pyparsing. Parser combinators make it pretty easy to incorporate error recovery rules, and the C code can use approaches like those of the current syntax-ppss to keep the parse up to date and cheaply maintain an AST.
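As a rough illustration of why error recovery fits naturally into parser combinators, here is a minimal hand-rolled sketch in Python. It does not use pyparsing and says nothing about how a C-assisted Emacs facility would look. The combinator names (lit, seq, many, recover) and the toy grammar of `name = name;` statements are assumptions made for the example. It is also not incremental: it reparses from position 0 every time.

```python
from typing import Callable, Optional, Tuple

# A parser maps (text, pos) to (value, new_pos), or None on failure.
Result = Optional[Tuple[object, int]]
Parser = Callable[[str, int], Result]

def skip_spaces(text: str, pos: int) -> int:
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def lit(s: str) -> Parser:
    """Match the literal string s after optional whitespace."""
    def parse(text: str, pos: int) -> Result:
        pos = skip_spaces(text, pos)
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def ident(text: str, pos: int) -> Result:
    """Match a run of alphanumerics or underscores after optional whitespace."""
    pos = skip_spaces(text, pos)
    end = pos
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    return (text[pos:end], end) if end > pos else None

def seq(*parsers: Parser) -> Parser:
    """Run parsers in order; fail if any of them fails."""
    def parse(text: str, pos: int) -> Result:
        values = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return values, pos
    return parse

def many(p: Parser) -> Parser:
    """Apply p zero or more times, stopping on failure or lack of progress."""
    def parse(text: str, pos: int) -> Result:
        values = []
        while True:
            r = p(text, pos)
            if r is None or r[1] == pos:
                return values, pos
            value, pos = r
            values.append(value)
    return parse

def recover(p: Parser, sync: str) -> Parser:
    """If p fails, skip past the next `sync` token and emit an error node."""
    def parse(text: str, pos: int) -> Result:
        r = p(text, pos)
        if r is not None:
            return r
        start = skip_spaces(text, pos)
        if start >= len(text):
            return None  # nothing left to skip: real end of input
        hit = text.find(sync, start)
        end = len(text) if hit == -1 else hit + len(sync)
        return ("error", text[start:end]), end
    return parse

statement = seq(ident, lit("="), ident, lit(";"))
program = many(recover(statement, ";"))

def parse_program(text: str) -> list:
    nodes, _ = program(text, 0)
    return nodes

nodes = parse_program("a = b; c = = ; d = e;")
```

Here nodes comes out as `[['a', '=', 'b', ';'], ('error', 'c = = ;'), ['d', '=', 'e', ';']]`: the malformed middle statement becomes an error node and the statements around it still parse, which is the behavior you want while someone is in the middle of typing.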