bug#74386: Tree-sitter javascript indentation
From: Yuan Fu
Subject: bug#74386: Tree-sitter javascript indentation
Date: Thu, 12 Dec 2024 21:34:29 -0800
> On Dec 12, 2024, at 7:34 PM, Dmitry Gutov <dmitry@gutov.dev> wrote:
>
> On 12/12/2024 07:28, Yuan Fu wrote:
>
>>> What would be our next step in this? Replacing all 'parent-bol' anchors
>>> with 'standalone-parent' across most ts modes?
>> Speaking of next step, I recently added another handy tool for languages
>> with C-like syntax: c-ts-common-baseline-indent-rule. I figured out an
>> indent logic that can work on all C-like languages and covers a wide range
>> of cases. This one rule can give you all these indentations:
>
> Looks pretty great. I guess it depends on the grammars being to an extent
> compatible, right?
Yes, but most C-like languages should be compatible. The rule relies on the
grammar to put brackets like "(", "[", and "{" (and their closing counterparts)
as the first and last child nodes of the construct that contains them, which is
what grammars naturally do. (The only exception I found is the for statement in
C.) Beyond that, the rule takes advantage of how parse trees are usually
structured: when the previous line is a sibling node of the current line, you
usually want to align the two lines; and when you indent, the indent anchor is
usually the "standalone-parent".
>
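The logic described above can be pictured with a toy model. This is a rough
sketch in Python, not the Emacs Lisp behind c-ts-common-baseline-indent-rule:
the Node class, INDENT, OPENERS/CLOSERS, and both helper functions are
hypothetical names invented for illustration, and the model assumes brackets
are the first and last children of their parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

INDENT = 2                 # hypothetical offset for one indent level
OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}


@dataclass(eq=False)       # compare nodes by identity, not by field values
class Node:
    text: str
    col: int = 0               # column where the node starts
    starts_line: bool = False  # True if the node is the first thing on its line
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def standalone_parent(node: Node) -> Optional[Node]:
    """Return the nearest ancestor that starts its own line, if any."""
    p = node.parent
    while p is not None and not p.starts_line:
        p = p.parent
    return p


def baseline_indent(node: Node) -> int:
    """Return the column for NODE, which is assumed to start a line."""
    if node.parent is None:
        return 0
    anchor = standalone_parent(node)
    anchor_col = anchor.col if anchor is not None else 0
    if node.text in CLOSERS:
        return anchor_col                  # closing bracket: back to the anchor
    siblings = node.parent.children
    earlier = [s for s in siblings[: siblings.index(node)]
               if s.text not in OPENERS]
    for sib in reversed(earlier):
        if sib.starts_line:
            return sib.col                 # align with the previous line's sibling
    if earlier:
        return earlier[0].col              # align with the first sibling
    return anchor_col + INDENT             # first element: indent from the anchor


# int main() {
#   int a = 1;
#   int b = 2;
# }
func = Node("function_definition", col=0, starts_line=True)
block = func.add(Node("compound_statement", col=11))
block.add(Node("{", col=11))
a = block.add(Node("int a = 1;", starts_line=True))
a.col = baseline_indent(a)
b = block.add(Node("int b = 2;", starts_line=True))
b.col = baseline_indent(b)
close = block.add(Node("}", starts_line=True))
close.col = baseline_indent(close)
print(a.col, b.col, close.col)  # 2 2 0

#   return [1, 2, 3,
#           4, 5, 6];
ret = Node("return_statement", col=2, starts_line=True)
arr = ret.add(Node("array", col=9))
arr.add(Node("[", col=9))
arr.add(Node("1", col=10))
arr.add(Node("2", col=13))
arr.add(Node("3", col=16))
four = arr.add(Node("4", starts_line=True))
four.col = baseline_indent(four)
print(four.col)  # 10
```

The model only covers the "indent one level" list style; aligning elements to
the opening bracket would need a different anchor in the last branch.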
>> 1. Statements align to their previous sibling:
>>     int main() {
>>       int a = 1;
>>       int b = 2;  <-- Align to prev line’s sibling.
>>     }
>> 2. Indents one level for blocks: function, if, for, struct, etc.
>>     int main() {
>>       return 0;   <-- Indent one level.
>>       {           <-- Align to prev line’s sibling.
>>         return 1; <-- Indent one level.
>>       }
>>     }
>> 3. Elements in parenthesis and brackets:
>>     return [1, 2, 3,
>>             4, 5, 6];  <-- Align to first sibling.
>>
>>     return [
>>       1, 2, 3,         <-- Indent one level (option 1).
>>       4, 5, 6,         <-- Align to prev line’s sibling.
>>     ];
>>
>>     return [
>>             1, 2, 3,   <-- Align to opening bracket (option 2).
>>             4, 5, 6,   <-- Align to prev line’s sibling.
>>            ];          <-- Align to opening bracket.
>>
>>     for (int i = 0;
>>          i < 10;       <-- Align to first sibling.
>>          i++) {        <-- Align to prev line’s sibling.
>>       continue;
>>     }
>> 4. A statement expression indents one level when it’s broken across two
>> lines:
>>     int main() {
>>       int var
>>         = 1287;  <-- Indent one level.
>>       int var =
>>         1287;    <-- Indent one level.
>>     }
>
> Should there be an example with a method call starting on a new line, like in
> the arrow literal example (for JS) that we discussed?
Yes, once we add that.
>
>> Then a C-like language’s major mode only needs to add special cases over the
>> baseline indent rule. And if we add the configurable heuristic for
>> standalone-parent, the baseline indent rule would make use of it.
>
> Sounds good.
>
>> I brought it up because if we’re going to do some renovations to indent
>> rules, we might as well make use of c-ts-common-baseline-indent-rule, and we
>> probably don’t even need to replace parent-bol with standalone-parent,
>> because the baseline indent rule would cover most cases.
>
> I'm not sure how safe that is - my point was that for each of the languages
> it'd be great to have somebody motivated go over the main syntactic cases and
> see that the behavior is still reasonable. But we can also make the switch
> and wait for reports.
I agree, sweeping changes to unfamiliar packages maintained by other people are
obviously a no-go. I’m thinking the maintainers would make the change should
they find the baseline indent rule beneficial. (The same goes for
standalone-parent: I’d much rather the maintainers make that call, even if it’s
a smaller change.) As an immediate next step, we can just apply the
standalone-parent patch and use it in js. Then we make the baseline indent rule
support the standalone-parent customization and let major mode maintainers know
about both. What they want to do is up to them.
>
>> I’ve already used it to rewrite c-ts-mode indent rules and it’s been a
>> success; this baseline + override approach has been very helpful. c-ts-mode
>> still has a lot of indent rules because of things like preproc directives,
>> etc., but it’s much more manageable than before.
>> I don’t know how much it would help modes that have simpler indent rules.
>> go-ts-mode and rust-ts-mode only have a handful of indent rules, so maybe
>> they don’t really need this baseline rule. OTOH Lua and Ruby have more
>> involved indent rules, so maybe they can benefit and reduce the number of
>> rules they need to define.
>
> Ruby has different delimiters (do...end, def...end, etc.), and the curlies
> don't do exactly the same job that they do in C. So I'm not sure how feasible
> it is. A half of the function would be a fit, though.
Right, so I think it’ll still be more helpful than not.
Yuan