From: Benno Schulenberg
Subject: Re: [Nano-devel] [PATCH] syntax: lua: fix word boundaries on standard library functions
Date: Sat, 30 Dec 2017 18:01:53 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0


[If you are subscribed, don't CC me, just answer to the list.]

# File handle methods
color brightyellow "\:\<(close|flush|lines|read|seek|setvbuf|write)\>"

In this regex, I think the leading backslash is unneeded, and the \< is unneeded too. Can you confirm? (As I have no clue about Lua.)

Confirmed (disclaimer: I am not a Lua expert). There is a similar unneeded \> in the "Keywords" rule (below).

Right.  I have grepped the syntaxes for "[^"]\\\<" and "\\\>[^"]"
and have removed the word boundaries that I think are unneeded.
Commit 1f48ed04.

I'd fix the \: but keep the \< for symmetry. What do you think?

No symmetry is needed.  The \< is simply superfluous.
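
The claim can be checked outside nano with a small Python sketch. Python's re module has no \< or \>, so \b stands in for them here; that substitution and the sample lines are assumptions of the sketch, not part of the patch.

```python
import re

METHODS = "close|flush|lines|read|seek|setvbuf|write"

# Original rule: escaped colon plus a start-of-word anchor (\b stands in for \<).
ORIGINAL = re.compile(r"\:\b(" + METHODS + r")\b")
# Simplified rule: plain colon, no leading anchor.
SIMPLIFIED = re.compile(r":(" + METHODS + r")\b")

# Made-up sample lines, chosen to include near misses.
SAMPLES = ["f:close()", "f:read('l')", "f:closed()", "f: close()", "a::write(x)"]

def spans(pattern, text):
    """Return the (start, end) positions the pattern would color."""
    return [m.span() for m in pattern.finditer(text)]

for sample in SAMPLES:
    print(sample, spans(ORIGINAL, sample) == spans(SIMPLIFIED, sample))
```

A colon is not a word character, so any letter that directly follows it already sits at the start of a word. The anchor can never reject anything there.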

a. If I write invalid code such as "io.foo(...)", nano will highlight "io." but not "foo". This correctly tells me that "io" is a known library but it has no function named "foo". It looks odd (I would expect only "io" to be highlighted), but that might be considered a good thing because it draws attention to a potential error.

But then I would use a different color, brightred, to really draw attention.
Either that, or not color such partial names at all.

b. Robustness. The function "table.unpack" is missing from the "Standard library" rule but "table.unpack(...)" still gets highlighted correctly
because "table." matches the above regex and "unpack(" matches another regex

Ugh.  It colors the leading parenthesis too?  Ugly.

from the "Keywords" section. This is not a coincidence: "unpack" used to be a
global function but was moved to the "table" library in Lua 5.2. If other functions are moved this way in future, the above regex would handle them. On the flip side, this robustness works against point (a) because invalid code such as "math.unpack(...)" gets highlighted as if it was valid.

Precisely.  It should either color only valid library functions, or just
color anything that starts with an existing library name.  But not some
strange hybrid.
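
The two consistent options can be sketched in Python. The library and function lists below are abbreviated and hypothetical, not the real contents of the Lua syntax file, and \b stands in for nano's \< and \>.

```python
import re

# Hypothetical, abbreviated lists for illustration only.
LIBRARIES = "io|math|string|table"
IO_FUNCS = "close|lines|open|read|write"

# Option 1: color only known functions of a known library.
STRICT = re.compile(r"\bio\.(" + IO_FUNCS + r")\b")
# Option 2: color anything that starts with a known library name.
LOOSE = re.compile(r"\b(" + LIBRARIES + r")\.[A-Za-z_][A-Za-z0-9_]*")

def colored(pattern, line):
    """Return the text the pattern would color, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None

print(colored(STRICT, "io.read(n)"))
print(colored(STRICT, "io.foo(n)"))
print(colored(LOOSE, "io.foo(n)"))
print(colored(LOOSE, "math.unpack(t)"))
```

With the strict rule, "io.foo" stays uncolored as a whole. With the loose rule, it is colored as a whole. Neither produces the half-colored result.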

# Hex literals
color red "0x[0-9a-fA-F]*"

The * should be +, right?  Because 0x by itself would be wrong?

Correct. The description in the reference manual is not precise, but the Lua interpreter rejects 0x so it's definitely illegal (tested with Lua 5.3). This regex should also use \< and \>.

Okay.
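
A short Python sketch shows the difference between * and + here. It assumes \b as a stand-in for nano's \< and \>, which Python's re module does not support.

```python
import re

# Current rule: zero or more hex digits, so a bare "0x" is accepted.
STAR = re.compile(r"0x[0-9a-fA-F]*")
# Proposed rule: at least one hex digit, anchored at word boundaries.
PLUS = re.compile(r"\b0x[0-9a-fA-F]+\b")

def colored(pattern, line):
    """Return the text the pattern would color, or None."""
    m = pattern.search(line)
    return m.group(0) if m else None

print(colored(STAR, "n = 0x"))
print(colored(PLUS, "n = 0x"))
print(colored(PLUS, "n = 0xFF"))
```

The first call colors the invalid "0x", the second colors nothing, and the third colors "0xFF".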

# Numbers
color red "\<([0-9]+)\>"

# Symbols
color brightmagenta "(\(|\)|\[|\]|\{|\})"

The outer parentheses in both regexes are redundant.

True.  But I would write it as: [][)(}{].
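
The bracket expression works because a ] placed first in the set is taken literally. Python's re module follows the same convention, so the equivalence with the alternation can be sketched there:

```python
import re

# The rule as it stands in the patch: six escaped alternatives.
ALTERNATION = re.compile(r"(\(|\)|\[|\]|\{|\})")
# The compact form: "]" comes first, so it is a literal member of the set.
BRACKETS = re.compile(r"[][)(}{]")

for char in "()[]{}a.":
    same = bool(ALTERNATION.fullmatch(char)) == bool(BRACKETS.fullmatch(char))
    print(repr(char), bool(BRACKETS.fullmatch(char)), same)
```

Both patterns accept exactly the six bracket characters and nothing else.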

# Numbers
color red "\<([0-9]+)\>"

Numbers can have a decimal point and an exponent (e.g. 1.2e3 == 1200). Currently a decimal point with no exponent looks fine (digits are red, point is black). However, exponents look horrible (the "e" and the digits on either side are not highlighted). I would do it like this:

icolor red "\<[0-9]+(\.[0-9]*)?(e[+-]?[0-9]+)?\>"

icolor red "\<\.[0-9]+(e[+-]?[0-9]+)?\>"

These rules are good, but not perfect: "print(1.)" is valid Lua, but the "." would not be highlighted because \> only matches after a word character. The "1" would be highlighted though, which I think is good enough.

If not coloring a trailing dot is good enough, then not coloring a
leading dot is good enough too.  Then you can drop the second regex.

Also, try to avoid using icolor.  It is only used because the e can be E
too?  If so, then simply write [eE] instead of applying case-insensitivity
to the entire regex.
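
A sketch of the single-regex variant, tried in Python. It assumes \b as a stand-in for nano's \< and \>, and the sample lines are made up.

```python
import re

# One case-sensitive rule: digits, optional fraction, optional [eE] exponent.
DECIMAL = re.compile(r"\b[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?\b")

def colored(line):
    """Return the substrings the rule would color."""
    return [m.group(0) for m in DECIMAL.finditer(line)]

print(colored("x = 1.2e3"))   # ['1.2e3']
print(colored("y = 4E-2"))    # ['4E-2']
print(colored("print(1.)"))   # ['1']  (trailing dot left uncolored)
print(colored("z = .5"))      # ['5']  (leading dot left uncolored)
```

The last two lines show the symmetry: a trailing dot and a leading dot are both left alone, while the digits are still colored.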

# Hex literals
color red "0x[0-9a-fA-F]*"

Hex literals got more complicated in Lua 5.2:

Hexadecimal constants also accept an optional fractional part plus an optional binary exponent, marked by a letter 'p' or 'P'.

:|

I would do something analogous to the decimal rules above:

icolor red "\<0x[0-9a-f]+(\.[0-9a-f]*)?(p[+-]?[0-9]+)?\>"

icolor red "\<0x\.[0-9a-f]+(p[+-]?[0-9]+)?\>"

Okay, but the same as for the decimal: just one regex, and no i.
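
A sketch of what that single rule could look like, tried in Python with \b standing in for nano's \< and \>. Writing [xX] to keep the uppercase prefix that icolor allowed is an assumption of this sketch.

```python
import re

# One case-sensitive rule: hex digits, optional fraction, optional [pP] exponent.
HEX = re.compile(r"\b0[xX][0-9a-fA-F]+(\.[0-9a-fA-F]*)?([pP][+-]?[0-9]+)?\b")

def colored(line):
    """Return the substrings the rule would color."""
    return [m.group(0) for m in HEX.finditer(line)]

print(colored("a = 0xff"))      # ['0xff']
print(colored("b = 0xA.8p1"))   # ['0xA.8p1']
print(colored("c = 0x1P-4"))    # ['0x1P-4']
print(colored("d = 0x"))        # []
```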

# Strings
color red "\"(\\.|[^\\\"])*\"|'(\\.|[^\\'])*'"

There are two problems here. Firstly, backslashes inside bracket expressions "[...]" lose their special meaning so they don't need to be doubled.

True.

Secondly, from doc/nanorc.5:

Quotes inside these string parameters don't have to be escaped with backslashes. The last double quote in the string will be treated as its end.

Yes.  But in the past I found that Python triple quotes worked
differently (less well) when unbackslashed than when backslashed.
But I cannot reproduce that now, so maybe that was an effect of
some subtle bugs in the coloring code back then.  So, let's try
to remove redundant backslashes.

So \" should be changed to ". Result:

color red ""(\\.|[^"\])*"|'(\\.|[^'\])*'"

Okay.  You've tested this, right?  Your strings in Lua scripts
still get colored correctly?

I also reordered the backslash and double/single quotes in the bracket expressions to make it clear that they are not escape sequences.

Surprisingly, the original regular expression works correctly despite the errors.

They were not real errors, just superfluous use and repetition of
backslashes.
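
The string rule can be sanity-checked outside nano with a Python sketch. One difference has to be assumed away: in Python a backslash stays special inside a bracket expression, so it is written doubled there, whereas in nano's POSIX regexes a single one is enough.

```python
import re

# Python translation of the rule: an escaped character, or anything
# that is neither the closing quote nor a backslash.
DOUBLE = r'"(\\.|[^"\\])*"'
SINGLE = r"'(\\.|[^'\\])*'"
STRING = re.compile(DOUBLE + "|" + SINGLE)

def strings_in(line):
    """Return the string literals the rule would color."""
    return [m.group(0) for m in STRING.finditer(line)]

print(strings_in(r'x = "a \"b\" c"'))
print(strings_in(r"y = 'it\'s'"))
print(strings_in("print('one', \"two\")"))
```

Escaped quotes stay inside the colored string, and two separate literals on one line are colored separately.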

Benno


