Re: Unify the lexer's idea of words and commands across all modes. (issue 6445056)
From: dak
Subject: Re: Unify the lexer's idea of words and commands across all modes. (issue 6445056)
Date: Tue, 31 Jul 2012 10:52:41 +0000
Reviewers: Trevor Daniels,
Message:
On 2012/07/31 08:47:53, Trevor Daniels wrote:
> Just a query really, to help my understanding.
> Trevor
http://codereview.appspot.com/6445056/diff/1/lily/lexer.ll
File lily/lexer.ll (right):
http://codereview.appspot.com/6445056/diff/1/lily/lexer.ll#newcode390
lily/lexer.ll:390: <chords,notes,figures>{RESTNAME}/[-_] |
> Why is this trailing context added? I don't see
> what this would match that wouldn't be matched
> by the following line.

Flex picks the longest matching pattern. Apparently that includes
trailing contexts. Without this pattern, r-. does not trigger the
{RESTNAME} rule but rather the {WORD}/[-_] rule coming later. And for
the {WORD}/[-_] rule, the trailing context is needed to keep flex from
requiring backup states.
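
The rule selection described above can be imitated outside flex. The following is only a rough sketch in C++ using std::regex, not code from the patch: the Rule struct, winning_rule and the two rule tables are hypothetical names, {A} is narrowed to ASCII letters, and only the rules relevant to the question are listed. It assumes what the flex manual documents, namely that the length compared for a trailing-context rule includes the trailing part, and that ties go to the rule listed first.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One lexer rule: `pattern` becomes the token, `trailing` is the optional
// trailing context (the part after '/' in flex), empty if there is none.
struct Rule {
    std::string name;
    std::string pattern;
    std::string trailing;
};

// ASCII-only stand-in for the patch's WORD definition (assumption).
static const std::string kWordPattern = "[A-Za-z](?:[-_][A-Za-z]|[A-Za-z])*";

// Name of the rule that wins at the start of `input`, or "" if none matches.
// The compared length covers pattern plus trailing context; on a tie the
// earlier rule is kept, which is how flex resolves equal-length matches.
std::string winning_rule(const std::vector<Rule> &rules, const std::string &input) {
    std::string best;
    std::size_t best_len = 0;
    for (const Rule &r : rules) {
        std::string expr = "(?:" + r.pattern + ")";
        if (!r.trailing.empty())
            expr += "(?:" + r.trailing + ")";
        std::regex re(expr);
        std::smatch m;
        if (std::regex_search(input, m, re, std::regex_constants::match_continuous)) {
            std::size_t len = static_cast<std::size_t>(m.length(0));
            if (len > best_len) {
                best_len = len;
                best = r.name;
            }
        }
    }
    return best;
}

// Rule order without the extra {RESTNAME}/[-_] pattern.
std::vector<Rule> rules_without_fix() {
    return {{"RESTNAME", "[rs]", ""},
            {"WORD/[-_]", kWordPattern, "[-_]"},
            {"WORD", kWordPattern, ""}};
}

// Rule order with the extra {RESTNAME}/[-_] pattern in front.
std::vector<Rule> rules_with_fix() {
    return {{"RESTNAME/[-_]", "[rs]", "[-_]"},
            {"RESTNAME", "[rs]", ""},
            {"WORD/[-_]", kWordPattern, "[-_]"},
            {"WORD", kWordPattern, ""}};
}
```

For the input r-. the first table selects WORD/[-_], because "r" plus the trailing "-" is two characters against one for RESTNAME. The second table selects RESTNAME/[-_], which ties at two characters and is listed earlier.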
Description:
Unify the lexer's idea of words and commands across all modes.
A "word" (a string recognized even when not quoted) and a "command"
(something starting with \ and followed by letters and other folderol,
indicating a Scheme control sequence or similar) get the same syntax
in all modes:
A "word" is a sequence of alphabetic characters possibly containing
single dashes or underlines inside (not at the beginning or end).
A "command" is a "word" preceded by a backslash.
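
As an illustration of these two definitions, here is a minimal sketch that checks strings against them with std::regex in C++. It is not part of the patch: is_word and is_command are hypothetical helper names, and the letter class is narrowed to ASCII, whereas the lexer's {A} also accepts the bytes \200-\377.

```cpp
#include <cassert>
#include <regex>
#include <string>

// ASCII-only approximation of {A} (assumption; the lexer also allows
// high bytes so that UTF-8 multibyte characters count as letters).
static const std::string kA = "[A-Za-z]";

// WORD    {A}([-_]{A}|{A})*
static const std::string kWordExpr = kA + "(?:[-_]" + kA + "|" + kA + ")*";
static const std::regex kWord(kWordExpr);

// COMMAND \\{WORD}
static const std::regex kCommand("\\\\" + kWordExpr);

// True if the whole string is a "word" in the sense of the description.
bool is_word(const std::string &s) { return std::regex_match(s, kWord); }

// True if the whole string is a "command": a backslash followed by a word.
bool is_command(const std::string &s) { return std::regex_match(s, kCommand); }
```

Under these assumptions foo-bar and foo_bar are words, while foo--bar, foo-, -foo and foo2 are not: a dash or underline must sit between two letters, and digits are no longer part of a word once {AN} is gone.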
This is a large syntax change. It should not be put on countdown
automatically.
Please review this at http://codereview.appspot.com/6445056/
Affected files:
M lily/lexer.ll
Index: lily/lexer.ll
diff --git a/lily/lexer.ll b/lily/lexer.ll
index 07bb33c6f7187979356d7e6ffac05cca3ba095a5..0824f21eb356e918251c458e692170694b55a227 100644
--- a/lily/lexer.ll
+++ b/lily/lexer.ll
@@ -150,18 +150,14 @@ SCM (* scm_parse_error_handler) (void *);
A [a-zA-Z\200-\377]
AA {A}|_
N [0-9]
-AN {AA}|{N}
ANY_CHAR (.|\n)
PUNCT [][()?!:'`]
SPECIAL_CHAR [&@]
NATIONAL [\001-\006\021-\027\031\036]
TEX {AA}|-|{PUNCT}|{NATIONAL}|{SPECIAL_CHAR}
-DASHED_WORD {A}({AN}|-)*
-DASHED_KEY_WORD \\{DASHED_WORD}
+WORD {A}([-_]{A}|{A})*
+COMMAND \\{WORD}
-
-
-ALPHAWORD {A}+
UNSIGNED {N}+
E_UNSIGNED \\{N}+
FRACTION {N}+\/{N}+
@@ -171,8 +167,6 @@ WHITE [ \n\t\f\r]
HORIZONTALWHITE [ \t]
BLACK [^ \n\t\f\r]
RESTNAME [rs]
-NOTECOMMAND \\{A}+
-MARKUPCOMMAND \\({A}|[-_])+
LYRICS ({AA}|{TEX})[^0-9 \t\n\r\f]*
ESCAPED [nt\\'"]
EXTENDER __
@@ -393,15 +387,18 @@ BOM_UTF8 \357\273\277
error (_ ("end quote missing"));
exit (1);
}
+<chords,notes,figures>{RESTNAME}/[-_] |
<chords,notes,figures>{RESTNAME} {
char const *s = YYText ();
yylval.scm = scm_from_locale_string (s);
return RESTNAME;
}
+<chords,notes,figures>q/[-_] |
<chords,notes,figures>q {
return CHORD_REPETITION;
}
+<chords,notes,figures>R/[-_] |
<chords,notes,figures>R {
return MULTI_MEASURE_REST;
}
@@ -476,11 +473,13 @@ BOM_UTF8 \357\273\277
}
<notes,figures>{
- {ALPHAWORD} {
+ {WORD}/[-_] |
+ {WORD} {
return scan_bare_word (YYText_utf8 ());
}
- {NOTECOMMAND} {
+ {COMMAND}/[-_] |
+ {COMMAND} {
return scan_escaped_word (YYText_utf8 () + 1);
}
{FRACTION} {
@@ -537,7 +536,8 @@ BOM_UTF8 \357\273\277
yylval.scm = scm_c_read_string (YYText ());
return UNSIGNED;
}
- {NOTECOMMAND} {
+ {COMMAND}/[-_] |
+ {COMMAND} {
return scan_escaped_word (YYText_utf8 () + 1);
}
{LYRICS} {
@@ -563,10 +563,12 @@ BOM_UTF8 \357\273\277
}
}
<chords>{
- {ALPHAWORD} {
+ {WORD}/[-_] |
+ {WORD} {
return scan_bare_word (YYText_utf8 ());
}
- {NOTECOMMAND} {
+ {COMMAND}/[-_] |
+ {COMMAND} {
return scan_escaped_word (YYText_utf8 () + 1);
}
{FRACTION} {
@@ -598,7 +600,7 @@ BOM_UTF8 \357\273\277
return CHORD_CARET;
}
. {
- return YYText ()[0]; // ALPHAWORD catches all multibyte.
+ return YYText ()[0]; // WORD catches all multibyte.
}
}
@@ -607,7 +609,8 @@ BOM_UTF8 \357\273\277
\\score {
return SCORE;
}
- {MARKUPCOMMAND} {
+ {COMMAND}/[-_] |
+ {COMMAND} {
string str (YYText_utf8 () + 1);
int token_type = MARKUP_FUNCTION;
@@ -702,10 +705,12 @@ BOM_UTF8 \357\273\277
}
<INITIAL>{
- {DASHED_WORD} {
+ {WORD}/[-_] |
+ {WORD} {
return scan_bare_word (YYText_utf8 ());
}
- {DASHED_KEY_WORD} {
+ {COMMAND}/[-_] |
+ {COMMAND} {
return scan_escaped_word (YYText_utf8 () + 1);
}
}