lilypond-devel


From: David Kastrup
Subject: Re: Issue 2702 in lilypond: Patch: Unify the lexer's idea of words and commands across all modes.
Date: Mon, 30 Jul 2012 00:10:12 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)

Graham Percival <address@hidden> writes:

> On Sun, Jul 29, 2012 at 10:15:10PM +0200, David Kastrup wrote:
>> 
>> Forwarding this from the lilypond-auto list since this review concerns
>> an important syntax change.  GLISS material, in a manner.  Consider it a
>> proposal for word and command syntax across _all_ lexer modes.
>
> What's a "lexer mode"?  I haven't gotten around to taking
> Coursera's compilers course.

The lexer turns character sequences into tokens.  LilyPond's lexer has
different internal modes in which it generates different tokens from the
same character sequences.  Tokens are the principal units in LilyPond's
grammar (the uppercase expressions in the "Grammar" appendix in the
Notation manual).

Usually, a lexer has only one mode or state in which it forms tokens.
LilyPond has the following modes:

INITIAL, extratoken, chords, figures, incl, lyrics, lyric_quote,
longcomment, markup, notes, quote, sourcefileline, sourcefilename,
version

Most of those are in some manner trivial or limited (producing zero or
one token), but
INITIAL, chords, figures, lyrics, markup, notes
are full-blown modes triggered by the parser (rather than the lexer
itself), each producing a string of tokens using its own rules.

Commands and words are recognized in three different ways, depending on
the mode:

A               [a-zA-Z\200-\377]
AA              {A}|_
N               [0-9]
AN              {AA}|{N}
DASHED_WORD             {A}({AN}|-)*
DASHED_KEY_WORD         \\{DASHED_WORD}
ALPHAWORD       {A}+
NOTECOMMAND     \\{A}+
MARKUPCOMMAND   \\({A}|[-_])+

In INITIAL mode, DASHED_WORD and DASHED_KEY_WORD recognize words and
keywords.  INITIAL mode is the only mode in which assignments are
allowed, so DASHED_WORD is the only permissible word syntax for
_defining_ a command via assignment (which has the form "string =
scalar" with string being represented by a quoted string or by a word).
NOTECOMMAND and MARKUPCOMMAND are specialized forms of DASHED_KEY_WORD,
except that \-o can be a MARKUPCOMMAND but not a DASHED_KEY_WORD, which
can't have a dash as the first character after the backslash.  A
DASHED_KEY_WORD can contain digits, but this is mostly useless since you
can't use them except in INITIAL mode.
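
To make the differences between these definitions concrete, here is a
minimal sketch that translates the Flex definitions above into Python
regular expressions and reports which of them match a complete string.
The helper name matching_classes is made up for illustration, and
whole-string matching is a simplification: Flex picks the longest match
at the current input position and only considers the rules active in
the current mode.

```python
import re

# Python translations of the Flex definitions quoted above.  They work on
# bytes because \200-\377 are octal escapes for the bytes 0x80-0xFF.
A = rb"[a-zA-Z\x80-\xff]"
AA = rb"(?:" + A + rb"|_)"
N = rb"[0-9]"
AN = rb"(?:" + AA + rb"|" + N + rb")"

PATTERNS = {
    "DASHED_WORD": A + rb"(?:" + AN + rb"|-)*",
    "DASHED_KEY_WORD": rb"\\" + A + rb"(?:" + AN + rb"|-)*",
    "ALPHAWORD": A + rb"+",
    "NOTECOMMAND": rb"\\" + A + rb"+",
    "MARKUPCOMMAND": rb"\\(?:" + A + rb"|[-_])+",
}
COMPILED = {name: re.compile(pat) for name, pat in PATTERNS.items()}


def matching_classes(text):
    """Return the sorted names of all patterns matching the whole of text.

    Hypothetical helper for illustration; not part of LilyPond.
    """
    data = text.encode("utf-8")
    return sorted(name for name, rx in COMPILED.items() if rx.fullmatch(data))


if __name__ == "__main__":
    for sample in ["zip3", "c-flat", r"\music", r"\command-name", r"\zip3", r"\-o"]:
        print(sample, matching_classes(sample))
```

In this sketch, \music matches all three command patterns, \command-name
matches DASHED_KEY_WORD and MARKUPCOMMAND but not NOTECOMMAND, \zip3
matches only DASHED_KEY_WORD, and \-o matches only MARKUPCOMMAND.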

>> One consequence is that if you can use \commandname in some context, it can  
>> be defined using
>> commandname = ...
>> without requiring quote marks since word syntax and command name syntax are  
>> in direct correspondence.
>
> I don't see this as a huge benefit.  What's wrong with
>   commandname = { ... }
> ?

The difference is to the left of the equals sign.  You can write
command-name = ...
instead of
"command-name" = ...

Embarrassingly, you could actually always do this already, but there are
snippets that don't know this.  Probably my fault for starting this trend.
More importantly, with the patch you can reference \command-name in any
mode, including notes mode, which is the default in #{ ... #}.

>> Being able to access every definition equally  
>> well in every lexer mode is also an advantage.  The word definition is  
>> palindromic: iff a character sequence is a word, so is its reverse.
>
> If this means what I think it means, then huh?  so I can do this
> now?
>   music = { c'4 d e f }
>   { \cisum }
> and have it compile?

No.  If \music is a command to the lexer, then \cisum is a command to
the lexer.  That does not mean that they have the same definition.  Just
the same kind of lexical unit.

>> If - or _ is both preceded as well as followed by an alphabetic
>> character, it integrates into a word.
>
> That sounds nice, although my time with programming languages
> kind-of discourages the notion of
>   violin-one
> instead of
>   violin_one
> .  However, if it's totally safe to write "violin-one" then that
> would certainly be nice for normal musicians!  It also saves one
> keypress.

It is not "totally safe".  Previously { c-flat } was a well-formed
LilyPond program.  Afterwards, it would not be.
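
The proposed rule quoted above (a - or _ integrates into a word only
when it has an alphabetic character on both sides) can be sketched as a
single regular expression.  This is my reading of the quoted
description, not the actual Flex definition from the patch; the names
WORD, is_word and longest_word_prefix are hypothetical.

```python
import re

# Alphabetic character as in the existing lexer definition "A"; on Python
# str this approximates the high bytes 0x80-0xFF by code points U+0080-U+00FF.
A = r"[a-zA-Z\x80-\xff]"

# Assumed shape of the unified word: runs of alphabetic characters joined
# by single '-' or '_' characters.  Digits are not part of a word.
WORD = re.compile(A + r"+(?:[-_]" + A + r"+)*")


def is_word(s):
    """True if the whole string is a word under the assumed rule."""
    return WORD.fullmatch(s) is not None


def longest_word_prefix(s):
    """Return the longest word at the start of s, or '' if there is none."""
    m = WORD.match(s)
    return m.group(0) if m else ""


if __name__ == "__main__":
    for sample in ["violin-one", "violin_one", "c-flat", "zip3", "-o", "a--b"]:
        # A word and its reverse are accepted or rejected together.
        print(sample, is_word(sample), is_word(sample[::-1]))
    print(longest_word_prefix("c-flat }"))
```

Under this assumed rule, c-flat is a single word rather than the word c
followed by a dash and a second word, and zip3 is not a word at all.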

>> It turns out that this definition works with
>> both "make test" as well as "make doc" without requiring any change in the
>> LilyPond code base.
>
> That's certainly promising.
>
>> Discuss.  This is quite a consequential change regarding what word syntax  
>> is valid and what not, but the previous state was rather arbitrary to the  
>> degree of being wonky.
>
> Could I have some examples?  I just don't get this "word"
> business.  Is there any syntax which was previously
> (theoretically) supported, which this patch breaks?

zip3 = 5\cm

You could assign to variables with numbers in their name.  This feature
has seen little use since you can't reference them in music short of
using $.

{ c-flat }

A "word" is allowed as a string in music, and a string can be used as a
markup in most circumstances (when it does not happen to match a
notename).  I am not aware of anybody having used this, but if anybody
does a lot of experimentation and/or Flex code reading, it is
conceivable that code using this exists.  I mean, at one point in time
the definition of a "word" included TeX sequences like \^ and \' and \`
and \".

-- 
David Kastrup


