help-bison

Re: Inserting extra tokens


From: Hans Aberg
Subject: Re: Inserting extra tokens
Date: Fri, 18 Aug 2006 09:24:38 +0200

On 17 Aug 2006, at 08:51, Erik Sandberg wrote:

> 2. When a function argument has been shifted, the parser artificially inserts
> a special token as the next token. The token is a COMMA or SEMICOLON
> depending on whether the shifted argument was the last argument.
> ...
> A problem in (2) is _how_ to insert the extra token correctly. I first tried to put flex in a state where it sends the desired token without reading anything. This was however not sufficient: In some cases, the parser has already read the next token for the lookahead when an argument is shifted.

There is, in general, no good way to insert a token, because the LALR(1) algorithm that Bison uses to generate the parser may or may not need a lookahead token in a given parsing position (i.e., a set of rules, each with a dot in it, as in the states listed in the .output file that Bison can write). Thus, one does not know which tokens have already been read when the parser is in a particular state and parsing position.

> I am working with the parser of GNU LilyPond, and I want to improve the way function invocations are parsed in the language. The language uses a syntax for functions which is somewhat similar to LaTeX's syntax for macros. E.g., if \foo is declared as a binary function, then \foo a b calls the function with parameters a and b.
>
> I'm looking for a way to express this syntax in bison, in a generic way
> (currently only a limited number of function arities are possible).

If one is only implementing Prolog or Haskell style operator precedences, there are two methods in use:

If the number of precedence levels is small, as in Haskell, which only has about ten, one can list all the parsing possibilities in the .y file, and then type the tokens accordingly. If the number is large, as in Prolog, then one gives all the operator tokens a single type, like "operator", and lets the .y rule action push the operands and operators of the expression onto a stack. Then, after the expression has been parsed, one lets a C/C++ function sort out the expression using the operator precedences.
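To make the second method a bit more concrete, here is a minimal sketch in C of the "sort it out afterwards" step. It is not taken from any existing parser: the names resolve_flat_expression, precedence and apply, the fixed MAX_ITEMS limit, and the small hard-coded precedence table are all assumptions made for illustration, and every operator is treated as left-associative. It assumes the rule action has already collected the operands and the operator tokens in the order they were read.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 64  /* arbitrary limit, only for this sketch */

/* Hypothetical precedence table. In a Prolog-like language it would be
   filled at run time from the operator declarations. */
static int precedence(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

/* Apply one binary operator; anything unknown is treated as '/'. */
static long apply(char op, long lhs, long rhs)
{
    switch (op) {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    default:  return lhs / rhs;
    }
}

/* The flat sequence collected during parsing is
   operands[0] ops[0] operands[1] ... ops[n_ops-1] operands[n_ops].
   Resolve it with two stacks, treating all operators as left-associative. */
long resolve_flat_expression(const long *operands, const char *ops, size_t n_ops)
{
    long vals[MAX_ITEMS];
    char pending[MAX_ITEMS];
    size_t nv = 0, np = 0;

    assert(n_ops < MAX_ITEMS);
    vals[nv++] = operands[0];
    for (size_t i = 0; i < n_ops; i++) {
        /* Reduce while the stacked operator binds at least as tightly. */
        while (np > 0 && precedence(pending[np - 1]) >= precedence(ops[i])) {
            long rhs = vals[--nv];
            long lhs = vals[--nv];
            vals[nv++] = apply(pending[--np], lhs, rhs);
        }
        pending[np++] = ops[i];
        vals[nv++] = operands[i + 1];
    }
    /* Reduce whatever is left once the input is exhausted. */
    while (np > 0) {
        long rhs = vals[--nv];
        long lhs = vals[--nv];
        vals[nv++] = apply(pending[--np], lhs, rhs);
    }
    return vals[0];
}
```

With the operands 1, 2, 3, 4 and the operators "+*-", the function groups the input as (1 + (2 * 3)) - 4. A real implementation would more likely build a syntax tree than compute a value, but the stack handling would be the same.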

> 3. The argument list uses a grammar similar to
> arglist: argument COMMA arglist | argument SEMICOLON ;
> (an argument can be a complex expression)

Now, I do not see exactly how this precedence problem relates to yours, as you have a different syntax. But if you only admit a limited number of arities, you could list them all in the .y grammar. Otherwise, you will have to use the other method indicated above: let the rule action build a dynamic argument list object, and then use the arity of the function token to work out the grouping after the rule has been parsed.
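As a rough illustration of that last step, the following C sketch walks a flat list after parsing and uses the arities to find where each call ends. The struct item layout and the function expression_end are hypothetical names invented for this example, not anything from LilyPond or Bison, and each argument is simplified to a single list entry, whereas in the real grammar an argument can be a complex expression.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the flat list built by the grammar action (hypothetical).
   arity == 0 marks a plain argument, arity > 0 marks a function name. */
struct item {
    const char *name;
    int arity;
};

/* Return the index just past the expression that starts at items[start],
   or 0 if the list ends before all arguments have been found.
   A successful result is always at least start + 1, so 0 is unambiguous. */
size_t expression_end(const struct item *items, size_t n, size_t start)
{
    assert(items != NULL);
    if (start >= n)
        return 0;                 /* ran out of input: missing argument */
    size_t next = start + 1;
    for (int k = 0; k < items[start].arity; k++) {
        /* Each argument may itself be a function call. */
        next = expression_end(items, n, next);
        if (next == 0)
            return 0;
    }
    return next;
}
```

For the list \foo \bar a b c, with \foo binary and \bar unary, a call starting at index 0 ends at index 4, which corresponds to \foo applied to (\bar a) and b, with c left over for whatever follows. The same walk could build a call tree instead of just returning an index.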

  Hans Aberg





