help-bison

improving yysyntax_error()


From: Christian Schoenebeck
Subject: improving yysyntax_error()
Date: Wed, 20 Jun 2007 23:13:22 +0200
User-agent: KMail/1.9.5

Hi!

I would like to improve the quality of the error messages produced by 
yysyntax_error(). I know the theory behind LALR(1) parsers, but I'm not yet 
very familiar with the bison skeleton parser implementation, so I hope 
you can help me a bit.

First the reason: I'm strictly opposed to splitting the lexer and parser 
tasks into two distinct worlds (for many reasons). So my parsers usually 
work like this: yylex() just returns the ASCII code of the next character 
from the input stream, and thus the bison grammars include the typical, 
trivial lexer-side rules, e.g.:

SET : 'S''E''T' ;

SUBSCRIBE : 'S''U''B''S''C''R''I''B''E' ;
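
For illustration, such a character-level yylex() could look roughly like the 
following minimal sketch. The input_cursor variable and the set_input() 
helper are assumptions made for this example only; a real parser would read 
from its actual input stream instead.

```c
/* Hypothetical input buffer for this sketch; not part of bison. */
static const char *input_cursor = "";

/* Illustrative helper: point the scanner at a NUL-terminated string. */
static void set_input(const char *text)
{
    input_cursor = text;
}

/* Character-level scanner: each call returns the code of the next input
   character as the token, or 0 at end of input, which bison treats as
   the end-of-input token. */
int yylex(void)
{
    if (*input_cursor == '\0')
        return 0;
    /* Cast so that bytes above 127 never become negative token codes. */
    return (unsigned char)*input_cursor++;
}
```

With set_input("SET"), three successive calls return 'S', 'E' and 'T', and 
every later call returns 0.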

Now the problem is that if there's a syntax error within these trivial rules, 
yysyntax_error() will only report the next expected character. For example, 
the input "SUBfoo" would result in the error message: "syntax error, 
unexpected 'f', expecting 'S'". Obviously, reporting the whole rule's symbol 
name would make more sense here, that is: "syntax error, unexpected 'SUBfoo', 
expecting 'SUBSCRIBE'".

So I thought about adding a new keyword to the bison declaration section, 
e.g. "%atomic", like:

%atomic SET SUBSCRIBE

This would tell bison that the rules of the listed non-terminal symbols are 
so trivial that their internals don't matter in, e.g., error messages, and 
accordingly yysyntax_error() would report the expected non-terminal symbol 
name(s) instead of the expected next character.

What do you think about that suggestion in general?

To implement this, is there an easy way (e.g. by table lookup) in the bison 
skeleton parser to retrieve the rule numbers of the expected upcoming 
reduction(s) for a given parser state? For example, "bison -v" would show me:

state 13

  457 SET: 'S' . 'E' 'T'
  458 SUBSCRIBE: 'S' . 'U' 'B' 'S' 'C' 'R' 'I' 'B' 'E'

    'E'  shift, and go to state 52
    'U'  shift, and go to state 53

Thus after the input "S", the parser would expect an upcoming reduction of 
either rule 457 or rule 458. Once I have the rule numbers, it's easy to 
resolve the human-readable symbol names in the skeleton, but so far I'm 
unsure how to get the rule numbers of the upcoming reductions. Any hints?
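
To make the idea more concrete, here is a rough sketch of the lookup I have 
in mind. Everything in it is an assumption: bison does not emit anything like 
atomic_table today, and the names atomic_entry, find_atomic() and 
format_atomic_error() are made up for this example. The entry for state 13 
follows the "bison -v" output above; the contents of states 52 and 53 are my 
guess at what those states hold.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table (NOT generated by bison): for each parser state, the
   names of the %atomic non-terminals that have a rule in progress there. */
struct atomic_entry {
    int state;
    const char *names[4];   /* NULL-terminated list */
};

static const struct atomic_entry atomic_table[] = {
    { 13, { "SET", "SUBSCRIBE", NULL } },  /* after 'S' */
    { 52, { "SET", NULL } },               /* after 'S' 'E' (assumed) */
    { 53, { "SUBSCRIBE", NULL } },         /* after 'S' 'U' (assumed) */
};

/* Linear search for the entry of a state; NULL if the state has none. */
static const struct atomic_entry *find_atomic(int state)
{
    size_t i;
    for (i = 0; i < sizeof atomic_table / sizeof atomic_table[0]; i++)
        if (atomic_table[i].state == state)
            return &atomic_table[i];
    return NULL;
}

/* Write a message naming the expected atomic symbols into buf.
   Returns 1 on success, or 0 if the state has no entry, in which case the
   caller would fall back to the usual per-character message. */
int format_atomic_error(int state, const char *unexpected,
                        char *buf, size_t size)
{
    const struct atomic_entry *e = find_atomic(state);
    size_t used;
    int i;

    if (e == NULL || size == 0)
        return 0;
    snprintf(buf, size, "syntax error, unexpected '%s', expecting ",
             unexpected);
    for (i = 0; e->names[i] != NULL; i++) {
        used = strlen(buf);
        snprintf(buf + used, size - used, "%s'%s'",
                 i ? " or " : "", e->names[i]);
    }
    return 1;
}
```

The open question remains how the skeleton could derive such a table from the 
data bison already generates, rather than having it written by hand as here.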

CU
Christian



