Re: improving yysyntax

help-bison


Re: improving yysyntax_error()

From:	Hans Aberg
Subject:	Re: improving yysyntax_error()
Date:	Thu, 21 Jun 2007 16:08:16 +0200

On 21 Jun 2007, at 15:43, Christian Schoenebeck wrote:

On Thursday, 21 June 2007, at 13:12, Hans Aberg wrote:
On the one hand, you try to use Bison for something it wasn't
designed for, so unless you can come up with good motivations, getting
the change is unlikely to happen.

Depends on which motivation you mean. A motivation for doing the lexer task on the bison side? Or a motivation just for the suggested new declaration keyword?


Whatever changes you want to come true.

Remember that the development is
done by volunteers that do what they want.


I know, I work on open source projects as well. Nevertheless I want to
implement this.

Then it is probably more appropriate for the Bug-Bison and Bison-Patches lists.

In case it'll be accepted for a future release of bison,
fine; if not, I'll keep it for my own use and for all others who might want to use it as well, because I'm quite sure I'm not the only one using bison without an external lexer. And in these cases you need something like the
suggested new feature to get useful error messages.

On the other hand, one way to extend Flex & Bison to Unicode is to
use UTF-8, and let Flex return a sequence of characters. Then this
can be used in Bison, too, if 'c_1...c_k' expands to 'c_1'...'c_k'.
Then the problem is to generate the full UTF-8-character in error
messages, not just the leading byte. I do not know how to implement
this, though.
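
One possible starting point for the open problem above, recovering the whole UTF-8 character rather than only its leading byte, is to derive the sequence length from the leading byte and copy that many bytes into the message. The following is only a sketch under that assumption; the function names utf8_seq_len and utf8_full_char are hypothetical and are not part of Bison or Flex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes in the UTF-8 sequence introduced by lead byte b,
   or 0 if b cannot start a sequence (continuation or invalid byte). */
size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;                 /* 0xxxxxxx: ASCII */
    if (b >= 0xC2 && b <= 0xDF) return 2;   /* 110xxxxx */
    if (b >= 0xE0 && b <= 0xEF) return 3;   /* 1110xxxx */
    if (b >= 0xF0 && b <= 0xF4) return 4;   /* 11110xxx */
    return 0;
}

/* Copy the complete character starting at in[0] into out, NUL-terminated.
   avail is the number of readable bytes at in; out must hold 5 bytes.
   Returns the number of bytes copied, or 0 if the sequence is truncated
   or a continuation byte is malformed. */
size_t utf8_full_char(const unsigned char *in, size_t avail, char out[5])
{
    assert(in != NULL && out != NULL);
    out[0] = '\0';
    if (avail == 0) return 0;

    size_t n = utf8_seq_len(in[0]);
    if (n == 0 || n > avail) return 0;

    /* Every byte after the lead byte must look like 10xxxxxx. */
    for (size_t i = 1; i < n; i++)
        if ((in[i] & 0xC0) != 0x80) return 0;

    memcpy(out, in, n);
    out[n] = '\0';
    return n;
}
```

An error reporter could call utf8_full_char on the input at the point of failure and print the resulting string instead of the single offending byte. This does not check for overlong or surrogate encodings.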
Right, this would be the exact same application, that is, moving the lexer task to the bison side and having yylex() only return the next byte from the input stream,
that is, simply byte by byte.

No, in this model, the lexer matches patterns as usual; only when the match has been made does it return the multibyte character byte by byte.
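
The model described here can be sketched roughly as follows: the lexer matches a whole multibyte character, stores its bytes, and then hands them to the parser one call at a time. The struct byte_feeder and its two functions are hypothetical names chosen for illustration, not Flex or Bison API. The sketch assumes the usual Bison convention that a character-literal token such as '\316' has a token code equal to the character converted to unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state: bytes of an already-matched character that
   still have to be handed to the parser. */
struct byte_feeder {
    unsigned char pending[4];  /* a UTF-8 character is at most 4 bytes */
    size_t count;              /* number of valid bytes in pending */
    size_t next;               /* index of the next byte to return */
};

/* Called from the lexer action once a multibyte character has matched. */
void feeder_load(struct byte_feeder *f, const unsigned char *text, size_t len)
{
    assert(f != NULL && text != NULL);
    assert(len >= 1 && len <= sizeof f->pending);
    for (size_t i = 0; i < len; i++)
        f->pending[i] = text[i];
    f->count = len;
    f->next = 0;
}

/* Returns the next pending byte as a token code, or -1 when the
   character has been fully delivered and the lexer should match again. */
int feeder_next(struct byte_feeder *f)
{
    assert(f != NULL);
    if (f->next >= f->count)
        return -1;
    return f->pending[f->next++];
}
```

A yylex() built this way would first call feeder_next and only run the pattern matcher again when it returns -1.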

For supporting UTF-8 characters, you could
define "atomic" grammar rules, like:

UTF8_CAPITAL_PI : '\316' '\240' ;

UTF8_CAPITAL_OMEGA : '\316' '\251' ;

and use the suggested new declaration keyword "%atomic" like:

%atomic UTF8_CAPITAL_PI UTF8_CAPITAL_OMEGA

to tell bison that the right-hand side of those UTF-8 character rules (that is, their byte sequence) is too trivial / uninteresting to be shown in errors, and
yysyntax_error() would, e.g., return:

"syntax error, unexpected 'UTF8_CAPITAL_OMEGA', expecting 'UTF8_CAPITAL_PI'"

instead of:

"syntax error, unexpected '\251', expecting '\240'"

The latter would be completely useless and confusing for regular users.
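
To illustrate the intended effect on the message text only, here is a standalone sketch that maps a byte sequence back to the name of its "atomic" rule before formatting the error. It is not how Bison builds its messages, and %atomic is a proposed keyword rather than an existing Bison feature. The table atomic_rules and the functions atomic_name and format_error are hypothetical. The table uses the UTF-8 encodings of U+03A0 (CE A0) and U+03A9 (CE A9).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table of rules that would be declared with %atomic. */
struct atomic_rule {
    const char *name;   /* nonterminal name to show in messages */
    const char *bytes;  /* its right-hand side as a byte string */
};

static const struct atomic_rule atomic_rules[] = {
    { "UTF8_CAPITAL_PI",    "\316\240" },  /* U+03A0 */
    { "UTF8_CAPITAL_OMEGA", "\316\251" },  /* U+03A9 */
};

/* Name of the atomic rule whose byte sequence is a prefix of input,
   or NULL if no rule matches. */
const char *atomic_name(const char *input)
{
    assert(input != NULL);
    for (size_t i = 0; i < sizeof atomic_rules / sizeof atomic_rules[0]; i++) {
        size_t n = strlen(atomic_rules[i].bytes);
        if (strncmp(input, atomic_rules[i].bytes, n) == 0)
            return atomic_rules[i].name;
    }
    return NULL;
}

/* Format a message using rule names where available; "?" marks a
   byte sequence that has no atomic rule. Returns snprintf's result. */
int format_error(char *buf, size_t size, const char *seen, const char *wanted)
{
    const char *s = atomic_name(seen);
    const char *w = atomic_name(wanted);
    return snprintf(buf, size,
                    "syntax error, unexpected '%s', expecting '%s'",
                    s ? s : "?", w ? w : "?");
}
```

A real implementation would have to work on parser states and symbol numbers inside the generated parser rather than on raw byte strings, so this only shows the name substitution itself.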

The stuff above isn't needed, except for the generation of error messages. So how do you intend to implement your %atomic construct?


  Hans Aberg
