help-bison

Re: Proposals for various changes to the Java parser


From: Di-an JAN
Subject: Re: Proposals for various changes to the Java parser
Date: Wed, 29 Oct 2008 10:31:03 -0700 (PDT)

On Tue, 28 Oct 2008, Paolo Bonzini wrote:

2. The parser class name currently defaults to ``YYParser'' (actually,
``b4_prefixParser'', but I already submitted a patch for that bug).
Do people prefer to make it match the Java file name instead?  Of course,
characters not allowed in Java names must be removed or replaced by ``_''.

Yeah, that's nice since the filename is known at m4 time.

I just noticed that JFlex actually does it the other way: setting the class
name changes the file name.  But I think we should stick with Bison's output
file rules.
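A minimal sketch of the name mapping discussed in item 2, assuming the class name is taken from the output file's base name and every character that is not legal in a Java identifier becomes ``_''. The class ClassNames and the method fromFileName are hypothetical names for illustration, not part of Bison, and the real mapping would be done in m4 rather than Java.

```java
// Hypothetical helper, not part of Bison: derive a parser class name
// from an output file name such as "src/my-calc.java".
final class ClassNames {
    private ClassNames() {}

    static String fromFileName(String fileName) {
        // Drop any directory part, then the extension.
        String base = fileName.substring(fileName.lastIndexOf('/') + 1);
        int dot = base.lastIndexOf('.');
        if (dot > 0) {
            base = base.substring(0, dot);
        }
        // Replace every character that cannot appear in a Java identifier.
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < base.length(); i++) {
            char c = base.charAt(i);
            name.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        // An identifier cannot be empty or start with a digit.
        if (name.length() == 0 || !Character.isJavaIdentifierStart(name.charAt(0))) {
            name.insert(0, '_');
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromFileName("src/my-calc.java"));
    }
}
```

The sketch does not check for Java reserved words (a file named ``class.java'' would still need special handling).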

6. Currently ``%union'' is silently ignored

I thought it gave an error. :-)

, and Java types are used as
the TYPE in ``$<TYPE>'', ``%token<TYPE> ...'' and ``%type<TYPE> ...''.
I propose to interpret these ``<TYPE>'' as a field name in ``%union'',
interpreting it as a Java type if no such field name exists.
First, this matches the behavior of C/C++ parsers, even though Java doesn't
actually have union types.  Also makes it easier to convert from C/C++.

It is a bit of a waste of memory...

I'm only talking about the interface, not the implementation.  I don't want
a class that includes all union members as fields.  Rather, the m4 code would
look up the ``field'' name in %union and generate the cast like the existing
code.
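The lookup proposed in item 6 could be sketched as follows, assuming the %union fields are collected into a map from field name to Java type, and a tag that matches no field is used as a Java type directly. UnionTags, declare, resolve, and castFor are hypothetical names; Bison would do this in m4 at generation time, so this only models the intended behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of resolving <TYPE> tags against %union fields.
final class UnionTags {
    // Field name -> Java type, as might be collected from a %union block.
    private final Map<String, String> fields = new LinkedHashMap<>();

    void declare(String fieldName, String javaType) {
        fields.put(fieldName, javaType);
    }

    // A tag naming a %union field maps to that field's type;
    // any other tag is taken to be a Java type already.
    String resolve(String tag) {
        return fields.getOrDefault(tag, tag);
    }

    // Build a cast with the same shape as the one b4_rhs_value emits.
    String castFor(String tag, String valueExpr) {
        return "((" + resolve(tag) + ")(" + valueExpr + "))";
    }

    public static void main(String[] args) {
        UnionTags tags = new UnionTags();
        tags.declare("ival", "Integer");
        System.out.println(tags.castFor("ival", "yystack.valueAt (0)"));
        System.out.println(tags.castFor("String", "yystack.valueAt (1)"));
    }
}
```

No class holding all union members is generated in this model; the field name only selects which cast is written out.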

a ``generic array creation'' error, and at least one place needs to
m4-quote commas in ``<TYPE>'' (fails with Map/*String,String*/).

...  But I did not understand the comment about m4-quoting commas.

It's a bug here:

# b4_rhs_value(RULE-LENGTH, NUM, [TYPE])
# --------------------------------------
# Expansion of $<TYPE>NUM, where the current rule has RULE-LENGTH
# symbols on RHS.
#
# In this simple implementation, %token and %type have class names
# between the angle brackets.
m4_define([b4_rhs_value], [(m4_ifval($3, [($3)])[](yystack.valueAt ($1-($2))))])

Then $<Map/*String,String*/>1 would mess up the arguments of m4_ifval:
the unquoted comma splits the type in two, so the test would be
``Map/*String'' and the if-true part would be ``String*/''.
The fix is just to put an m4_quote around the test.
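To make the comma problem concrete, here is a deliberately simplified Java model of how m4 separates macro arguments: it splits at commas that are outside ``[...]'' quotes and outside parentheses, and strips leading whitespace from each argument. M4ArgSplit and split are hypothetical names, and real m4 does considerably more (macro expansion, comments, quote stripping), so treat this as an illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of m4 argument collection, not a real m4 implementation.
final class M4ArgSplit {
    private M4ArgSplit() {}

    static List<String> split(String argText) {
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int quoteDepth = 0; // nesting of [ ... ] quotes
        int parenDepth = 0; // nesting of unquoted ( ... )
        for (char c : argText.toCharArray()) {
            if (c == '[') {
                quoteDepth++;
            } else if (c == ']') {
                quoteDepth--;
            } else if (quoteDepth == 0 && c == '(') {
                parenDepth++;
            } else if (quoteDepth == 0 && c == ')') {
                parenDepth--;
            }
            if (c == ',' && quoteDepth == 0 && parenDepth == 0) {
                // An unprotected comma ends the current argument.
                args.add(cur.toString().replaceFirst("^\\s+", ""));
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        args.add(cur.toString().replaceFirst("^\\s+", ""));
        return args;
    }

    public static void main(String[] args) {
        // Unquoted type: the comma inside the comment splits the test.
        System.out.println(split("Map/*String,String*/, [(cast)]"));
        // Quoted type: the comma is protected and the test stays whole.
        System.out.println(split("[Map/*String,String*/], [(cast)]"));
    }
}
```

In this model the unquoted form yields three arguments and the quoted form yields two, which is the difference that wrapping the test in m4_quote is meant to produce.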

10. If ``%verbose-error'' is not used, do not generate code for it.

Not top priority, but I would not oppose this.  I thought javac could in
principle elide it, but maybe it does not because native methods can set
final and private fields.

errorVerbose is not declared final, so Java compilers had better not take out
the corresponding code.  In C parsers, YYERROR_VERBOSE is a #define.
In Java parsers, %verbose-error only provides the default state.
Should we make errorVerbose final to match C?  Or allow users to change it,
since that provides more power at little cost?  As for whether to include
the code without %verbose-error: for people looking at the code Bison
generates (like me), I don't mind having it in case I forgot %verbose-error,
but for people looking at the code generated by their parser specs, it's
probably better not to have code that's never used.
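The two options in item 10 can be contrasted in a small Java sketch. The class VerboseSwitch, its methods, and the message strings are made up for illustration and are not the code Bison generates; only the field name errorVerbose comes from the discussion above.

```java
// Hypothetical sketch of the two designs for the verbose-error switch.
final class VerboseSwitch {
    // Option A: a compile-time constant, analogous to a C #define.
    // A compiler may drop code guarded by a constant false condition.
    static final boolean VERBOSE_CONSTANT = false;

    // Option B: a plain field. Its initializer is only the default,
    // and callers can change it at run time.
    boolean errorVerbose = false;

    String messageWithConstant(String unexpected) {
        if (VERBOSE_CONSTANT) {
            return "syntax error, unexpected " + unexpected;
        }
        return "syntax error";
    }

    String messageWithField(String unexpected) {
        if (errorVerbose) {
            return "syntax error, unexpected " + unexpected;
        }
        return "syntax error";
    }

    public static void main(String[] args) {
        VerboseSwitch p = new VerboseSwitch();
        System.out.println(p.messageWithField("NUM"));
        p.errorVerbose = true; // only possible with option B
        System.out.println(p.messageWithField("NUM"));
    }
}
```

With option B the verbose branch has to stay in the generated class, because its condition can become true after construction.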


20. Put EOF and other token type names in the Lexer interface instead.
These names are supposed to be returned from the Lexer, so why require extra
qualification (when not using ``%code lexer {...}'')?  Also, there is no use
for these names in the parser except to pass to yy_translate_ to get the
internal symbol number and get some info on the grammar.  And YYBACKUP and
yychar are not currently supported by Java.  In any case, use within the
lexer would be much more common than in the parser.
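A sketch of what item 20 would look like from the lexer writer's side, assuming the token names become constants of the Lexer interface. The interface below is a cut-down stand-in for the generated one, the token NUM and its number 258 are made up for the example, and DigitLexer is a hypothetical hand-written lexer.

```java
// Cut-down stand-in for the generated Lexer interface (assumption).
interface Lexer {
    int EOF = 0;   // end-of-input token
    int NUM = 258; // hypothetical user token
    int yylex();
}

// Hypothetical hand-written lexer: constants declared in the interface
// are inherited, so EOF and NUM need no qualification here.
final class DigitLexer implements Lexer {
    private final String input;
    private int pos = 0;

    DigitLexer(String input) {
        this.input = input;
    }

    @Override
    public int yylex() {
        // Skip blanks between tokens.
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        if (pos >= input.length()) {
            return EOF;
        }
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            // Consume a whole run of digits as one NUM token.
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
                pos++;
            }
            return NUM;
        }
        pos++;
        return c; // single-character tokens stand for themselves
    }

    public static void main(String[] args) {
        Lexer lexer = new DigitLexer("12 + 3");
        for (int tok = lexer.yylex(); tok != Lexer.EOF; tok = lexer.yylex()) {
            System.out.println(tok);
        }
    }
}
```

Code outside the lexer, such as the main method here, would then write Lexer.EOF rather than qualifying the name with the parser class.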


21. Add ``%code init {...}'' for code added to the start of the generated
parser constructor.  This is particularly important with ``%define extends''
to allow the superclass to be initialized.  Also add ``%define init_throws''.
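Why the init code of item 21 has to go at the very start of the constructor can be seen in a sketch of the kind of class the generator might emit. BaseParser, GeneratedParser, and their fields are hypothetical, and the constructor layout is an assumption about the proposal, not existing Bison output.

```java
import java.io.IOException;

// Hypothetical user superclass, as would be named by %define extends.
class BaseParser {
    final String grammarName;

    BaseParser(String grammarName) throws IOException {
        if (grammarName.isEmpty()) {
            throw new IOException("missing grammar name");
        }
        this.grammarName = grammarName;
    }
}

// Hypothetical shape of the generated parser class.
class GeneratedParser extends BaseParser {
    int stackCapacity;

    // The throws clause is what %define init_throws would contribute.
    GeneratedParser(String grammarName) throws IOException {
        // User init code goes first: Java requires the super(...) call
        // to be the first statement of a constructor.
        super(grammarName);
        // The generator's own constructor code would follow.
        this.stackCapacity = 16;
    }

    public static void main(String[] args) throws IOException {
        GeneratedParser parser = new GeneratedParser("calc");
        System.out.println(parser.grammarName);
    }
}
```

Because BaseParser has no no-argument constructor, GeneratedParser would not compile without the explicit super call, and the checked exception would not compile without the throws clause.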

Di-an Jan
