Re: Java push parser
From: Paolo Bonzini
Subject: Re: Java push parser
Date: Mon, 21 Jan 2013 16:03:37 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 21/01/2013 14:28, Akim Demaille wrote:
>
> On 15 Jan 2013, at 20:31, Dennis Heimbigner <address@hidden> wrote:
>
>> I am submitting code to
>> add push parsing support for
>> Java through Bison.
>>
>> The relevant code can be obtained
>> using this url:
>> http://www.unidata.ucar.edu/staff/dmh/push.tar
>> That tar file contains three files:
>> 1. lalr1.java - revised skeleton to add push parsing support
>
> Hi Dennis,
>
> it will be much easier to discuss your patch if it is
> included in your message. It is also better for the mailing
> list archive.
>
> Paolo, I'm cc'ing this to you as the original author of
> lalr1.java, you might want to drop in some comments too.
>
> --- data/lalr1.java 2013-01-15 15:16:54.000000000 +0100
> +++ /Users/akim/Downloads/push/lalr1.java 2013-01-15 20:10:31.000000000 +0100
> @@ -15,6 +15,34 @@
> # You should have received a copy of the GNU General Public License
> # along with this program. If not, see <http://www.gnu.org/licenses/>.
>
> +dnl Modified by Dennis Heimbigner (address@hidden)
> +dnl to support push parsing.
> +dnl
> +dnl Changes:
> +dnl
> +dnl 1. capture the declarations as m4 macros.
> +dnl This was done to avoid duplication.
> +dnl When push parsing, the declarations occur at
> +dnl the class instance level rather than within the parse() function.
> +dnl
> +dnl 2. Initialization of the declarations occurs in a function
> +dnl called push_parse_initialize() that is called on the first
> +dnl invocation of push_parse().
> +dnl
> +dnl 3. The body of the parse loop is modified to return values at
> +dnl appropriate points when doing push parsing. In order to
> +dnl make push parsing work, it was necessary to divide
> +dnl YYNEWSTATE into two states: YYNEWSTATE and YYGETTOKEN. On
> +dnl the first call to push_parse, the state is YYNEWSTATE. On
> +dnl all later entries, the state is set to YYGETTOKEN. The
> +dnl YYNEWSTATE switch arm falls through into
> +dnl YYGETTOKEN. YYGETTOKEN indicates that a new token is
> +dnl potentially needed. Normally, with a pull parser, this new
> +dnl token would be obtained by calling yylex(). In the push
> +dnl parser, the value YYMORE is returned to the caller. On the
> +dnl next call to push_parse(), the parser will return to the
> +dnl YYGETTOKEN state and continue operation.
Looks good. I have just one small stylistic problem, with the name of the
variable havenexttoken: I'd reverse its sense and call it
push_token_consumed. That's pretty much it.
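To make the control flow of change 3 concrete, here is a minimal sketch. It
is a toy and not code from the patch: the class PushSumParser, its token
kinds, and its tiny grammar (NUMBER (PLUS NUMBER)* EOF) are all made up for
illustration. Only the names YYNEWSTATE, YYGETTOKEN, YYMORE, YYACCEPT and
YYABORT come from the skeleton, and the "initialized" flag follows the
approach described in change 2 as I understand it.

```java
/** Toy push parser for NUMBER (PLUS NUMBER)* EOF; one token per call. */
class PushSumParser {
    // Return codes, named after the skeleton's constants.
    static final int YYACCEPT = 0;
    static final int YYABORT = 1;
    static final int YYMORE = 4;

    // Token kinds: hypothetical, for this sketch only.
    static final int EOF = 0, NUMBER = 1, PLUS = 2;

    // Labels of the parse loop.
    private static final int YYNEWSTATE = 0;
    private static final int YYGETTOKEN = 1;

    // State that a pull parser would keep as locals of parse().
    private boolean pushParseInitialized = false;
    private boolean expectNumber;
    private int sum;

    private void pushParseInitialize() {
        expectNumber = true;
        sum = 0;
        pushParseInitialized = true;
    }

    /** Leaves the parser ready to be re-initialized by the next call. */
    private int finish(int code) {
        pushParseInitialized = false;
        return code;
    }

    int pushParse(int token, int value) {
        int label;
        if (!pushParseInitialized) {
            pushParseInitialize();
            label = YYNEWSTATE;      // first call
        } else {
            label = YYGETTOKEN;      // every later call
        }
        boolean pushTokenConsumed = false;
        for (;;) {
            switch (label) {
            case YYNEWSTATE:
                label = YYGETTOKEN;
                // fall through
            case YYGETTOKEN:
                // A pull parser would call yylex() here. A push parser
                // hands control back once the pushed token is used up.
                if (pushTokenConsumed)
                    return YYMORE;
                pushTokenConsumed = true;
                if (expectNumber) {
                    if (token != NUMBER)
                        return finish(YYABORT);
                    sum += value;
                    expectNumber = false;
                } else if (token == PLUS) {
                    expectNumber = true;
                } else if (token == EOF) {
                    return finish(YYACCEPT);
                } else {
                    return finish(YYABORT);
                }
                break;
            default:
                throw new IllegalStateException("unknown label " + label);
            }
        }
    }

    int result() {
        return sum;
    }
}
```

The caller feeds one token per call and keeps going for as long as it gets
YYMORE back; YYACCEPT or YYABORT ends the parse.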
Also:
> +]b4_push_if([
> + /**
> + * Returned by a Bison action in order to stop the parsing process and
> + * return failure (<tt>false</tt>). */
> + public static final int YYMORE = 4;])[
> +
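Wrong comment: this text describes YYABORT. Per the change notes above,
YYMORE is what push_parse() returns when it needs another token.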
> + if(!push_parse_initialized) {
> + push_parse_initialize();
> + label = YYNEWSTATE;
Perhaps you can make "label" global too, so that the check can instead be
"if (label != YYGETTOKEN)"? Then the setting of label to YYNEWSTATE can
be moved to push_parse_initialize(), and this hunk can go away too:
>
> + if (yychar == Lexer.EOF)
> +]b4_push_if([ {label = YYABORT; break;}],[
> return false;])[
> + }
> + else
> yychar = yyempty_;
> }
>
> @@ -684,7 +790,7 @@
>
> /* Pop the current state because it cannot handle the error
> token. */
> if (yystack.height == 0)
> - return false;
> +]b4_push_if([ {label = YYABORT; break;}],[
> return false;])[
>
> ]b4_locations_if([yyerrloc = yystack.locationAt (0);])[
> yystack.pop ();
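
Here is a rough sketch of what I mean by keeping "label" at the instance
level. It is an illustration only, not a patch against the skeleton: the
class LabelFieldSketch, the YYDONE label, the counters, and the use of token
0 as end of input are all hypothetical.

```java
/** Toy parser showing "label" kept as a field instead of a local. */
class LabelFieldSketch {
    // Return codes, named after the skeleton's constants.
    static final int YYACCEPT = 0;
    static final int YYMORE = 4;

    // Labels of the parse loop; YYDONE is a stand-in for this sketch.
    private static final int YYNEWSTATE = 0;
    private static final int YYGETTOKEN = 1;
    private static final int YYDONE = 2;

    // Any value other than YYGETTOKEN means "not in the middle of a parse".
    private int label = YYNEWSTATE;

    private int initializations = 0;
    private int tokensSeen = 0;

    private void pushParseInitialize() {
        initializations++;
        tokensSeen = 0;
        label = YYNEWSTATE;          // moved here from pushParse()
    }

    /** Token 0 stands in for end of input; any other token is counted. */
    int pushParse(int token) {
        // No separate "initialized" flag: the label itself tells us.
        if (label != YYGETTOKEN)
            pushParseInitialize();
        boolean pushTokenConsumed = false;
        for (;;) {
            switch (label) {
            case YYNEWSTATE:
                label = YYGETTOKEN;
                // fall through
            case YYGETTOKEN:
                if (pushTokenConsumed)
                    return YYMORE;
                pushTokenConsumed = true;
                if (token == 0) {
                    label = YYDONE;
                    break;
                }
                tokensSeen++;
                break;
            case YYDONE:
                // label stays at YYDONE, so the next call re-initializes.
                return YYACCEPT;
            default:
                throw new IllegalStateException("unknown label " + label);
            }
        }
    }

    int initializations() {
        return initializations;
    }

    int tokensSeen() {
        return tokensSeen;
    }
}
```

A side effect of this arrangement is that a parser that has finished is
re-initialized automatically by the next call, because its label is no
longer YYGETTOKEN.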
Paolo