bison-patches

Re: Rename variant and lex_symbols options


From: Akim Demaille
Subject: Re: Rename variant and lex_symbols options
Date: Thu, 23 Feb 2012 17:01:42 +0100

On 17 Feb 2012, at 03:57, Joel E. Denny wrote:

> Hi Akim.

Hi Joel!

Thanks a lot for your answer.  I will try not to drop the ball
this time.  The more I think about it, the happier I would be
if we made more use of a ticketing system, either Savannah's
or the debbugs setup that the gnulibers seem to be fond of.

>> * variant
>> The point of "variant" is to allow objects (not pointers
>> to objects) to be used to type the symbols in the C++
>> LR parser.
>> 
>> So we have, for instance :
>> 
>>> %token <::std::string> TEXT;
>>> %token <int> NUMBER;
>>> 
>>> list:
>>>  /* nothing */ { /* Generates an empty string list */ }
>>> | list item     { std::swap ($$, $1); $$.push_back ($2); }
>>> ;
>> 
>> It does have an influence on the API, since yylval can no
>> longer be used "simply".  So it could be something like
>> api.symbols.variant, or api.values.variant...
> 
> api seems reasonable given that it does affect the generated API exposed 
> to the scanner, at least.

Yes, in the sense that it changes yystype.  The test c++.at:variant
shows several ways to set yylval.  There are basically three when
using variants.  The first two are equivalent for bison, but for the
user there is a difference: either default-construct the value and
then assign:

      yylval->build<std::string>() = yytext;
      *yylloc = location_type ();
      return token::TEXT;

or build directly from the value:

      yylval->build (yytext);
      *yylloc = location_type ();
      return token::TEXT;
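
To make the difference between the two styles concrete, here is a
minimal sketch.  The class toy_value and the two scan_* functions are
hypothetical names invented for this illustration; this is not the code
Bison generates, only something with the same two build() flavors.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical stand-in for a variant-style semantic value.  It is NOT
// Bison's generated class; it only mimics the two build() flavors.
class toy_value
{
public:
  toy_value () : object_ (nullptr) {}

  // Flavor 1: default-construct a T in place; the caller assigns afterwards.
  template <typename T>
  T& build ()
  {
    static_assert (sizeof (T) <= sizeof (buffer_), "buffer too small");
    T* p = new (buffer_) T ();
    object_ = p;
    return *p;
  }

  // Flavor 2: construct a T in place directly from a value.
  template <typename T>
  T& build (const T& t)
  {
    static_assert (sizeof (T) <= sizeof (buffer_), "buffer too small");
    T* p = new (buffer_) T (t);
    object_ = p;
    return *p;
  }

  // Access the stored object; the caller must name the right type.
  template <typename T>
  T& as () { return *static_cast<T*> (object_); }

  // The buffer holds a real object, so it must be destroyed explicitly.
  template <typename T>
  void destroy () { as<T> ().~T (); object_ = nullptr; }

private:
  alignas (std::max_align_t) unsigned char buffer_[64];
  void* object_;
};

// Style 1: build with the default constructor, then assign.
std::string scan_build_then_assign (const char* yytext)
{
  toy_value yylval;
  yylval.build<std::string> () = yytext;
  std::string seen = yylval.as<std::string> ();
  yylval.destroy<std::string> ();
  return seen;
}

// Style 2: build from a value; T is deduced from the argument, so the
// argument has to be a std::string already, not a bare char pointer.
std::string scan_build_from_value (const char* yytext)
{
  toy_value yylval;
  yylval.build (std::string (yytext));
  std::string seen = yylval.as<std::string> ();
  yylval.destroy<std::string> ();
  return seen;
}
```

Both functions end up with the same stored string; the second style
simply skips the default construction followed by an assignment.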

If, in addition, you request lex_symbol, the scanner no longer produces
a triple in which the semantic value and the token kind are completely
independent, but a single object that binds the type of the semantic
value to the token type:

      return yy::parser::make_TEXT (yytext, location_type ());
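
The idea of one object that ties the value type to the token kind can
be sketched as follows.  Everything here (symbol, token_kind, location,
and the make_* functions) is a hypothetical toy written for this
illustration, under the assumption of three tokens TEXT, NUMBER and
COLON; the real generated classes differ.

```cpp
#include <cassert>
#include <string>
#include <utility>

// All names below are hypothetical; this is not Bison's generated code.
struct location { int line; int column; };

enum class token_kind { TEXT, NUMBER, COLON };

class symbol
{
public:
  token_kind kind () const { return kind_; }
  const std::string& text () const { return text_; }
  int number () const { return number_; }
  const location& where () const { return loc_; }

  // The only ways to create a symbol: each one fixes the kind and
  // accepts exactly the value type that goes with that kind.
  friend symbol make_TEXT (std::string v, location l);
  friend symbol make_NUMBER (int v, location l);
  friend symbol make_COLON (location l);

private:
  symbol (token_kind k, std::string t, int n, location l)
    : kind_ (k), text_ (std::move (t)), number_ (n), loc_ (l) {}

  token_kind kind_;
  std::string text_;   // a toy: a real variant would store only one value
  int number_;
  location loc_;
};

symbol make_TEXT (std::string v, location l)
{ return symbol (token_kind::TEXT, std::move (v), 0, l); }

symbol make_NUMBER (int v, location l)
{ return symbol (token_kind::NUMBER, std::string (), v, l); }

symbol make_COLON (location l)
{ return symbol (token_kind::COLON, std::string (), 0, l); }
```

With such an interface a call like make_NUMBER ("abc", loc) is rejected
by the compiler, which is the kind of mismatch the separate
yylval/return-code protocol cannot catch.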

> I believe we had come to the conclusion that we should avoid Boolean 
> variables from now on.  

You are absolutely right!  Thank you very much for this reminder,
I have to write it down somewhere (HACKING).

> The rationale was that we so often outgrow 
> true|false with some other possibility we didn't originally think of.  
> Could we have api.value = union|variant?  Also, notice the use of singular 
> as discussed below.

That's nice.  I would also be very happy to no longer
suggest #define YYSTYPE double, and instead say

%define api.value custom
%define api.value.type double

or something like that.  Maybe

%define api.value <double>

I don't know.  BTW, we dropped the "token" part; is this
on purpose?

%define api.token.value?

Of course there are not only tokens, but also nterms.

%define api.symbol.value

>> * lex_symbols
>> The point here is to provide an API to build the symbols
>> in such a way that it is not possible to return a semantic
>> value incompatible with the token kind (e.g.,
>> [0-9]+   yylval.sval = yytext; return INTEGER;):
>> 
>> instead you write:
>> 
>>> [0-9]+   return yy::parser::make_INTEGER(text_to_int (yytext), loc);
>>> [a-z]+   return yy::parser::make_IDENTIFIER(yytext, loc);
>>> ":"      return yy::parser::make_COLON(loc);
>> 
>> Again, it has an influence on the API, so maybe
>> api.tokens.constructors (we already have api.tokens.prefix
>> which probably should have been api.token.prefix),
> 
> This is my fault.  My logic was to use plural when there's more than one 
> of something.  However, I now see that always using singular is probably a 
> simpler rule to remember and not really so misleading as I thought.  For 
> example, lr.default-reduction, lr.keep-unreachable-state, and 
> api.token.prefix would all have been fine.

So I will deprecate api.tokens.prefix in favor of api.token.prefix,
no worries.  I'll also check the others.

>> or api.token.object.
>> 
>> Both are meant to be used together.  Maybe actually I should
>> enforce this so that there are less combinations to check.
> 
> So, lex_symbols can't be used without variant?  Is it possible that might 
> ever change?  Sorry, I haven't studied the details.

You are right, nothing intrinsically forbids this.  But then I would
really need more information than is currently provided to bison.
In the example above:

> [a-z]+   return yy::parser::make_IDENTIFIER(yytext, loc);
> [0-9]+   return yy::parser::make_INTEGER(text_to_int (yytext), loc);
> ":"      return yy::parser::make_COLON(loc);

all of these work because Bison can map each token to the type of its value:

> %token <::std::string> IDENTIFIER;
> %token <int> INTEGER;
> %token COLON;

With %union, I can't.  So maybe I need to introduce another
concept: that genuine types are used, rather than type tags.
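
To illustrate what a tag gives (and does not give), here is a small
sketch of the %union situation.  The union, the token codes and the two
scan_* functions are hypothetical names made up for this example; the
assumed declaration is something like %union { int ival; const char* sval; }.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical equivalent of: %union { int ival; const char* sval; }
// A tag such as <ival> only names a member; it carries no type that a
// generator could use to declare a typed constructor.
union semantic_value
{
  int ival;
  const char* sval;
};

// Hypothetical token codes.
enum token { INTEGER = 258, IDENTIFIER = 259 };

// Correct: INTEGER goes with the ival member.
int scan_integer (const char* yytext, semantic_value& yylval)
{
  yylval.ival = std::atoi (yytext);
  return INTEGER;
}

// Incorrect, yet accepted by the compiler: nothing links the member
// that was written to the token code that is returned.
int scan_integer_wrongly (const char* yytext, semantic_value& yylval)
{
  yylval.sval = yytext;
  return INTEGER;
}
```

Both functions compile and return INTEGER; only the first one leaves
the value where a rule expecting <ival> would look for it.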

Too many things for the sole api.value :(

Variants require types (not tags).
lex_symbols currently requires variants, but could require only types.

%define api.symbol.value  tag  // the default, <ival> denotes a field in YYSTYPE
%define api.symbol.value  type // %token <int> INT

%define api.symbol.value.type union // the default, including for union
%define api.symbol.value.type variant

// Defaults to YYSTYPE
%define api.symbol.value.type.name foo     // instead of #define YYSTYPE foo.

Too many options, things start to blur :(

> If lex_symbols will only ever make sense with variant, then maybe we need 
> to extend the api.value enum.  variant-constructor?  I'm not sure.
> If that doesn't seem right, I think api.token.constructor is fine.  But 
> again, should we avoid the Boolean so it can grow if necessary?  Maybe 
> none|variant, which would at least make it clearer that it's a companion 
> for api.variant.
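
The argument for enumerated values rather than Booleans can be sketched
in a few lines.  This is only an illustration, assuming the
api.value = union|variant spelling suggested above; value_kind and
parse_value_kind are hypothetical names, not part of Bison.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical model of an enumerated %define variable.  Adding a third
// alternative later means adding one enumerator and one branch, whereas
// a true|false variable would have to be replaced altogether.
enum class value_kind { union_, variant };

value_kind parse_value_kind (const std::string& s)
{
  if (s == "union")
    return value_kind::union_;
  if (s == "variant")
    return value_kind::variant;
  // "true" and "false" are rejected like any other unknown value.
  throw std::invalid_argument ("invalid value for api.value: " + s);
}
```

A caller would pass the text found after %define api.value and get back
either an enumerator or an exception for an unsupported value.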



