Re: symbol_type::token () removed?
From: Akim Demaille
Subject: Re: symbol_type::token () removed?
Date: Thu, 2 Jan 2020 08:27:42 +0100
Hi Wolfgang,
Finally a bug report! 3.5 was released three weeks ago, and usually
bug reports arrive very soon after the release. I was getting worried :)
> On 1 Jan 2020, at 23:51, Wolfgang Thaller <address@hidden> wrote:
>
> Commit 8c87a623 says:
>
>> c++: get rid of symbol_type::token ()
>> It is not used. And its implementation was wrong when api.token.raw
>> was defined, as it was still mapping to the external token numbers,
>> instead of the internal ones. Besides it was provided only when
>> api.token.constructor is defined, yet always declared.
>>
>> * data/skeletons/c++.m4 (by_type::token): Remove, useless.
>
> Well, I disagree. It's not useless, I'm using it,
:)
About a year ago, I wrote to you
>> The thing is that I don't know how people are using these types: they do
>> things that are not documented, and as such, are not considered part of the
>> contract.
and that's another instance of that pattern.
> and I think my use case is legitimate.
>
> [...]
> So I have things like:
>
> Lexer lex("12345");
> symbol_type t = lex.nextToken();
> assert(t.token() == token::INTLIT);
> assert(t.value.as<int>() == 12345);
which gives me the impression that the real use case would be something like
assert(t == token::make_INTLIT(12345))
WDYT? Hmm. That would require an operator==, which would generate a lot
of code. And it would also require the value types to support ==, which
so far we do not require.
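Just to make the trade-off concrete, here is a minimal hand-rolled sketch,
not something the skeleton provides: it compares only the symbol kinds,
because comparing the values too is exactly the part that would need a
per-kind dispatch and an operator== on each value type. I'm assuming the
default yy::parser class name; adjust to your own namespace/class.

  // Sketch only: compare two symbols by internal symbol number,
  // ignoring the semantic values.  Extending this to the values would
  // require switching on the kind and demanding operator== from every
  // value type.
  static bool
  same_kind (const yy::parser::symbol_type& a,
             const yy::parser::symbol_type& b)
  {
    return a.type_get () == b.type_get ();
  }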
> ... which suddenly stopped working.
>
> Fortunately, I can instead do...
>
> assert(t.type_get() == by_type(token::INTLIT).type_get());
>
> ... but that's a bit ugly.
That's because yytranslate_ is private; otherwise it could be
assert(t.type_get() == yytranslate_(token::INTLIT));
> In general, I think symbol_type should always provide enough public API
> in order to allow inspecting it. In my case, I only need it for
> implementing tests, but in general it might be useful when writing the
> lexer, as well.
The reason I was very happy to get rid of it is that it's kind of big:
it includes a table whose size is the number of tokens. I try to keep
our parsers thin. Here is what commit 8c87a623 removed:
- ]b4_inline([$1])b4_parser_class[::token_type
- ]b4_parser_class[::by_type::token () const YY_NOEXCEPT
- {
- // YYTOKNUM[NUM] -- (External) token number corresponding to the
- // (internal) symbol number NUM (which must be that of a token). */
- static
- const ]b4_int_type_for([b4_toknum])[
- yytoken_number_[] =
- {
- ]b4_toknum[
- };
- return token_type (yytoken_number_[type]);
- }
Have you given api.token.raw a shot? With it, symbol_type::token()
would be really thin: it would just return 'type'.
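Concretely it's just one more %define in the grammar file. A sketch; the
other directives are only my guess at your setup:

  %skeleton "lalr1.cc"
  %require "3.5"
  %define api.token.constructor
  %define api.value.type variant
  // Make the scanner use the internal token numbers directly,
  // so there is no translation table at all.
  %define api.token.raw

The catch, of course, is that the scanner must then stick to the token
kinds (e.g., via the make_* constructors) and can no longer use plain
characters such as '+' as token numbers.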
> P.S.: Keep up the good work, bison is my favorite way of parsing things
> using C++....
Thanks a lot for saying so :)