Re: improving error message (was: bison for nlp)
From: Akim Demaille
Subject: Re: improving error message (was: bison for nlp)
Date: Sat, 10 Nov 2018 12:50:16 +0100
> On 10 Nov 2018, at 10:38, Hans Åberg <address@hidden> wrote:
>
>> Also, see if using %param does not already
>> give you what you need to pass information from the scanner to the
>> parser’s yyerror.
>
> How would that get into the yyerror function?
In C, the arguments declared with %parse-param are passed to yyerror. That’s
why I mentioned %param (which declares both a %lex-param and a %parse-param),
not %lex-param. In the C++ case, these are members of the parser class.
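A minimal sketch of the C side, assuming a grammar that declares
`%param {struct scan_ctx *ctx}` and does not pass locations to yyerror. The
`scan_ctx` type and its fields are hypothetical names made up for this
illustration; the only part taken from Bison is the calling convention, where
the %parse-param arguments come before the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical scanner state shared with the parser.  In the grammar this
   would be declared as:   %param {struct scan_ctx *ctx}
   which makes it an extra argument of both yylex and yyparse.  */
struct scan_ctx {
  const char *filename;  /* name of the input being scanned */
  int line;              /* current line, maintained by the scanner */
  char last_error[128];  /* where this sketch keeps the formatted message */
};

/* With the declaration above, the generated C parser calls
   yyerror (ctx, msg): parse-param arguments first, then the message.  */
void yyerror (struct scan_ctx *ctx, const char *msg)
{
  assert (ctx != NULL);
  /* Use what the scanner recorded to build a richer message.  */
  snprintf (ctx->last_error, sizeof ctx->last_error,
            "%s:%d: %s", ctx->filename, ctx->line, msg);
  fprintf (stderr, "%s\n", ctx->last_error);
}
```

Whatever the scanner stores in the shared structure (here a file name and a
line number) is then available when the error is reported.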
>>>> I believe that the right approach is rather the one we have in compilers
>>>> and in bison: caret errors.
>>>>
>>>> $ cat /tmp/foo.y
>>>> %token FOO 0xff 0xff
>>>> %%
>>>> exp:;
>>>> $ LC_ALL=C bison /tmp/foo.y
>>>> /tmp/foo.y:1.17-20: error: syntax error, unexpected integer
>>>> %token FOO 0xff 0xff
>>>>                 ^^^^
>>>> I would have been bothered by « unexpected 255 ».
>>>
>>> Currently, that’s for those still using only ASCII.
>>
>> No, it’s not, it works with UTF-8. Bison’s count of characters is mostly
>> correct. I’m talking about Bison’s own location, used to parse grammars,
>> which is improved compared to what we ship in generated parsers.
>
> Ah. I thought of errors for the generated parser only. Then I only report
> byte count, but using character count will probably not help much for caret
> errors, as they vary in width. The problem is that caret errors use two
> lines, which are hard to synchronize in Unicode. So perhaps some kind of
> one-line markup might do the trick instead.
Two things:
One is that the semantics of the column in Bison’s locations is not specified:
it is up to the user to track characters or bytes. As a matter of fact, Bison
is hardly concerned by this choice; rather, it’s the scanner that has to
deal with that.
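To illustrate the choice the scanner has to make, here is a sketch of a
helper that counts characters rather than bytes, assuming the input is valid
UTF-8. The name `utf8_columns` is hypothetical. Note that it counts code
points, which is still not the same thing as display width.

```c
#include <assert.h>
#include <stddef.h>

/* Count the code points in the first LEN bytes of S, assuming valid UTF-8:
   every byte that is not a continuation byte (10xxxxxx) starts a new
   character.  */
size_t utf8_columns (const char *s, size_t len)
{
  assert (s != NULL);
  size_t n = 0;
  for (size_t i = 0; i < len; ++i)
    if (((unsigned char) s[i] & 0xC0) != 0x80)
      ++n;
  return n;
}
```

A scanner tracking bytes would advance the column by the length of the
matched text; one tracking characters would advance it by
`utf8_columns (text, length)` instead.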
The other one is: once you have the location, you can decide how to display
it. In the case of Bison, I think the caret errors are fine, but you
could decide to do something different, say use colors or delimiters, to
be robust to varying width.
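As an example of the delimiter idea, the sketch below wraps the offending
span in markers on the same line, so nothing has to line up across two lines.
The function `mark_span` and its parameters are hypothetical; it works on
0-based byte offsets with an exclusive end, so Bison’s 1.17-20 above becomes
the range [16, 20).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy LINE into OUT with the byte range [BEGIN, END) wrapped between
   OPEN_MARK and CLOSE_MARK.  Returns what snprintf returns: the length
   the full result needs, not counting the terminating null byte.  */
int mark_span (char *out, size_t out_size, const char *line,
               size_t begin, size_t end,
               const char *open_mark, const char *close_mark)
{
  assert (begin <= end && end <= strlen (line));
  return snprintf (out, out_size, "%.*s%s%.*s%s%s",
                   (int) begin, line,                  /* text before the span */
                   open_mark,
                   (int) (end - begin), line + begin,  /* the span itself */
                   close_mark,
                   line + end);                        /* text after the span */
}
```

Passing terminal escape sequences as the two markers (for instance
"\033[1;31m" and "\033[0m") gives the color variant of the same approach.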
>>> I am using Unicode characters and LC_CTYPE=UTF-8, so it will not display
>>> properly. In fact, I am using special code to even write out Unicode
>>> characters in the error strings, since Bison assumes all strings are ASCII,
>>> the bytes with the high bit set being translated into escape sequences.
>>
>> Yes, I’m aware of this issue, and we have to address it.
>
> From what I could see, the function that converts it to escapes is sometimes
> applied once and sometimes twice, relying on it being idempotent.
It’s a bit more tricky than this. I’m looking into it, and I’d like
to address this in 3.3.
>> We also have to provide support for internationalization of
>> the token names.
>
> Personally, I don't have any need for that. I use strings, like
> %token logical_not_key "¬"
> %token logical_and_key "∧"
> %token logical_or_key "∨"
> and in the case there are names, they typically match what the lexer
> identifies.
Yes, not all the strings should be translated. I was thinking of
something like
%token NUM _("number")
%token ID _("identifier")
%token PLUS "+"
This way, we can even point xgettext at the grammar file
rather than at the generated parser.