Re: When to use a token yacc_EOF instead of relying on 0 return value of
From: Akim Demaille
Subject: Re: When to use a token yacc_EOF instead of relying on 0 return value of yylex()?
Date: Tue, 12 Feb 2019 18:15:08 +0100
> On 11 Feb 2019, at 23:56, Peng Yu <address@hidden> wrote:
>
>> I have no idea. You'd have to study the grammar to see if they
>> are doing fancy things around yacc_EOF.
>
> declare -p BASH_SOURCE
>
> Here is what I got with the above one line bash code (with the newline
> at the end).
>
> The lines with -> are the parsing rules activated. The remaining lines
> are yylex() results.
>
> WORD
> WORD
> simple_command_element -> WORD
> simple_command -> simple_command_element
> simple_command_element -> WORD
> simple_command -> simple_command simple_command_element
> WORD
> simple_command_element -> WORD
> simple_command -> simple_command simple_command_element
> \n
> command -> simple_command
> pipeline -> command
> pipeline_command -> pipeline
> simple_list1 -> pipeline_command
> simple_list -> simple_list1
> simple_list_terminator -> '\n'
> inputunit -> simple_list simple_list_terminator
> yacc_EOF
> inputunit -> yacc_EOF
>
> If yacc_EOF were to be removed, do you see how to redesign this
> subset of grammar rules? Thanks. (Since this subset is much smaller
> than the full grammar, I think it should be easy to see how to remove
> yacc_EOF, yet maintain the same functionality of the grammar?)
The question is: what is the purpose of the rules that use
yacc_EOF? Bison has built-in support for EOF, which you can see
in the "hidden" rule 0: "$accept: start $eof", where start is the
grammar's start symbol.
The start symbol of bash's grammar is inputunit (non-recursive),
which includes:
  inputunit:
      error yacc_EOF
        {
          /* EOF after an error.  Do ignoreeof or not.  Really
             only interesting in non-interactive shells */
          global_command = (COMMAND *)NULL;
          if (last_command_exit_value == 0)
            last_command_exit_value = EX_BADUSAGE; /* force error return */
          handle_eof_input_unit ();
          if (interactive && parse_and_execute_level == 0)
            {
              YYACCEPT;
            }
          else
            {
              YYABORT;
            }
        }
    | yacc_EOF
        {
          /* Case of EOF seen by itself.  Do ignoreeof or
             not. */
          global_command = (COMMAND *)NULL;
          handle_eof_input_unit ();
          YYACCEPT;
        }
    ;
The first use case is error recovery at EOF. It can be kept as is
by just removing yacc_EOF. The second use is just to have a final
action run at the end of parsing. That could be done with another
start symbol that exists only for the action. Bison could provide
support for a final action, attached to rule 0.
Anyway, both can be done with yacc_EOF being attached to the real
EOF (i.e., by associating yacc_EOF with 0). As a matter of fact, a
friend of mine does exactly that:
  %token ENDOFFILE 0 "end of file"
  %%
  aut: aut-1 { res.h->loc = @$; YYACCEPT; }
     | ENDOFFILE { YYABORT; }
     | error ENDOFFILE { YYABORT; }
     | error aut-1
       {
         error(@1, "leading garbage was ignored");
         res.h->loc = @2;
         YYACCEPT;
       }
See here:
https://gitlab.lrde.epita.fr/spot/spot/blob/next/spot/parseaut/parseaut.yy
There is also
  simple_list_terminator: '\n'
    | yacc_EOF
    ;
but simple_list_terminator is used as a simple terminator for the
start symbol, so just removing yacc_EOF should be just fine (i.e.,
leave an empty RHS here).
  list_terminator: '\n'
        { $$ = '\n'; }
    | ';'
        { $$ = ';'; }
    | yacc_EOF
        { $$ = yacc_EOF; }
    ;
This is weird. I don't fully grasp what the point is.
So based on what I saw, I'd say that you should really try to define
yacc_EOF as 0, and remove the useless references to yacc_EOF. It
should work as expected.