help-bison

Re: Is it always possible to make a non-reentrant parser reentrant?


From: Peng Yu
Subject: Re: Is it always possible to make a non-reentrant parser reentrant?
Date: Fri, 8 Feb 2019 17:21:35 -0600

Hi Simon,

> Normally, you'd use lexer states to activate/deactivate rules. The
> primitive approach would be
>
> %x INITIAL HEREDOC

I see that %x comes from flex. Bash supports nested heredocs; how can
those be implemented in flex?

> and then prefixing all matches with <INITIAL> or <HEREDOC>.
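
For reference, a minimal flex sketch of that primitive approach might
look like the following (INITIAL is predefined by flex, so only HEREDOC
needs declaring; the patterns and the token names HEREDOC_START and
WORD are placeholders, not anything bash or bison prescribes):

    %option reentrant bison-bridge
    %x HEREDOC

    %%

    <INITIAL>"<<"[A-Za-z_]+   {
        /* heredoc redirection seen: a real scanner would first save the
           end tag (e.g. in yyextra) and only enter the state at the
           following newline; here we switch immediately for brevity */
        BEGIN(HEREDOC);
        return HEREDOC_START;
    }

    <INITIAL>[A-Za-z_]+       { return WORD; /* assumed token name */ }

    <HEREDOC>(.|\n)           {
        /* accumulate the body; BEGIN(INITIAL) once the saved end tag
           has been matched */
    }

    %%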
>
> The main problem there is that state changes need to be driven by the
> lexer code, as the BEGIN macro is only available there, so a change from
> the parser would have to be communicated through yyextra, and applied in
> the lexer code before matching a token (so YY_USER_ACTION is too late).
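
One way to do that hand-off: keep a flag in the yyextra struct, set it
from a parser action, and test it in indented code placed at the top of
the flex rules section, which flex runs every time yylex() is entered,
i.e. before each token is matched. A rough sketch (the struct and field
names are made up):

    /* shared header: extra state carried by the reentrant scanner,
       wired up with %option extra-type="struct parse_extra *" */
    struct parse_extra {
        int enter_heredoc;   /* set by the parser, consumed by the lexer */
        /* ... end tag, buffers, ... */
    };

and in the .l file:

    %%
        /* indented code before the first rule runs on every call to
           yylex(), before any rule is tried */
        if (yyextra->enter_heredoc) {
            yyextra->enter_heredoc = 0;
            BEGIN(HEREDOC);
        }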
>
> The other thing is that parsing heredocs with the lexer is rather
> pointless, as the only thing we are interested in is dynamic anyway, so
> grabbing the data out of the lexer stream with a custom function is
> probably the better approach.
>
> Some people use tar files as heredocs, so a "[a-zA-Z]*" rule can match
> really long strings there, which the lexer would have to extend its
> buffer for in order to provide yytext/yyleng. We can't limit the match
> length either because then we'd have to jump through a lot of hoops to
> match the end tag if it is straddled across two matches.
>
> If the lexer can identify heredocs reliably, then it's probably best to
> let it provide a token HEREDOC to the parser after setting up the state
> for heredoc parsing (which may live in yyextra to make it reentrant, but
> that's orthogonal), and the parser then calls a special function to
> retrieve the heredoc from the lexer's stream. That function would live
> in the lexer source file so it can request more characters from the
> stream.
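
A rough shape of such a function, assuming a reentrant scanner where
the generated input() takes the yyscan_t argument; the name
read_heredoc_body and the end-tag handling are purely illustrative:

    /* third section of the .l file, so input() is in scope;
       <stdlib.h> must be included in the definitions section */
    char *read_heredoc_body(yyscan_t scanner, const char *end_tag)
    {
        size_t cap = 256, len = 0;
        char *buf = malloc(cap);
        int c;

        (void) end_tag;   /* end-of-body detection elided in this sketch */
        if (!buf)
            return NULL;
        /* flex versions differ on the EOF sentinel, so check both */
        while ((c = input(scanner)) != EOF && c != 0) {
            if (len + 1 >= cap) {
                char *tmp = realloc(buf, cap *= 2);
                if (!tmp) { free(buf); return NULL; }
                buf = tmp;
            }
            buf[len++] = (char) c;
            /* a real implementation would stop once a line consisting
               of end_tag has been read */
        }
        buf[len] = '\0';
        return buf;       /* caller frees */
    }

The parser would then call this from the action after the HEREDOC
token, with the yyscan_t handed to it through %parse-param or %param.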
>
> Another option I could see would be to have the lexer return fragments
> of the heredoc, and just repeat the token as long as there is data —
> this would also avoid having to read the entire stream into memory, and
> keep the interface between lexer and parser down to yylex().
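
On the grammar side that could be an ordinary left-recursive list, with
the lexer emitting HEREDOC_FRAG until the end tag is found (the token
name and the concat helper are assumptions, not anything bison or bash
defines):

    heredoc_body
        : HEREDOC_FRAG                { $$ = $1; }
        | heredoc_body HEREDOC_FRAG   { $$ = concat_strings($1, $2); }
        ;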
>
>    Simon



-- 
Regards,
Peng

