[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Clarification on "Infinite parsing loops in IELR + LAC"
From: |
Adrian Vogelsgesang |
Subject: |
Re: Clarification on "Infinite parsing loops in IELR + LAC" |
Date: |
Sat, 29 Sep 2018 14:51:49 +0000 |
User-agent: |
Microsoft-MacOutlook/10.10.0.180726 |
Hi Joel,
Thanks for your detailed explanation! (and sorry for my late reply)
> Do you have a specific use case in mind that requires these features, or
> have you chosen them because of their superiority in theory?
LAC: our parser currently reports incorrect lists of expected tokens for syntax
errors. This should be fixed by using LAC.
IELR: mostly, due to superiority in theory.
I would of course prefer if the parsing tables created by LALR were sufficient
and would be willing to spend the time to adjust the grammar to be
LALR-compatible.
As I understand, Bison introduces the mysterious conflicts *silently*.
Given the choice between incorrect parsing tables and less efficient parsing
tables, I would opt for the less efficient parser tables.
On a separate point:
The grammar I am working on is using the C++ template but LAC is not yet
supported for C++.
I am thinking about contributing LAC support for C++.
I read the parts on syntax error handling from your thesis
(https://tigerprints.clemson.edu/cgi/viewcontent.cgi?article=1519&context=all_dissertations,
section 5.2.2).
I was wondering, if there might be additional pitfalls I should be aware of
when porting the C code to C++?
Also, in my specific implementation, I don't really care about semantic actions
being delayed.
In case of a syntax error, we throw away the semantic values anyway and the
semantic actions don't have any bad side effects.
To my understanding, it should be possible to eagerly execute semantic actions
already during the first, exploratory parse and thereby avoid redoing the parse.
Only if this first parse fails, we go back and extract the list of valid tokens
based on the initial state.
Would such a mode be feasible?
Should I add it to "define parse.lac", maybe as a third option
"expected-tokens-only"?
- Adrian
From: help-bison <address@hidden> on behalf of Joel Denny <address@hidden>
Date: Monday, 17 September 2018 at 05:45
To: akim <address@hidden>
Cc: "address@hidden" <address@hidden>, Adrian Vogelsgesang <address@hidden>
Subject: Re: Clarification on "Infinite parsing loops in IELR + LAC"
Hi Adrian, Akim,
On 09/11/2018 12:47 AM, akim wrote:
> Hi Adrian,
>
> That would be a question for Joel, I don't know the details.
Thanks for alerting me to this.
> Le 2018-09-11 00:03, Adrian Vogelsgesang a écrit :
>> Hi everyone,
>>
>> From what I understand reading the bison documentation and some of the
>> linked publications,
>> I should probably be using IELR in combination with Lookahead-Correction
Do you have a specific use case in mind that requires these features, or
have you chosen them because of their superiority in theory?
>>
>> However, I stumbled across a warning in section 5.8.3 (“LAC”) in the
>> bison documentation
>> (https://www.gnu.org/software/bison/manual/bison.html#LAC
>>> IELR plus LAC does have one shortcoming relative to canonical LR.
>>> Some parsers generated by Bison
>>> can loop infinitely. LAC does not fix infinite parsing loops that
>>> occur between encountering a syntax
>>> error and detecting it, but enabling canonical LR or disabling
>>> default reductions sometimes does.
>>
>> This sounds quite frightening to me and I am not sure if LAC is
>> actually something I should be using.
>> Could someone clarify that part of the documentation for me?
The goal of IELR + LAC is to achieve canonical LR behavior with nearly
LALR-sized parser tables. The point of the statement you quoted is to
clarify that IELR + LAC does not achieve canonical LR behavior for some
grammars in the case of some syntax errors. That statement does not
describe some major flaw that is introduced by switching to IELR or LAC
from a traditional Yacc/Bison-generated parser, which uses LALR without
LAC, and which is less powerful.
>>
>> In particular, I would be interested in the following points:
>> * The documentation only mentions IELR + LAC, but what about LALR + LAC?
Switching from IELR to LALR reduces the set of grammars that can be
recognized. In other words, it takes you even farther from canonical
LR. However, if you have a grammar that LALR is powerful enough to
parse, IELR produces exactly LALR parser tables. In that case, until
your grammar evolves, IELR + LAC = LALR + LAC.
>> * "LAC *does not fix* infinite parsing loops" -> this sounds as if
>> these infinite parsing loops would also exist without LAC?
Exactly.
I suppose it's possible that one could write a semantic action to
terminate the parse during one of these infinite parsing loops on a
syntax error. In that case, because LAC suppresses semantics actions
while detecting the syntax error, LAC would prevent the semantic action
from terminating the parse. However, I don't recall ever encountering
an actual semantic action written for such a purpose.
>> * When exactly do those infinite loops occur? How can I avoid them?
Infinite parsing loops are a known issue in the LR family, including
canonical LR. In my experience, they rarely occur in practical
grammars, even when using LALR. I see them mostly as a theoretical
possibility that occurs in contrived grammars. If you find that they
occur for a practical grammar, please post it here. Perhaps we should
say something like this in the manual after the text you quoted?
In summary: Traditional Yacc/Bison (that is, LALR without LAC) < IELR +
LAC < canonical LR. However, the difference between IELR + LAC and
canonical LR concerns an issue (infinite parsing loops on syntax errors)
that appears to be rare for practical grammars.
Hope that helps.
Joel
>>
>> Best,
>> Adrian
>>
>> _______________________________________________
>> address@hidden https://lists.gnu.org/mailman/listinfo/help-bison
>
_______________________________________________
address@hidden https://lists.gnu.org/mailman/listinfo/help-bison