help-bison

Re: Performance impact of "redundant" rules


From: Sylvain Schmitz
Subject: Re: Performance impact of "redundant" rules
Date: Wed, 15 Mar 2006 01:24:13 +0100
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7) Gecko/20040616

Hans Aberg wrote:
On 7 Mar 2006, at 18:42, Sylvain Schmitz wrote:

More states mean fewer chances of merging concurrent stacks.  One
could also point out that fewer inadequate states mean less branching,
but the number of inadequate states avoided by using LR(1) instead of
LALR(1) is not very large, whereas the difference in automaton size is
exponential.  Thus LR(1) is much more practical if you merge states.

My guess is that this exponentiality of states refers to the difference between LR(0) and LR(1).

It does.

There is a similar exponentiality in the NFA-to-DFA translation. There are two methods around this problem: the common one is to ignore the issue, since typical regular expressions do not trigger this exponential blow-up. Alternatively, one can perform the NFA-to-DFA translation at run time, caching the states as they are computed, though I have never seen such an implementation.
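For concreteness, the lazy translation described here can be sketched as follows. This is a hypothetical minimal example, not taken from any real engine: the NFA has no epsilon transitions (a real implementation would take epsilon-closures at each step), and each DFA state --- a frozen set of NFA states --- is built and cached only the first time it is reached.

```python
def lazy_dfa_match(nfa, start, accepting, text):
    """Simulate an NFA, building the DFA lazily with a transition cache.

    nfa: dict mapping (state, symbol) -> set of successor NFA states.
    start: the initial NFA state; accepting: set of accepting NFA states.
    """
    cache = {}  # frozenset of NFA states -> {symbol: frozenset of NFA states}

    def step(states, symbol):
        # Compute the DFA transition for this state set on first use,
        # then reuse the cached result on every later visit.
        transitions = cache.setdefault(states, {})
        if symbol not in transitions:
            nxt = set()
            for s in states:
                nxt |= nfa.get((s, symbol), set())
            transitions[symbol] = frozenset(nxt)
        return transitions[symbol]

    current = frozenset([start])
    for ch in text:
        current = step(current, ch)
    return bool(current & accepting)
```

Only the subset-construction states actually visited by the input are ever materialized, which is exactly how the exponential worst case is dodged in practice.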

So, as for LR(1), I guess that for typical grammars there will be a blow-up of a few times the number of states. If exponential blow-ups are a problem, one might perhaps try the same approach: translate the grammar to a non-deterministic automaton, with a runtime component computing the deterministic push-down automaton states on demand, caching them.
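The same on-demand idea applied to LR could look roughly like this: item sets are computed and cached only when the parser first reaches them, instead of being enumerated up front. This is a toy sketch over a hypothetical grammar, using LR(0) items for brevity; lookahead computation and the parsing loop itself are omitted.

```python
GRAMMAR = {          # toy grammar: S -> ( S ) | x
    'S': [('(', 'S', ')'), ('x',)],
}
NONTERMINALS = set(GRAMMAR)

def closure(items):
    """Standard LR(0) closure: expand items with the dot before a nonterminal."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot) in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for prod in GRAMMAR[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return frozenset(items)

_goto_cache = {}

def goto(state, symbol):
    """Lazily compute (and cache) the successor item set of `state` on `symbol`."""
    key = (state, symbol)
    if key not in _goto_cache:
        moved = {(lhs, rhs, dot + 1)
                 for (lhs, rhs, dot) in state
                 if dot < len(rhs) and rhs[dot] == symbol}
        _goto_cache[key] = closure(moved)
    return _goto_cache[key]

# The initial state is the only one built eagerly; all others appear on demand.
start = closure({("S'", ('S',), 0)})
```

A driver would call goto() as it shifts, so only the states reachable on actual inputs get built --- at the cost, noted below, of not knowing ahead of time whether the automaton is conflict-free.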

I have never seen this kind of implementation either. I can see one reason for that: when we construct the deterministic handle-finding automaton, we also learn whether the LR parser is deterministic, and where the conflicts are.

One could perhaps use an LR(1) testing algorithm, which is supposed to be much less costly than complete parser generation, to check whether the determinization will succeed---I do not know enough about these tests to tell whether they also report where the conflicts occur---and then do the NFA-to-DFA translation at parse time, as you suggest.

A second reason is simply that, for purely parsing purposes---that is, ignoring the advantages you already mentioned, like the correct-prefix property and lookahead---we do not need the full-blown LR(1) automaton; its compressed form works just fine, and even the weaker LALR(1) parser is good enough in most cases. The space constraints of the days when LR parsing was invented did not leave much choice in the matter.

As for Bison, I think it should at some point in the future have an LR(1)
feature, so that one can at least experiment with it.

Indeed, you now have the choice.

--
Regards,

  Sylvain





