help-bison
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Non-greedy wildcard possible? (Long)


From: Frank Heckenbach
Subject: Re: Non-greedy wildcard possible? (Long)
Date: Mon, 17 May 2004 19:56:39 +0200
User-agent: semail 20040101

Laurence Finston wrote:

> > Sure, that's not a problem. What it means semantically isn't fixed,
> > though. (Just think of binary vs. unary minus, for example.)
> 
> Binary vs. unary plus and minus are context-dependencies.  In order to make
> them work in Bison, it's necessary to use the `%prec' feature, as described in
> the manual.

No, it's not:

- You can write a LALR grammar containing binary and unary +/-
  without `%prec' easily.

- This still holds if you use precedences in the grammar (say, for
  * vs. +/-).

- Only if binary and unary +/- have different precedences (which is
  so in certain computer languages such as C, not in other languages
  such as Pascal or in mathematics), you *can* use `%prec' for this
  purpose.

- However, you can also use other means, reordering the grammar
  rules and nonterminals, so you don't have to specify precedences
  at all.

- In any case, all these are context-free grammars.
  Context-sensitive grammars are a different and ugly beast
  altogether.

> I've found that it doesn't really pay to use Flex when one is using Bison,
> since in this case, the scanner mostly just returns tokens to yyparse().  

Uhm, well, that's the purpose of a scanner, isn't it?

To me the main advantage of (f)lex is the ability to match several
regexes in parallel. If done one-by-one using a simple regex
facility, it would be much slower. Of course, if R1, R2 ... Rn are
your token regexes, you could match `(R1)|(R2)|...|(Rn)' and check
the parenthesized expressions (\1, \2, ...) to find out which one
matched. That's roughly what (f)lex does internally.

> > The documents I'm trying to parse can certainly be described
> > with context-free grammars (I've tried to explain how, but I guess I
> > haven't been quite clear). The problem is either (1) analyzing the
> > input-grammar in order to produce a wildcard that doesn't contain any
> > relevant terminating tokens, or (2) making Bison handle an ambiguous
> > grammar.
>
> An ambiguous context-free grammar is invalid.

Not at all. General CFGs can be vastly ambiguous. In LALR parsing
you have to deal with ambiguities. GLR parsing, OTOH, can deal with
arbitrary CFGs, though at the cost of possibly exponential time and
space if the ambiguities are not localized.

Frank

-- 
Frank Heckenbach, address@hidden
http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)




reply via email to

[Prev in Thread] Current Thread [Next in Thread]