Re: Non-greedy wildcard possible? (Long)

help-bison

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Non-greedy wildcard possible? (Long)

From:	Magnus Lie Hetland
Subject:	Re: Non-greedy wildcard possible? (Long)
Date:	Wed, 19 May 2004 19:11:31 +0200
User-agent:	Mutt/1.4i

Frank Heckenbach <address@hidden>:
>
[snip]
> Of course, in a CFG grammar, you could define rule endings in CFG
> terms even (say, markup is ended by a correctly formed arithmetic
> expression, while a partial expression is considered as plain text
> -- which is probably completely crazy ;-).

Yes, that's my original request (hence the subject of this thread) but
it seemed quite hard to get people to understand what I meant :-]

Now I've basically abandoned this (I think) in favor of the
LL(1) wildcard thing.

> In this case I suppose dprec is your only hope (parse an expression
> and a sequence of tokens in parallel, give the expression dynamic
> precedence, so the sequence of tokens applies only if the expression
> fails to parse),

That's what I thought -- I just couldn't get it to work, which is why
I asked on this list (and haven't really gotten any replies about that
yet ;)

> but if it can be avoided, I'd rather stay away from it. (I don't
> think such languages would be very intuitive to use either ...)

No, I think you're right.

> For wildcards (i.e., regexps AIUI), you could at least in theory
> construct the negated wildcard (since regexps are closed under
> complement, unlike CFGs) for the non-terminator, in order to parse
> unambiguously (i.e., without dprec), but I'm not sure you'd want to
> do that (NFA, DFA, regex, CFG), unless they're simple enough ...

Simply searching for any of the legal terminators works well enough.
The problem, then, is analyzing the CFG to discover the legal
terminators; and as long as that is done statically, it seems the set
of legal terminators may be overestimated.

[snip]
> > That would be up to the user -- who would quite likely also have to
> > specify some way of breaking the ambiguity. My current parser is an
> > LL(1)-like parser that tries things top-to-bottom, making things
> > defined earlier in the grammar trump those defined later.
> 
> With some luck you can translate this into dprec rules, but since
> this would be a real ambiguity, this is the best you can expect.

I guess you're right.

[snip]
> If this works, fine. If, however (I don't know if the grammars
> really get *that* strange, but from your examples so far, they might
> ;-)

Hehe.

> whether a block is a heading (and not, say, a list item) might
> depend on the following block, and parsing is different for
> headings, this decomposition might not work.

True. But I could, for example, use the presence of an initial number
to classify a block as a "numbered block", and then have a CFG further
classify it as a header or a list item, all without going into a  CFG
parse of the block contents.

It seems that this kind of approach might reduce the load on the
parser enough that a simple hand-coded LL(1)-like parser generator
would be fast enough (which has really been my problem).

In order to be more robust in the face of user error and non-LL(1)
features, I've even started thinking about simply using backtracking.
If the grammar is *mostly* LL(1) that shouldn't mean exponential
running times any more than the simple GLR implementation, I suppose?
It would, at least, make the implementation that much simpler -- "when
in doubt, use brute force" :]

[snip]
> I don't know about this one. Your description looks *a bit* like
> bison's error recovery. If you get to the sematic values of the
> skipped tokens, it may help, but I don't really know ...

I think GLR* is a bit more than simple error recover -- it even
includes a beam search heuristic for searching for the parse which
skips the least number of tokens.

But I don't think it's realistic for my purposes anyway (it's not even
really deterministic -- at least from a practical point of view -- it
seems).

> Frank

-- 
Magnus Lie Hetland              "Wake up!"  - Rage Against The Machine
http://hetland.org              "Shut up!"  - Linkin Park

[Prev in Thread]

Current Thread

[Next in Thread]

Re: Non-greedy wildcard possible? (Long), (continued)
- Re: Non-greedy wildcard possible? (Long), Hans Aberg, 2004/05/16
  - Re: Non-greedy wildcard possible? (Long), Magnus Lie Hetland, 2004/05/16
    - Re: Non-greedy wildcard possible? (Long), Hans Aberg, 2004/05/17
    - Re: Non-greedy wildcard possible? (Long), Magnus Lie Hetland, 2004/05/17
    - Re: Non-greedy wildcard possible? (Long), Hans Aberg, 2004/05/18
    - Re: Non-greedy wildcard possible? (Long), Magnus Lie Hetland, 2004/05/18

Prev by Date: Re: Non-greedy wildcard possible? (Long)
Next by Date: Re: Non-greedy wildcard possible? (Long)
Previous by thread: Re: Non-greedy wildcard possible? (Long)
Next by thread: Re: Non-greedy wildcard possible? (Long)
Index(es):
- Date
- Thread