guile-devel

Re: GNU Guile PEG-parser


From: Noah Lavine
Subject: Re: GNU Guile PEG-parser
Date: Thu, 9 Feb 2012 10:42:37 -0500

Hello,

> I've actually found no PEG library that has a string syntax for the
> equivalent of ignore. I'm guessing most people are satisfied with just
> specifying another nonterminal and matching that one, probably because it is
> seen as less ugly than extending the formal definition of PEG. But I
> really think we could get a cleaner PEG definition of our parser if we were
> able to ignore text that isn't needed or gets in the way while using
> string patterns.

That makes sense. I'm a bit surprised that you find string patterns
easier than S-expression patterns, because I find it the other way
around, but different people like different things. I think we could
add some string syntax for ignore if you wanted it, although other
people on the list should chime in.

> It's actually exactly Python I'm thinking about. We are currently writing a
> preprocessor that will put #{ and #} before and after each block, but I was
> hoping that there exists a cleaner solution using the power of PEG instead
> of basic string manipulation. If you could help in any way, shape, or form,
> that would be greatly appreciated. Even just suggesting which parts of the PEG
> internals to look at would be really useful.

After thinking about it more, you have two choices.

The easiest thing would be to parse each line (or extended line, if it
ends with "\") with a PEG parser, and use your own logic for the
blocks. Your parser would have two steps for each line:

1. Get the indent from the beginning of a line
2. Parse the rest of the line with a PEG parser

Then you would take the lines returned by the PEG parser and combine
them into a data structure yourself, using the Python line-combining
rules. This is probably your best choice.
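
The two steps above, plus the block-combining logic, can be sketched
roughly as follows. This is a minimal illustration in Python rather than
Guile, and every name in it (Node, split_indent, parse_line, build_tree)
is hypothetical. In particular, parse_line is only a stand-in for the
real PEG parser, and the sketch assumes space-only indentation and does
not handle "\" continuation lines or inconsistent dedents.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    content: object                      # result of parsing one line
    children: list = field(default_factory=list)


def split_indent(line):
    """Step 1: return (indent width, rest of the line). Spaces only."""
    stripped = line.lstrip(" ")
    return len(line) - len(stripped), stripped


def parse_line(text):
    """Step 2: placeholder for the PEG parser; it just splits on whitespace."""
    return text.split()


def build_tree(source):
    """Combine parsed lines into nested blocks using an indent stack."""
    root = Node(content=None)
    stack = [(-1, root)]                 # (indent, node); root sits below any real indent
    for raw in source.splitlines():
        if not raw.strip():
            continue                     # skip blank lines
        indent, rest = split_indent(raw)
        node = Node(parse_line(rest))
        # Pop finished blocks until the top of the stack is this line's parent.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((indent, node))
    return root
```

The stack holds the chain of currently open blocks, so a line's parent is
always the nearest earlier line with a strictly smaller indent.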

Your second choice depends on the fact that PEG parsers are just
functions that take certain arguments and return certain values.
You can write a function like that yourself and use it just like a PEG
nonterminal in your grammar. When I was working on PEG, I actually
thought that it would be nice to make this interface public so that
different parser generators could interoperate, but I never did it.
It's all documented in the PEG Internals section of the manual,
though. However, I'd recommend against this just because I think the
interface is not as good as it should be right now, so I'd probably
want to change it in the future, which would make your code stop
working. (Although if this is a one-time thing, then you don't need to
care about that.)
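
To illustrate the general idea of parsers as plain functions, here is a
small sketch in Python. It does not reproduce Guile's internal PEG
interface. The convention used here, (text, pos) in and either
(new_pos, value) or None out, is an assumption made for the example, and
literal, sequence, and indent are hypothetical names. The point is only
that a hand-written function following the same convention can sit in a
grammar next to generated parsers.

```python
def literal(expected):
    """Build a parser that matches one fixed string."""
    def parse(text, pos):
        if text.startswith(expected, pos):
            return pos + len(expected), expected
        return None                      # failure
    return parse


def sequence(*parsers):
    """Build a parser that runs each parser in order; all must succeed."""
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            pos, value = result
            values.append(value)
        return pos, values
    return parse


def indent(text, pos):
    """Hand-written parser: consume leading spaces and return their count."""
    end = pos
    while end < len(text) and text[end] == " ":
        end += 1
    return end, end - pos


# The hand-written function is used exactly like any other parser.
line = sequence(indent, literal("pass"))
```

Calling line("    pass", 0) succeeds with the indent width as the first
value, while line("    break", 0) returns None because the literal does
not match.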

I suppose you also have a third choice, which is to change the
internal interface yourself, then let us make it public, then use it
that way. That's the most elegant solution, but it's more work for
you. I wouldn't recommend it unless the first option is hard and you
want this to last for a long time.

I hope this helps,
Noah


