Re: How to make Bison disregard tokens?


From: Edsko de Vries
Subject: Re: How to make Bison disregard tokens?
Date: Fri, 2 Mar 2007 12:39:55 +0000
User-agent: Mutt/1.5.9i

On Fri, Mar 02, 2007 at 01:17:04PM +0100, address@hidden wrote:
> On Friday 02 March 2007 13:01, Edsko de Vries wrote:
> 
> > Well, no, that's the point. Blank lines should be recorded as NOPs where
> > possible, so that we know where they are and unparse them. 
> Can you reveal some details regarding what you are trying to do ?

I am working on the front end of phc, an open source framework for
working on PHP scripts (and hopefully ultimately compiling PHP to native
machine code; we're getting there).

The front end of phc builds an abstract representation of PHP scripts.
The structure of the AST is described by the phc grammar
(http://www.phpcompiler.org/doc/grammar.html). We have defined a
transformation and visitor API, so that it is easy to write programs that
operate on PHP scripts in some way. 

One such program is an unparser that takes a PHP script in our abstract
representation, and dumps it back to PHP syntax (a pretty printer). This
is useful because after transforming your PHP script (for example, doing
some refactoring), you may want to output it again and run it.

The phc unparser tries to make this output as readable as possible. For
example, it generates indentation, etc. Since we want this unparser to
be usable, we have some stricter-than-usual constraints on the lexer
and parser. For example, comments cannot simply be thrown away; nobody
would take a refactoring tool for PHP seriously if it throws away all
your comments! The lexer and parser therefore work together to associate
comments with nodes in the tree. 
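
As a rough illustration of that idea (a minimal sketch, not phc's actual
mechanism, which is not described here), the scanner can buffer the
comments it sees and hand them to the next node that gets built. The
names Node, parse_with_comments and unparse_with_comments are
hypothetical, and the sketch assumes one statement per line and only
"//" comments on lines of their own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A statement node that carries the comments preceding it."""
    text: str
    comments: list = field(default_factory=list)


def parse_with_comments(source):
    """Attach each run of '//' comment lines to the statement that follows."""
    pending = []  # comments seen since the last statement
    nodes = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored in this sketch
        if stripped.startswith("//"):
            pending.append(stripped)
        else:
            nodes.append(Node(stripped, pending))
            pending = []
    # Note: comments after the last statement are dropped by this sketch.
    return nodes


def unparse_with_comments(nodes):
    """Print every node preceded by the comments attached to it."""
    lines = []
    for node in nodes:
        lines.extend(node.comments)
        lines.append(node.text)
    return "\n".join(lines)
```

A real front end would also have to decide what to do with trailing
comments and with comments in the middle of an expression; the sketch
leaves both cases out.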

Another issue is whitespace. Most whitespace is ignored by the lexer, and
the parser therefore cannot reproduce it. For example, if you write

$a = $b   .     $c;

it will be unparsed as

$a = $b . $c;
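
The effect can be reproduced with a toy tokenizer (a sketch under
simplifying assumptions, not the phc lexer): once the statement has been
reduced to tokens, the original spacing is gone and the printer can only
apply its own convention. The pattern TOKEN and the function normalize
are hypothetical names, and only variables and the operators "=", "."
and ";" are handled.

```python
import re

# Variables such as $a, plus the three punctuation tokens used in the example.
TOKEN = re.compile(r"\$\w+|[=.;]")


def normalize(statement):
    """Tokenize, discarding whitespace, then print with single spaces."""
    out = ""
    for token in TOKEN.findall(statement):
        if token == ";":
            out += token  # no space before the terminator
        else:
            out += (" " if out else "") + token
    return out
```

Both `normalize("$a = $b   .     $c;")` and `normalize("$a=$b.$c;")`
produce the same string, because the token stream is identical in the
two cases.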

We are planning to write an unparser that is able to copy the user's
layout exactly, but for the moment, this isn't too big a deal. However,
it is not acceptable for the unparser to throw away all empty lines as
well. For example, the user might write

function f()
{
        prologue
        prologue
        prologue

        function body
        function body
        function body

        epilogue
        epilogue
        epilogue
}

where the programmer explicitly inserts blank lines to indicate
semantic units within the function definition. We should be able to
reproduce these blank lines.

Now, one solution is to record these blank lines as "NOPs" (no-ops; null
statements) and unparse null statements as blank lines. Hence my
question. I was hoping it would be easy to do this in Bison, but if that
isn't the case, I should look for a different solution.
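
To make the idea concrete independently of Bison, here is a minimal
sketch of the NOP approach. It assumes a line-oriented input with one
statement per line; the names Stmt, Nop, parse_statements and unparse
are hypothetical and are not part of phc.

```python
from dataclasses import dataclass


@dataclass
class Stmt:
    """An ordinary statement, stored without its original indentation."""
    text: str


@dataclass
class Nop:
    """A null statement that stands for one blank line in the source."""


def parse_statements(source):
    """Turn every non-blank line into a Stmt and every blank line into a Nop."""
    nodes = []
    for line in source.splitlines():
        stripped = line.strip()
        nodes.append(Stmt(stripped) if stripped else Nop())
    return nodes


def unparse(nodes, indent="        "):
    """Print statements with uniform indentation and Nops as empty lines."""
    lines = []
    for node in nodes:
        lines.append("" if isinstance(node, Nop) else indent + node.text)
    return "\n".join(lines)
```

Run on the body of the function above, this keeps the two blank lines
that separate prologue, body and epilogue, while the indentation is
still generated by the unparser rather than copied from the input.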

Edsko



