help-bison

Re: How to make Bison disregard tokens?


From: Edsko de Vries
Subject: Re: How to make Bison disregard tokens?
Date: Fri, 2 Mar 2007 14:04:03 +0000
User-agent: Mutt/1.5.9i

> To me this seems like  the same problem as preserving comments in a
> parse-process-generate system (which is a pain in the ass).

That's exactly what it is :) 

> You could have the lexer tally empty lines and set that count as a
> property of the semantic value of "real" tokens.
>
> {snip}

Yes, I had been thinking of doing that, and it would be relatively
straightforward to do. It does not combine well with comments, however.
phc uses a heuristic to decide which node in the AST a comment should be
associated with. Basically, if a comment starts a line, it is associated
with the token that follows, but if the comment is on the same line as
another token and after a semicolon, it is associated with the previous
token. An exception to the first rule is that if a comment starts a line
and nothing follows it in the enclosing block, it is associated with a
NOP instruction instead, so that in

if(...)
{
        blah
        // TODO: must do something else here
}

f();

the comment is not associated with the call to f, but with a NOP
instruction which is the last instruction in the body of the if. That
also works if the body of the if is completely empty, or if the comment
is the very last thing in the file.
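A minimal sketch of the heuristic described above, in Python rather than
phc's own code. Everything in it is an assumption made for illustration:
the Node class, the item kinds ('stmt', 'line_comment',
'trailing_comment', 'end') and the flat, non-nested statement list are
hypothetical and not taken from phc.

```python
from dataclasses import dataclass, field

NOP = "<NOP>"  # hypothetical marker for a no-op statement


@dataclass
class Node:
    text: str
    comments: list = field(default_factory=list)


def attach_comments(items):
    """Attach comments to statements.

    items is a list of (kind, text) pairs in source order. kind is one of:
      'stmt'             a statement
      'line_comment'     a comment that starts a line
      'trailing_comment' a comment after a statement on the same line
      'end'              the end of the enclosing block or of the file
    Nesting is ignored; the result is a flat list of nodes.
    """
    nodes = []
    pending = []  # line comments waiting for the statement that follows
    for kind, text in items:
        if kind == "stmt":
            # Line comments go with the statement that follows them.
            nodes.append(Node(text, pending))
            pending = []
        elif kind == "trailing_comment":
            if nodes:
                # Same-line comments go with the previous statement.
                nodes[-1].comments.append(text)
            else:
                pending.append(text)
        elif kind == "line_comment":
            pending.append(text)
        elif kind == "end" and pending:
            # Nothing follows in this block: hang the comments on a NOP.
            nodes.append(Node(NOP, pending))
            pending = []
    return nodes


example = [
    ("stmt", "blah"),
    ("line_comment", "// TODO: must do something else here"),
    ("end", "}"),
    ("stmt", "f();"),
]
for node in attach_comments(example):
    print(node.text, node.comments)
```

In this sketch the TODO comment ends up on a NOP placed after "blah", and
the call to f carries no comments.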

Now combine this with the empty-lines problem. Suppose we have something
like

function g()
{
        code 1
        code 2
        code 3

        // TODO: I need to do something here

        code 4
        code 5
        code 6
}

At the moment, what happens is that the comment gets associated with
"code 4", and the blank lines are ignored entirely:

function g()
{
        code 1
        code 2
        code 3
        // TODO: I need to do something here
        code 4
        code 5
        code 6
}

If we do what you suggest, then we mark "code 3" as having three
(including the line of the comment) blank lines after it (or,
equivalently, code 4 as having three blank lines before it), but we
still associate the comment with "code 4":

function g()
{
        code 1
        code 2
        code 3

        
        // TODO: I need to do something here
        code 4
        code 5
        code 6
}
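The effect of the blank-line tally can be reproduced with a small Python
sketch. It is only an illustration under assumptions: the function name
and the line-based "lexer" are hypothetical, and unlike the count
described above it tallies only truly blank lines (the comment line is
stored separately as a comment on the following statement).

```python
def unparse_with_blank_counts(lines):
    """Round-trip source lines, storing blank lines as a count on the
    statement that follows them. Comments that start a line are also
    attached to the following statement."""
    nodes = []     # (blank_lines_before, comments, statement)
    blanks = 0
    comments = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            blanks += 1
        elif stripped.startswith("//"):
            comments.append(line)
        else:
            nodes.append((blanks, comments, line))
            blanks, comments = 0, []
    # Blank lines or comments after the last statement are dropped here;
    # a fuller version would need a NOP to carry them.

    out = []
    for blank_before, node_comments, stmt in nodes:
        out.extend([""] * blank_before)  # all blank lines come first
        out.extend(node_comments)        # then the comments
        out.append(stmt)
    return out


source = [
    "code 3",
    "",
    "// TODO: I need to do something here",
    "",
    "code 4",
]
print("\n".join(unparse_with_blank_counts(source)))
```

Both blank lines come out before the comment, because the count says
nothing about where the comment sat relative to them.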

If we insert a NOP instead for blank lines, then the comment gets
associated with the NOP, and we get exactly what the user wrote. 
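A rough Python sketch of the difference, assuming a hypothetical
line-based front end (the roundtrip function and its blank_as_nop flag
are invented for illustration and are not part of phc). With the flag
off, blank lines are discarded as they are at the moment; with it on,
each blank line becomes a NOP statement that unparses as an empty line.

```python
def roundtrip(lines, blank_as_nop):
    """Parse lines into (comments, statement) pairs, then unparse them.

    A comment that starts a line is attached to the next statement.
    If blank_as_nop is true, a blank line counts as a NOP statement,
    so a preceding comment is attached to that NOP.
    """
    nodes = []
    pending = []  # comments waiting for the next statement
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("//"):
            pending.append(line)
        elif not stripped:
            if blank_as_nop:
                nodes.append((pending, ""))  # NOP, printed as an empty line
                pending = []
            # otherwise the blank line is ignored entirely
        else:
            nodes.append((pending, line))
            pending = []
    if pending:
        nodes.append((pending, None))  # trailing comments on a silent NOP

    out = []
    for comments, stmt in nodes:
        out.extend(comments)
        if stmt is not None:
            out.append(stmt)
    return out


source = [
    "code 3",
    "",
    "// TODO: I need to do something here",
    "",
    "code 4",
]
print(roundtrip(source, blank_as_nop=False))
print(roundtrip(source, blank_as_nop=True))
```

With blank_as_nop=False the comment moves directly above "code 4" and
the blank lines vanish; with blank_as_nop=True the output equals the
input.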

Incidentally, the same problem also arises with Henrik's location
tracking solution.

Edsko



