bison-patches

Re: #line 0.000000oo


From: Akim Demaille
Subject: Re: #line 0.000000oo
Date: 14 Nov 2002 11:12:31 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Honest Recruiter)

| > From: Akim Demaille <address@hidden>
| > Date: 13 Nov 2002 10:57:25 +0100
| > 
| > exp: exp '+' exp
| > {
| >    /* Something. */
| > #line 123 "foo.y"
| >    $$ = $1 + $4;
| > }
| > 
| > should complain in foo.y:123.  But it is clear that the #line must be
| > forwarded inside the action.
| 
| If that's the case, Bison is currently mishandling #line.

It is definitely not complete!

| E.g., for this:
| 
|   %%
|   exp: '+'
|   #line 1000 "foo.y"
|   {
|      $$ = $1 + $4;
|   #line 2000 "foo.y"
|      $$ = $1 + $4;
|   };
| 
| the Bison output is currently this:
| 
|   foo.y:1000.1-1001.15: integer out of range: `$4'
|   foo.y:1000.1-1003.15: integer out of range: `$4'
| 
| so the second "#line" directive did not affect the Bison diagnostics.

Yes, I know :)  For now #line is recognized in the INITIAL start
condition only.
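
To make the expected behaviour concrete, here is a rough sketch (in
Python, not Bison's actual scanner) of location tracking that honours
#line everywhere, including inside actions.  The function name
`reported_locations` and the regular expression are my own assumptions
for illustration; the directive syntax is the rigid `#line INT STRING'
form discussed in this thread.

```python
import re

# Rigid form only: '#line INT "FILE"' at the start of a line (assumed syntax).
LINE_DIRECTIVE = re.compile(r'^#line (\d+) "(.*)"$')


def reported_locations(text, filename):
    """Return one (file, line) pair per physical line of `text`.

    A '#line N "F"' directive makes the *next* physical line be
    reported as line N of file F, whatever context it appears in.
    """
    locations = []
    current_file, current_line = filename, 1
    for physical in text.split("\n"):
        locations.append((current_file, current_line))
        match = LINE_DIRECTIVE.match(physical)
        if match:
            # Renumber starting from the line after the directive.
            current_line = int(match.group(1))
            current_file = match.group(2)
        else:
            current_line += 1
    return locations


sample = "\n".join([
    "%%",
    "exp: '+'",
    '#line 1000 "foo.y"',
    "{",
    "   $$ = $1 + $4;",
    '#line 2000 "foo.y"',
    "   $$ = $1 + $4;",
    "};",
])
locs = reported_locations(sample, "foo.y")
```

With this scheme the first `$4' is reported on line 1001 and the second
on line 2000, i.e. the second directive does affect the diagnostic.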

| > Now, you seem to suggest that once the '}' closed, we should restore
| > the previous location context,
| 
| No, I wasn't proposing that.  All I'm saying is that either #line
| should consistently affect Bison diagnostics, or it should
| consistently be ignored by Bison (other than passing it through to the
| C source code).

So we agree then.


| But in that case, we should use "%line" uniformly for Yacc line
| numbers, to go along with the future "%if" and "%endif".  This makes
| it clearer to users that there are two preprocessors here.  We should
| just pass #line through in C code, as we do now, and make it an error
| to use #line outside of C code (just as it's an error to use #if
| outside of C code).  The "%line" notation makes a lot more sense to me
| than the current approach, which uses the same #line notation to mean
| two quite different things.

Well, you have me *almost* convinced, but not completely.  I think it
would send the wrong message to the user to suggest that %line behaves
like a %directive.  It does not.  It is quite orthogonal.  You could
just as well write

%token FOO %line 123 "123" BAR

i.e., %line is really immaterial to the grammar itself.  It's another
layer, and as such, I tend to think that it is a good thing to move it
elsewhere: #.

As for %if/%endif, it is still very unclear to me whether it's a
scanner issue or a grammar issue.  Vadim Maslov says it should be
handled in the scanner, i.e., as transparently as #line.  I don't know.
It's very different.  For one thing, it should see the %define too.


| > I'm really against supporting tortured syntax which only concrete
| > use will be the test suite.
| 
| It's OK to implement a subset of the C rules, so long as we document
| whatever we implement.  But the current subset is pretty restrictive,
| and it will make it a bit of a pain for humans to use %line. 

I'm really willing to be convinced, I am not trying to force my ideas
here.  But I do think that ``the current subset is pretty restrictive,
and it will make it a bit of a pain for humans to use %line.'' is an
overstatement.  I must be missing something, but it seems to me that a
very rigid `#line INT STRING' is not that hard to follow.


| For example, currently trailing white space is not allowed in #line;
| nor is the short form supported.
| 
| If we implement %line ourselves, how about if we use the following
| subset of the C rules instead:
| 
|   * Comments are not allowed.
|   * Newline (including backslash-newline) is not allowed.
|   * There is no preprocessor-like expansion (e.g., you can't say
|     `%line __STDC__ "foo"' as you can in C).

I completely fail to see how the mere fact of using the same keyword as
CPP forces us to adopt its arcane rules.  I am very happy with your
list, but I don't understand why it could not hold for `#line' too.


| Another way of putting it is that we alter Bison as follows:
| 
|   * Arbitrary white space is allowed between tokens.
|   * The file name must have properly escaped quotes.
|   * The file name is optional, and defaults to the previous file name.
| 
| In other words, the regular expression changes from this:
| 
|   ^"#line "{int}" \"".*"\"\n"
| 
| to this:
| 
|   ^{w}*"%line"{w}+{int}{w}*("\""(\\.|[^\n\"\\])*"\""{w}*)?"\n"
| 
| where {w} is any horizontal white space character ([ \f\t\v]).
| 
| If we play our cards right, we won't need to parse the escape
| sequences in that string: just print them as-is in error messages
| (without quoting them), and escape M4 characters like [ and ] when
| sending the string via M4 to C code.

:) :) :)

I'm fine with this implementation.  But I'm still reluctant about %line
as opposed to #line.  They do not belong to the same world.
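
To see what the relaxed pattern buys us, here is a small sketch that
transcribes the two regular expressions quoted above into Python's `re'
syntax.  It is only an approximation of the Flex patterns: I assume
{int} means [0-9]+, and the names OLD, NEW and W are mine.

```python
import re

# {w}: any horizontal white space character, as defined in the quoted mail.
W = r'[ \f\t\v]'

# Current rigid form:  ^"#line "{int}" \"".*"\"\n"
OLD = re.compile(r'^#line ([0-9]+) "(.*)"\n')

# Proposed form:
#   ^{w}*"%line"{w}+{int}{w}*("\""(\\.|[^\n\"\\])*"\""{w}*)?"\n"
NEW = re.compile(
    r'^' + W + r'*%line' + W + r'+([0-9]+)' + W
    + r'*(?:"((?:\\.|[^\n"\\])*)"' + W + r'*)?\n'
)

# The rigid form rejects trailing white space and the short form.
old_plain = OLD.match('#line 123 "foo.y"\n')
old_trailing = OLD.match('#line 123 "foo.y" \n')
old_short = OLD.match('#line 123\n')

# The proposed form accepts both, plus escaped quotes in the file name.
new_trailing = NEW.match('%line 123 "foo.y" \n')
new_short = NEW.match('  %line   123\n')
new_escaped = NEW.match('%line 1 "a\\"b"\n')

# An unescaped quote inside the file name is still rejected.
new_bad_quote = NEW.match('%line 1 "a"b"\n')
```

Note that the file name is captured with its escape sequences left
as-is, which matches the idea of printing it unparsed in error messages.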



