help-bison

Re: RFC: enum instead of #define for tokens


From: Akim Demaille
Subject: Re: RFC: enum instead of #define for tokens
Date: 03 Apr 2002 19:13:05 +0200
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Common Lisp)

| > From: Akim Demaille <address@hidden>
| > Date: 03 Apr 2002 15:56:47 +0200
| 
| > It does look good, but would some people really use it?  Now that I
| > know we are bound by POSIX to #define, I'm tempted to leave it as is
| > for C, and provide something better only for C++.  What is your opinion?
| 
| The main advantage of having an enum is that the enum constants
| are visible to GDB, so that one can type, for example:
| 
|   (gdb) p var = FOO
| 
| and assign FOO to a variable.  This does not work with #defines.

I'm aware of this point; it was the main motivation for enums.
But the last point will soon no longer hold: I have seen patches
from Jim Blandy that enable macro debugging in gdb.  It's just a matter
of time.

        http://gcc.gnu.org/ml/gcc/2002-03/msg00914.html
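
For readers comparing the two styles, here is a minimal sketch in C.  All
names and the value 257 are made up for illustration; this is not Bison's
actual output.  The macro disappears during preprocessing, while the enum
constant stays visible to the debugger as a symbol.

```c
/* Macro style, which the thread says POSIX requires: the preprocessor
   replaces the name with 257 before the compiler sees it. */
#define TOK_FOO_MACRO 257

/* Enum style under discussion: same values, but the names survive
   as enumerators in the debug information. */
enum token_kind_sketch
{
  TOK_FOO = 257,
  TOK_BAR = 258
};

/* Hypothetical scanner stub.  The interface still returns int, so
   existing callers are unaffected by the choice of style. */
int
next_token_sketch (void)
{
  return TOK_FOO;
}
```

With the enum in place, an expression such as `p var = TOK_FOO` has a
symbol to resolve in the debugger.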

| It would be even nicer if the enum type were used when possible, with
| internal variables for example, of course continuing to use 'int' for
| the existing interfaces.  That way, when you use GDB to print an
| internal variable, it will print as FOO and not as 257.
| 
| This nicety is not essential, so I don't think it should be put into
| the maintenance branch.  But it would be useful to me when debugging a
| parser.  Also, the nicety should be used only when the compiler is not
| overly pedantic about int versus enum, so it should be controlled by a
| preprocessing symbol to disable it.  I would enable it by default if
| __STDC__ is defined.
| 
| A nice little project for someone who has the time....

Actually, given the M4 backend, it's dead easy.
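
A rough sketch of what the quoted suggestion could look like in the
generated C.  The symbol YY_SKETCH_NO_TOKEN_ENUM and every other name
here are hypothetical, chosen only for this example.  The existing
interface keeps returning int; only the internal variable takes the
enum type, and only when the compiler is expected to accept it.

```c
enum sketch_token
{
  SK_FOO = 257,
  SK_BAR = 258
};

/* Hypothetical switch: use the enum type for internal variables when
   __STDC__ is defined, unless the user opts out. */
#if defined __STDC__ && !defined YY_SKETCH_NO_TOKEN_ENUM
typedef enum sketch_token sketch_token_t;
#else
typedef int sketch_token_t;
#endif

/* Stand-in for the scanner: the public interface stays int. */
int
yylex_sketch (void)
{
  return SK_BAR;
}

/* Stand-in for one parser step.  With the enum type in use, a debugger
   can print `lookahead' by name instead of as a bare number. */
int
parse_step_sketch (void)
{
  sketch_token_t lookahead = (sketch_token_t) yylex_sketch ();
  return lookahead == SK_BAR;
}
```

Defining YY_SKETCH_NO_TOKEN_ENUM before this code falls back to plain
int everywhere, which is the escape hatch for pedantic compilers.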

| Two more topics are related to this one.
| 
| 1.  Currently Bison consistently uses 'short' for some variables, even
| though the values might not fit into 'short'.  For example, yyr1 is
| declared to be of type short, but it should be the enum that we're
| talking about, in case the user declares more than 32768 - 256 - 1
| tokens.  There are several other instances of this problem not related
| to the enum; for example, line numbers do not always fit into 'int' on
| common 32-bit hosts.  It will be a pain to fix all the problems, I'm
| afraid.

At home, I'm working on moving the engine from using shorts everywhere
as indices into arrays to using actual pointers.  As a result, the
engine is becoming more and more explicitly typed.  Part of the aim of
this partial rewrite is to expose the natural type that each of the
various variables should have.

In the end, I expect to be able to move Bison to bigger grammars much
more easily.

In other words, it's in the pipeline.
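
The bound quoted above, 32768 - 256 - 1, can be reproduced with a small
sketch.  The function names are hypothetical, and the arithmetic simply
restates the quoted formula in terms of SHRT_MAX; it makes no claim
about how Bison numbers its symbols internally.

```c
#include <limits.h>

/* Return 1 when n can be stored in a short without truncation. */
int
fits_in_short_sketch (long n)
{
  return SHRT_MIN <= n && n <= SHRT_MAX;
}

/* The quoted bound, with 32768 written as SHRT_MAX + 1.
   This is 32511 when short is 16 bits wide. */
long
quoted_token_bound_sketch (void)
{
  return (long) SHRT_MAX + 1 - 256 - 1;
}
```

A table generator could run a check like fits_in_short_sketch on each
entry and pick a wider element type when it fails, instead of always
emitting short.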


| 2.  Currently Bison assumes 8-bit bytes (i.e. that UCHAR_MAX is 255).
| It also assumes that the 8-bit character encoding is the same for the
| invocation of 'bison' as it is for the invocation of 'cc', but this is
| not necessarily true when people run bison on an ASCII host and then
| use cc on an EBCDIC host.  I don't think these topics are worth our
| time addressing (unless we find a gung-ho volunteer for EBCDIC or
| PDP-10 ports :-) but they should probably be documented somewhere.
| 
| Whew!  I'd better stop discursing....

My position, probably not very nice, is that this should not happen.
Passing chars (wchars) as tokens is wrong.  It was a nice little dirty
trick to be able to `return '+'' in the scanner, and use '+' too in
the parser, but that's not sane.  The parser should never see
characters.

As a result, there is no such issue as a Unicode-compliant parser.
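
To make that position concrete, here is a minimal sketch of a scanner
helper that turns characters into named tokens, so that the parser only
ever sees token numbers.  The enum, its values, and the function name
are invented for this example.

```c
/* Hypothetical named tokens; the parser would refer to these names
   and never to character literals. */
enum op_token_sketch
{
  OP_PLUS = 257,
  OP_MINUS = 258,
  OP_UNKNOWN = 259
};

/* The character literals live only in the scanner, so their numeric
   value is whatever the compiling host uses, and it never leaks into
   the parser tables. */
int
classify_char_sketch (int c)
{
  switch (c)
    {
    case '+':
      return OP_PLUS;
    case '-':
      return OP_MINUS;
    default:
      return OP_UNKNOWN;
    }
}
```

Under this scheme the character encoding matters only where the
scanner is compiled, not where bison is run.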


| > I think I was wrong when I said that the trick you implemented for
| > 1.3x was not to be installed in 1.5x.  It should probably be applied
| > there too.
| 
| Sorry, which trick was that?  I've installed so many lately.  Or
| perhaps you mean the cumulative set of tricks for C++?

I think only the latest was not installed: the one that removes the
ability to grow the stack when %union is not used.


