Re: Bison generating C++
From: Wayne Green
Subject: Re: Bison generating C++
Date: Wed, 18 Oct 2000 16:12:30 -0600
Howdy,
It's about time I jumped in. We have a C++ DLL (Microsloth environment) that
supports at least 8 different ASCII files implementing persistence for its
internal data structures. The ASCII files are meant to be read by users and are
subject to user editing. A grammar-based parser therefore lets us support
scientific notation for the contents (a typesafe interface for the user), helps
guarantee that input values are reasonable, and lets us tell users where and
what errors they made, reducing their frustration when fixing (m)any changes
they make.
I started with Sandstone's Visual Parse++, and found that it was broken. The
author still has not returned my calls or email. I moved on to MKS's Lex/Yacc
product and found that it did not fully support multiple parsers and scanners
in one program to the degree that I wanted and that it did not support C++
IOStreams -- a necessary condition for our program. IOStreams allows a
programmer to supply a string (stringstream), a file stream (ifstream) or a
network connection for interprocess communications.
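A minimal sketch of why that matters, assuming a hypothetical countTokens
function standing in for a real scanner: code written against std::istream
does not care whether the characters come from a string or a file.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a scanner: counts whitespace-separated tokens
// read from any std::istream.
int countTokens(std::istream& in) {
    int count = 0;
    std::string token;
    while (in >> token) ++count;
    return count;
}

// Usage: the same function accepts an in-memory string; a std::ifstream
// opened on a file could be passed in exactly the same way.
void demoCountTokens() {
    std::istringstream fromString("alpha 1.5e3 beta");
    assert(countTokens(fromString) == 3);
}
```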
It was fairly easy to change MKS to support IOStreams. I added a superclass to
the parser and scanner to allow a programmer to supply instance variables for
use in Lex/Yacc actions. I then derive a class from the parser/scanner that
provides a way of associating a scanner and tying it to an input stream. This
required hacking the prototypes and adding a little Perl-based post-processing
to achieve results. Extending the prototype interface does away with needing a
derived class at the program level.
Based on my experience, I think:
1) Bison needs an include file for the %token list, to be shared with the
scanner.
2) The scanner needs to "own" the token list, since it sources the tokens in
the first place -- it should define what they are.
3) A keyword management package needs to be supplied with the scanner (as a
toolkit item) to simplify the scanner. I use a reference to the keyword manager
that ties to its list of words.
4) Yacc needs to publish a .h file that only includes the results of the %union
typedef. I make this in my post-processing step and it clears up a circular
dependency between the yacc.h and lex.h output files as far as YYSTYPE
(i.e. prefixYYSTYPE) is concerned.
5) For MKS at least, the prefix mechanism was broken, requiring post-processing
to guarantee all the variables got the right names. The super-class needed
post-processing to fix up the prefix as well. There's not much to the
post-processor.
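As a rough illustration of points 1 to 3, here is a sketch of a token list
owned by the scanner plus a small keyword manager. Everything in it is
hypothetical: the Token enum, its values, the KeywordManager class, and the
choice of case-insensitive matching are assumptions made for the example, not
anything MKS or Bison provides.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical token codes the scanner would own and every grammar would
// include. The numeric values are illustrative only.
enum Token { TOK_IDENT = 258, TOK_FOO, TOK_BAR, TOK_BAZ };

// Hypothetical keyword manager: maps keywords (case-insensitively, an
// assumption) to token codes so scanner rules stay small.
class KeywordManager {
public:
    void add(const std::string& word, Token tok) { table_[lower(word)] = tok; }

    // Returns the keyword's token, or TOK_IDENT when the word is not a keyword.
    Token lookup(const std::string& word) const {
        auto it = table_.find(lower(word));
        return it == table_.end() ? TOK_IDENT : it->second;
    }

private:
    // Lower-cases a copy of the word so lookups ignore case.
    static std::string lower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    }

    std::map<std::string, Token> table_;
};
```

A scanner action for an identifier-like lexeme could then return
lookup(lexeme) instead of carrying one rule per keyword.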
I had started out with Bison, but found it too far behind to chance slipping my
client's schedule to bring it up to speed. MKS had advertised a solution that
was cheap enough. I still had about 20 hours of additional work to make MKS
work.
I decided to use MKS thinking that it was much better. Hacking on it has given
me the insights to tackle Bison to try for a closed and complete C++ toolkit.
I may be willing to take on a Bison upgrade. The changes would be to extend
Bison to handle C++ and to include the Microsoft platform, while not affecting
the current interface. This should let old users upgrade with no impact on
their build processes, while letting C++'ers get with the program. Microsoft
support would stop at a makefile for the package, a prototype makefile to build
projects with, and some muddled documentation (PDF and emacs info) about the
process.
I will conclude my work this week with MKS. I may move back to Sandstone if I
can somehow will Will to fix the dang bug (missing two entry points for
embedded tables). I think that LALR(n) grammars provide a safer approach
especially if I have to process a legacy input stream for which I cannot change
the token input order. Their GUI front end supports my client's maintenance
needs as future programmers inexperienced with a YACC approach will find
developing their scanners and parsers much easier with Sandstone. The real
down-side of Sandstone is that parser/scanner actions cannot be embedded in
the specification file, so the output files must be painstakingly edited after
any subsequent change to the parser/scanner. A friend's company threw Sandstone
out because of that. Bison/Flex mercifully keep the actions in the
specification file. Adapting Sandstone to IOStreams and STL is a big unknown
at this time.
Adding C++ to Bison should be trivial, except for those pesky global variables.
This will require gathering them up and turning them into member variables for
their classes. The super class is too easy to add to the mix. Getting on to the
newer generation of IOStreams and STL was straightforward as well.
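A minimal sketch of the global-to-member idea, assuming a hypothetical
SketchScanner class (not generated by any tool): state that lex traditionally
keeps in globals such as yyin and yytext becomes per-instance data, so several
scanners can be live at once.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical hand-written class; a generated scanner would be far larger.
class SketchScanner {
public:
    explicit SketchScanner(std::istream& in) : in_(in) {}

    // Reads the next whitespace-delimited lexeme; returns false at end of input.
    bool next() { return static_cast<bool>(in_ >> text_); }

    const std::string& text() const { return text_; }

private:
    std::istream& in_;   // takes the place of a global input handle like yyin
    std::string text_;   // takes the place of a global lexeme buffer like yytext
};
```

Because nothing is global, two SketchScanner objects reading different
streams can be advanced in any order without disturbing each other.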
Example time:
class quuxScanner : public quuxSuperScanner { ... };  // from lex
class barParser : public barSuperParser { ... };      // from yacc
class myDataRef { ... };  // to stuff with grammar'ed goodies.
myDataRef mydataRef;
{
    if (barParser(quuxScanner("mystream.txt"), mydataRef).parse() != 0)
        throw up("gag");
}
or different parts of same stream with separate grammars/scanners:
ifstream mystream("Mystream.txt", ios::in);
if (mystream.good())
{
    if (barParser(quuxScanner(mystream), mydataRef).parse() != 0)
        throw up("gag");
}
streampos where = mystream.tellg();
char charbuffer[256];
mystream >> setw(sizeof(charbuffer)) >> charbuffer;  // fake token peek; setw bounds the read
mystream.seekg(where);                               // rewind...(ugly)
if (strnicmp(charbuffer, "bazKey", sizeof(charbuffer)) == 0)
{
    if (bazParser(quuxScanner(mystream), mydataRef).parse() != 0)
        throw up("gag");
}
else if (strnicmp(charbuffer, "fooKey", sizeof(charbuffer)) == 0)
{
    if (fooParser(quuxScanner(mystream), mydataRef).parse() != 0)
        throw up("gag");
}
Heck, write a separate grammar to handle the switching between foos, bars, and
bazes.
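Short of a switching grammar, the peek-and-rewind step can at least be hidden
in a helper. This is only a sketch: peekWord is a hypothetical name, and it
assumes a seekable stream (a file or string stream, not a raw network
connection).

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the next whitespace-delimited word, then restores the read position
// so the parser chosen afterwards sees that keyword again.
std::string peekWord(std::istream& in) {
    std::istream::pos_type where = in.tellg();
    std::string word;
    in >> word;
    in.clear();       // drop eof/fail flags so the seek below is not refused
    in.seekg(where);  // rewind to where the peek started
    return word;
}
```

The caller can compare the returned word against "bazKey" or "fooKey" and
hand the untouched stream to the matching parser.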
Most of the time, I have found I can get by with one scanner for
many grammars. This makes the need to coordinate %token important.
As for a new name? Well, bison/flex/lex/YACC might just be called:
Bi-ef-lacc. The cute animal could be that supplementary insurance duck.
--Wayne Green
address@hidden
At 10:25 PM 10/18/2000 +0200, you wrote:
>I have a suggestion for Bison generating C++, namely to add an option that
>causes it to write a file with a C++ ending, by default say ".cc" instead of
>".c". For now, it would do nothing else, but later one could first change
>to C++ standard headers using namespaces, and even further down the road,
>more C++ support.
>
>I tweaked the Bison sources, to get this, as follows:
>File getargs.c: Add
> int generate_cplusplus = 0;
>and in getargs(), add a "+" here ------>|
> while ((c = getopt_long (argc, argv, "+yvdhrltknVo:b:p:", longopts, (int *)0))
>and to the switch statement, add the case '+', as follows:
> switch (c)
> {
> case 0:
> /* Certain long options cause getopt_long to return 0. */
> break;
>
> case '+':
> generate_cplusplus = 1;
> break;
>...
>
>Then in file files.c, add
> extern int generate_cplusplus;
>and in open_files() change to:
> /* These are opened by `done' or `open_extra_files', if at all */
> if (spec_outfile)
> tabfile = spec_outfile;
> else if (!generate_cplusplus)
> tabfile = stringappend (name_base, base_length, ".c");
> else
> tabfile = stringappend (name_base, base_length, ".cc");
>with a statement appending ".cc" if generate_cplusplus is set.
>
>This does not check all combinations C++/calling for special file names. It
>is just a tweak I happened to need.
>
>One can think of a long option --C++, and these options could optionally
>take preferred filename extensions; for example, something like -+.cpp
>would make the files have the extension ".cpp" instead.
>
> Hans Aberg