help-bison

Re: A Query in Tokenizing input


From: Hans Aberg
Subject: Re: A Query in Tokenizing input
Date: Sat, 21 Jun 2003 11:20:27 +0200

At 08:38 +0100 2003/06/21, praveen thakur wrote:
>I am writing a Lexer/Parser (using flex++/bison++)...

This is the Bison list; the Flex list is
  Help-flex mailing list
  address@hidden
  http://mail.gnu.org/mailman/listinfo/help-flex
If your problems are with those very old ++ versions of Flex/Bison,
neither list is the right place.

>... in which I want to differentiate some keywords from STRING at the
>parser level.
>
>My lexer file is like this...
>
>NUMBER ([0-9]+)(\.[0-9]+)?
>STRING [_A-Za-z][_A-Za-z0-9\.\\\/]*
>%%
>
>{NUMBER} {(parser.yylval).Int=(int)atoi(yytext);return Parser::NUMBER;}
>{STRING} {strcpy((parser.yylval).Str,yytext);return Parser::STRING;}
>
>. { return *yytext;}
>

>I am using STRING at several places and I want some keywords
>(like "ta" above ) differentiated from the STRING based on the
>production rule. Can this be achieved ??

This is really a Flex question. There are three basic methods that can be
used (see the Flex manual or use the Help Flex list for more info):

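1. You can add another Flex rule, placed before the {STRING} rule:

"keyword_A" { return KEYWORD_A; }
{STRING} {strcpy((parser.yylval).Str,yytext);return Parser::STRING;}

The order matters: both patterns match "keyword_A" with the same length,
and Flex resolves such a tie in favor of the rule listed first.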

The Bison .y file might then contain
  %token KEYWORD_A "keyword_A"
and you can use either KEYWORD_A or "keyword_A" in the grammar. This way, you
can replace all single-character 'x' constructs with proper token names.

2. You might add a tweak in the {STRING} action which examines the string
and then returns the appropriate token value. This is especially suitable
when identifiers are assigned a token type by a declaration. One can then
store each identifier's token type in a lookup table, and let the lexer
return that to the parser.

3. One can also use Flex "start conditions", which are suitable when the
language needs context switches, for example if some keywords are valid in
only some contexts.
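
As a rough illustration of method 2, here is a minimal C++ sketch of such a
lookup table. The names Token, TokenTable, declare, classify and TYPE_NAME
are hypothetical and are not part of Flex or Bison; the token codes are
placeholders for whatever the generated parser actually defines.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical token codes; in a real project these come from the
// Bison-generated parser (e.g. Parser::STRING in the original post).
enum Token { STRING = 258, KEYWORD_A = 259, TYPE_NAME = 260 };

// Lookup table from identifier text to token type (assumed helper class).
class TokenTable {
public:
    // Reserved words can be preloaded when the table is created.
    TokenTable() { table_["keyword_A"] = KEYWORD_A; }

    // A parser action would call this when a declaration gives a name
    // a new token type.
    void declare(const std::string& name, Token type) { table_[name] = type; }

    // The {STRING} action would call this: return the recorded type,
    // or plain STRING when the text is not in the table.
    Token classify(const std::string& text) const {
        auto it = table_.find(text);
        return it == table_.end() ? STRING : it->second;
    }

private:
    std::unordered_map<std::string, Token> table_;
};
```

Assuming a TokenTable object named table is visible to the lexer, the
{STRING} action could then copy yytext as before and end with something
like "return table.classify(yytext);" instead of a fixed token value.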

You probably only need 1 for a start, but as the language grows more
complex, you are likely to need 2 and 3 as well.
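
To give an idea of what method 3 buys you, here is a small plain C++ model
of a context-dependent keyword. It is not Flex code: in a Flex file the same
effect would come from declaring a start condition with %s or %x, prefixing
a rule with <NAME>, and switching with BEGIN(NAME). The names Tok, Mode,
ModalLexer and COMMAND_MODE are hypothetical; "ta" is the keyword from the
question.

```cpp
#include <string>

// Hypothetical token and mode names, used only for this sketch.
enum Tok { TOK_STRING, TOK_KEYWORD_TA };
enum Mode { INITIAL_MODE, COMMAND_MODE };

// Models a lexer whose keyword rule is active in only one context.
struct ModalLexer {
    Mode mode = INITIAL_MODE;

    // Plays the role of BEGIN(...) in a Flex action.
    void begin(Mode m) { mode = m; }

    // "ta" is a keyword only while COMMAND_MODE is active;
    // everywhere else it is an ordinary STRING.
    Tok classify(const std::string& text) const {
        if (mode == COMMAND_MODE && text == "ta") return TOK_KEYWORD_TA;
        return TOK_STRING;
    }
};
```

The point of the design is that the decision depends on lexer state rather
than on the text alone, which a single keyword rule as in method 1 cannot
express.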

  Hans Aberg





