help-bison

Re: Somewhat newbie question: multiple lines in one token?


From: Hans Aberg
Subject: Re: Somewhat newbie question: multiple lines in one token?
Date: Mon, 17 Jun 2002 19:49:33 +0200

At 00:18 +1000 2002/06/18, Robert Collins wrote:
>I've built a flex .l file to recognize foo:, bar: and multipleline:, and
>to grab everything else as a single STRING.
>
>What I'm having trouble with is the multipleline: construct. I've tried
>varying things:

I suggest you think about which kinds of tokens you need, and then
make sure that your Flex lexer sends them to the Bison parser.

So if single newlines denote line separators and runs of newlines denote
paragraph separators, you might try rules like these in the Flex .l file:

"\n"     { return NEWLINE; }
"\n"+    { return PARAGRAPH; }
<<EOF>>  { return LEX_EOF; }
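
As a rough illustration of how the two newline rules interact: Flex takes
the longest match, and on a tie the rule listed first, so one newline
yields NEWLINE and a run of two or more yields PARAGRAPH. The hand-written
C scanner below is only a sketch of that behaviour, not what Flex
generates. The token values and the OTHER token (a stand-in for
everything that is not a newline) are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical token codes; in a real build Bison generates these
   from the %token declarations. OTHER stands in for any non-newline text. */
enum token { NEWLINE = 258, PARAGRAPH, LEX_EOF, OTHER };

/* Scan one token starting at *p and advance *p past it.
   A single '\n' is NEWLINE, a run of two or more is PARAGRAPH,
   the terminating NUL plays the role of <<EOF>>. */
static enum token next_token(const char **p)
{
    const char *s = *p;

    if (*s == '\0')
        return LEX_EOF;

    if (*s == '\n') {
        size_t n = 0;
        while (s[n] == '\n')        /* take the longest run of newlines */
            n++;
        *p = s + n;
        return n == 1 ? NEWLINE : PARAGRAPH;
    }

    while (*s != '\0' && *s != '\n')  /* everything up to the next newline */
        s++;
    *p = s;
    return OTHER;
}
```

Calling next_token repeatedly on "a\nb\n\n\nc" should give OTHER, NEWLINE,
OTHER, PARAGRAPH, OTHER and then LEX_EOF.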

Then in the Bison .y file, you would have something like:

%token NEWLINE PARAGRAPH LEX_EOF

%%
non_terminated_line:    /* A non-empty list of my_favourite_tokens */
    my_favourite_tokens
  | non_terminated_line my_favourite_tokens
  ;

paragraph:
    NEWLINE     /* Empty line */
  | PARAGRAPH   /* Empty paragraph */
  | LEX_EOF     /* Empty file */                     { ... YYACCEPT; }
  | NEWLINE LEX_EOF  /* File with only NEWLINE */    { ... YYACCEPT; }
  | PARAGRAPH LEX_EOF /* File with only PARAGRAPH */ { ... YYACCEPT; }
  | non_terminated_line NEWLINE   /* Line followed by another line */
  | non_terminated_line PARAGRAPH /* Line at end of paragraph */
  | non_terminated_line LEX_EOF   /* Line at end of file */ { ... YYACCEPT; }
  | paragraph non_terminated_line NEWLINE
  | paragraph non_terminated_line PARAGRAPH
  | paragraph non_terminated_line LEX_EOF { ... YYACCEPT; }
  ;

Or something along those lines. One should be able to get this working
with a bit of fiddling, I think. You can then perhaps find a solution that
is somewhat more structured than my suggestion. But first find something
that works.

I wanted to cover the possibility that EOF is used as an end marker for
lines/paragraphs, which is why I gave it an explicit token name in the
.l file. I then have to make sure that the parser terminates properly when
this token arrives, by inserting { ... YYACCEPT; } in those actions.
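
To make the termination point concrete, here is a small C sketch of a
token consumer that stops the moment LEX_EOF arrives, which is the role
YYACCEPT plays in the actions above. It is not the Bison parser and is
deliberately more permissive than the grammar. The token values, the WORD
token, struct result and run() are all hypothetical names chosen for the
example.

```c
#include <stddef.h>

/* Hypothetical token codes standing in for the ones Bison would generate.
   WORD stands in for my_favourite_tokens. */
enum token { WORD = 258, NEWLINE, PARAGRAPH, LEX_EOF };

/* What a run reports: the number of non-empty lines seen and the number
   of tokens consumed, counting the LEX_EOF itself. */
struct result { int lines; size_t consumed; };

/* Consume tokens until LEX_EOF arrives, then stop at once.
   Returns 1 on acceptance, 0 if the array ends without a LEX_EOF. */
static int run(const enum token *toks, size_t n, struct result *out)
{
    int in_line = 0;               /* have we seen a WORD on this line? */

    out->lines = 0;
    out->consumed = 0;
    for (size_t i = 0; i < n; i++) {
        out->consumed = i + 1;
        switch (toks[i]) {
        case WORD:
            in_line = 1;
            break;
        case NEWLINE:
        case PARAGRAPH:            /* both terminate the current line */
            if (in_line)
                out->lines++;
            in_line = 0;
            break;
        case LEX_EOF:              /* EOF also terminates a pending line */
            if (in_line)
                out->lines++;
            return 1;              /* accept; later tokens are never read */
        }
    }
    return 0;
}
```

Given WORD WORD NEWLINE WORD PARAGRAPH WORD LEX_EOF WORD, run() should
accept after seven tokens with three lines counted, leaving the trailing
WORD unread.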

  Hans Aberg




