Regular Expression String Search With Bison
From: Ricardo Grant
Subject: Regular Expression String Search With Bison
Date: Tue, 12 Mar 2019 02:07:26 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.1
Hello,
I am struggling to understand Bison, and parsers in general, so I hope I
can get some help. To understand them better, I decided to try to write a
program that understands simple regular expressions and can show the
correct substring matches. Here is the grammar for the language I would
like to implement:
<regex>  ::= <term> '|' <regex>
           | <term>
<term>   ::= <term> <factor>
           | <factor>
<factor> ::= <base> '*'
           | <base>
<base>   ::= '(' <regex> ')'
           | <char>
I have put together a smaller piece of code to explain my issue, since
using words alone is a bit difficult. Some of it may be nonsensical:
%{
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define BUF_MAX 31

int yylex (void);
/* With %parse-param, yyerror receives the extra argument first. */
void yyerror (char *regex, char const *msg);

char regxp[BUF_MAX];
char text[BUF_MAX];
char input[BUF_MAX];
int i = 0;

struct regxp
{
  bool star;
  char str[BUF_MAX];
};
%}
%parse-param {char* regex}
%define api.value.type union
%token <char> CHAR
%type <struct regxp> regex term factor base
%%
input:
  input line
| %empty
;
line:
  regex '\n'  { printf ("%s\n", $1.str); }
| error '\n'  { yyerrok; }
| '\n'
;
regex:
  regex '|' term  { strncat ($1.str, $3.str, BUF_MAX - 1 - strlen ($1.str)); $$ = $1; }
| term            { $$ = $1; }
;
term:
  term factor     { strncat ($1.str, $2.str, BUF_MAX - 1 - strlen ($1.str)); $$ = $1; }
| factor          { $$ = $1; }
;
factor:
  base '*'       { $$ = $1; $$.star = true; }
| base           { $$ = $1; sscanf (input, "%c", $$.str); }
;
base:
  '(' regex ')'  { $$ = $2; }
| CHAR           { input[++i] = $1; }
;
%%
int
main (int argc, char **argv)
{
  if (argc < 3)
    return 1;
  /* text is zero-initialized, so copying at most BUF_MAX - 1 bytes
     leaves it terminated.  */
  strncpy (text, argv[2], BUF_MAX - 1);
  return yyparse (argv[1]);
}
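The grammar above declares yylex but never defines it. A minimal sketch of one that could feed this grammar follows, assuming the regular expression is already in memory as a C string. The names lex_begin and cursor are hypothetical, and the enum and union at the top are only stand-ins for what Bison generates from the %token line, so that the sketch compiles by itself:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the generated parser's definitions (assumption: in the
   real program these come from Bison, not from hand-written code).  */
enum { CHAR = 258 };
static union { char CHAR; } yylval;

/* Hypothetical global: the next unread character of the pattern.  */
static char const *cursor;

/* Point the lexer at the start of a NUL-terminated pattern.  */
static void
lex_begin (char const *s)
{
  assert (s != NULL);
  cursor = s;
}

/* Return one token per call; 0 tells the parser the input has ended.  */
static int
yylex (void)
{
  char c = *cursor;
  if (c == '\0')
    return 0;
  cursor++;
  switch (c)
    {
    case '|': case '*': case '(': case ')': case '\n':
      /* Operators are returned as themselves, matching the character
         literals used in the grammar rules.  */
      return c;
    default:
      /* Any other character becomes a CHAR token whose semantic
         value is the character itself.  */
      yylval.CHAR = c;
      return CHAR;
    }
}
```

Since the line rule expects a '\n' after the regex, a pattern taken from argv would need one appended, or the rule would have to be adjusted.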
What exactly happens to the semantic values, especially $$? My
smallest element is a character, but I have no idea how to go from a
character to a base in code. So I think there will have to be some
way of passing information along to the string in the regxp struct.
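In Bison, $$ is the semantic value of the symbol on the left of a rule, and $1, $2, ... are the values of the components on its right; when a rule has no action, Bison copies $1 into $$. Below is a sketch of what the actions could compute, written as plain C so it can be tried without Bison. The helper names make_base, concat, and mark_star are hypothetical, not part of Bison:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BUF_MAX 31

/* Same shape as the struct in the grammar file.  */
struct regxp
{
  bool star;
  char str[BUF_MAX];
};

/* What a `base: CHAR` action could do: wrap one character in a value.  */
static struct regxp
make_base (char c)
{
  struct regxp r = { false, { 0 } };
  r.str[0] = c;
  return r;
}

/* What a `term: term factor` action could do: append the right
   operand's text to the left one, truncating rather than overflowing.
   The star flags of the operands are dropped here.  */
static struct regxp
concat (struct regxp left, struct regxp right)
{
  size_t used = strlen (left.str);
  assert (used < BUF_MAX);
  strncat (left.str, right.str, BUF_MAX - 1 - used);
  left.star = false;
  return left;
}

/* What a `factor: base '*'` action could do.  */
static struct regxp
mark_star (struct regxp r)
{
  r.star = true;
  return r;
}
```

In the grammar these would be called as `$$ = make_base ($1);` for `base: CHAR`, `$$ = concat ($1, $2);` for `term: term factor`, and `$$ = mark_star ($1);` for `factor: base '*'`. A flat string cannot record which part was starred, so a real matcher would probably need a tree of nodes instead.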
In my main section, I just want to get a regular expression and a string
to test against, like ./prog ab* ababb.
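A sketch of the argument handling described above, assuming argv[1] is the regular expression and argv[2] is the text to search. copy_arg and load_args are hypothetical helper names. It rejects arguments that do not fit in BUF_MAX bytes instead of truncating them silently:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_MAX 31

static char regxp[BUF_MAX];
static char text[BUF_MAX];

/* Copy src into dst, including the terminating NUL, only if it fits
   in cap bytes.  Returns false and leaves dst untouched otherwise.  */
static bool
copy_arg (char *dst, size_t cap, char const *src)
{
  assert (dst != NULL && src != NULL && cap > 0);
  size_t len = strlen (src);
  if (len >= cap)
    return false;
  memcpy (dst, src, len + 1);
  return true;
}

/* Fill regxp and text from the command line: prog REGEX TEXT.  */
static bool
load_args (int argc, char const **argv)
{
  if (argc < 3)
    return false;
  return copy_arg (regxp, sizeof regxp, argv[1])
         && copy_arg (text, sizeof text, argv[2]);
}
```

main could then do `return load_args (argc, argv) ? yyparse (regxp) : 1;`. Most shells expand an unquoted *, so the pattern is safer quoted: ./prog 'ab*' ababb.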