RE: Getting involved in Bison
From: Morales Cayuela, Victor (NSB - CN/Hangzhou)
Subject: RE: Getting involved in Bison
Date: Mon, 24 Feb 2020 05:40:02 +0000
Hello!
> I recommend that you first build by hand a push parser before trying to get
> this done by m4. For instance take examples/c++/calc++, and change the
> generated parser so that it works as a push parser. Keep it somewhere
> safe to make sure "make" doesn't kill it. When it works, you can show
> us a diff between the regular pull parser and the push parser, and when we
> agree, we can proceed to make it work in m4.
OK, I will first create a draft parser, review it with everyone and then make
it work in m4. I will use Java as a reference; it should be closer to the final
solution than C.
I have been fiddling with the push parsers and some examples this weekend, but
I didn't look at pushcalc. Although I have used Bison a lot, I am still
unfamiliar with the jargon and some of the notions. Thanks for all the
suggestions and examples.
> Maybe you should start with that: find a means to benchmark two pull parsers:
> one off-the-shelf, generated by today's lalr1.cc, and another one where the
> local variables that need to be member variables in push-mode are made member
> variables.
What is the expected difference in performance between local and member
variables? Besides the cost of construction/destruction and the variable
lifetime, I would not expect other issues. Anyway, let's test both 😊
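A rough sketch of how such a comparison could be set up. The two classes are hypothetical stand-ins that run the same toy loop, once with the state in local variables and once in member variables; `local_state_parser`, `member_state_parser` and `time_ns` are names made up for this illustration, not Bison code, and any timing depends heavily on the compiler and optimization flags.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a pull parser whose state is local to parse().
struct local_state_parser
{
  std::uint64_t parse (std::uint64_t n) const
  {
    std::uint64_t state = 0;   // candidates for registers or the stack
    std::uint64_t len = 0;
    for (std::uint64_t i = 0; i < n; ++i)
      {
        state = state * 31 + i;   // unsigned wrap-around is well defined
        ++len;
      }
    return state ^ len;
  }
};

// Same loop, with the state kept in members, as push mode would require.
struct member_state_parser
{
  std::uint64_t state_ = 0;
  std::uint64_t len_ = 0;

  std::uint64_t parse (std::uint64_t n)
  {
    state_ = 0;
    len_ = 0;
    for (std::uint64_t i = 0; i < n; ++i)
      {
        state_ = state_ * 31 + i;
        ++len_;
      }
    return state_ ^ len_;
  }
};

// Time one call of f() in nanoseconds with a monotonic clock.  The value
// returned by f() is stored in `result` so the work is not discarded.
template <typename F>
long long time_ns (F f, std::uint64_t& result)
{
  auto start = std::chrono::steady_clock::now ();
  result = f ();
  auto stop = std::chrono::steady_clock::now ();
  return std::chrono::duration_cast<std::chrono::nanoseconds> (stop - start).count ();
}
```

To compare, call `time_ns` once per class with a large `n` (for example `time_ns ([&] { return p.parse (100000000); }, r)`), build both with the same flags, and repeat the runs several times before trusting the numbers.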
Related to this topic:
#################
Akim, do you remember that I told you the tests did not pass completely on my
Mac, while they did on yours? Specifically, all of those in these categories:
LALR(1) Calculator
LALR(1) C++ Calculator
Since I will need them this time, I decided to check them. It seems that on the
Mac `wc -l` pads the line count with leading spaces, which does not match the
expected pattern:
tests/testsuite.log:
./calc.at:1111: grep -v 'Return for a new token:' stderr | wc -l
--- - 2020-02-22 13:15:01.000000000 +0800
+++ /Users/Victor/Projects/bison/tests/testsuite.dir/at-groups/456/stdout 2020-02-22 13:15:01.000000000 +0800
@@ -1,2 +1,2 @@
-0
+ 0
Has anyone reported this before? It seems I am destined to deal with
indentation XD
Separate topic about C++:
#####################
I believe we could improve the generated C++ code a bit and bring it in line
with modern syntax.
For example, C++ no longer recommends constructing objects with
parentheses, but with braces. Yesterday I saw this statement:
symbol_type yylookahead (yylex ());
Which should be rewritten as:
symbol_type yylookahead {yylex ()};
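A small sketch of what the change means in practice. The `symbol_type` and `yylex` below are hypothetical stand-ins, not Bison's definitions. For a type like this one the two spellings construct the same object; the `std::vector` functions show the case to watch for, where a type with an initializer-list constructor gives the two forms different meanings.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the parser's symbol type; not Bison's definition.
struct symbol_type
{
  explicit symbol_type (int k) : kind (k) {}
  int kind;
};

// Hypothetical scanner stub that always returns a symbol of kind 42.
inline symbol_type yylex () { return symbol_type (42); }

// Parentheses: direct-initialization from the scanner's return value.
inline int kind_with_parens ()
{
  symbol_type yylookahead (yylex ());
  return yylookahead.kind;
}

// Braces: list-initialization; same result for this type.
inline int kind_with_braces ()
{
  symbol_type yylookahead {yylex ()};
  return yylookahead.kind;
}

// With an initializer_list constructor, the two forms differ.
inline std::size_t size_with_parens ()
{
  std::vector<int> v (3, 0);   // three elements, all 0
  return v.size ();
}

inline std::size_t size_with_braces ()
{
  std::vector<int> v {3, 0};   // two elements: 3 and 0
  return v.size ();
}
```

So a mechanical switch to braces would be safe only for types without an initializer-list constructor, which is an assumption to check for each type touched.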
There are also some keywords that help the compiler optimize. For example,
this structure has a default constructor:
struct by_type { /* Default constructor */ by_type (); ... }
We might add "= default" (or even "= delete" if it is not used and we prefer to
prevent the object from being created without parameters):
struct by_type { /* Default constructor */ by_type () = default; ... }
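A sketch of the trade-off, using three hypothetical variants of a `by_type`-like struct (the names and the `int type` member are assumptions for illustration, not Bison's definitions). A constructor defaulted on its first declaration stays trivial but leaves the member uninitialized under default-initialization; adding a default member initializer restores the initialization but makes the constructor non-trivial again.

```cpp
#include <type_traits>

// User-provided constructor: declared in the class, defined out of line.
struct by_type_user
{
  by_type_user ();
  int type;
};
inline by_type_user::by_type_user () : type (0) {}

// Defaulted on first declaration: the constructor is trivial, but
// `by_type_defaulted x;` leaves `type` uninitialized.
struct by_type_defaulted
{
  by_type_defaulted () = default;
  int type;
};

// Defaulted constructor plus a default member initializer: `type` is
// always initialized, and the constructor is no longer trivial.
struct by_type_initialized
{
  by_type_initialized () = default;
  int type = -1;
};

static_assert (!std::is_trivially_default_constructible<by_type_user>::value,
               "user-provided constructors are never trivial");
static_assert (std::is_trivially_default_constructible<by_type_defaulted>::value,
               "defaulted constructor without initializers is trivial");
static_assert (!std::is_trivially_default_constructible<by_type_initialized>::value,
               "a default member initializer makes the constructor non-trivial");
```

Whether "= default" is a pure win therefore depends on whether the existing constructor initializes anything, which would need to be checked in the skeleton.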
We can discuss all this after the push parser.
Regards,
Victor
-----Original Message-----
From: Akim Demaille <address@hidden>
Sent: 23 February 2020 15:08
To: Morales Cayuela, Victor (NSB - CN/Hangzhou) <address@hidden>
Cc: Bison Bugs <address@hidden>; Bison Patches <address@hidden>
Subject: Re: Getting involved in Bison
Hi Victor!
> On 20 Feb 2020, at 04:07, Morales Cayuela, Victor (NSB - CN/Hangzhou)
> <address@hidden> wrote:
>
> Hi!
>
>> It really depends what you'd like to do. You reported you're fluent in C++.
>> Something quite interesting would be to enable push-parser in lalr1.cc.
>> It's more ambitious than the subcomplain task, but it will probably take you
>> less time, because it's more compact, it will impact much less code.
>
> Ok, seems very interesting! I use C++ (currently C++17) in my daily job as a
> developer and I always try to keep up with the latest of the standard.
> I will familiarize myself first with push/pull parsers and then I will start
> with it 😊 I will also take a look at TODO to check if there is already any
> specification of what it should look like.
I think all the information you need is in the manual, in the push example in C
(see examples/c/pushcalc), and in yacc.c's implementation of push. Actually,
Java also has push-parser support, and is probably a better source of
inspiration than C.
I recommend that you first build by hand a push parser before trying to get
this done by m4. For instance take examples/c++/calc++, and change the
generated parser so that it works as a push parser. Keep it somewhere safe
to make sure "make" doesn't kill it. When it works, you can show us
a diff between the regular pull parser and the push parser, and when we agree,
we can proceed to make it work in m4.
Actually, maybe examples/c++/calc++ is too bloated: first create a very simple
example in C++, similar to examples/c/calc. Start from this one, it is much
simpler, and yet sufficient.
Most of the effort, I guess, is finding what local variables must be turned
into member variables, to survive across calls to yyparse (amusingly enough
coroutines would make this easy, but we can't afford them yet ;-). But
lalr1.java should help.
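To illustrate the point about state surviving across calls, here is a hand-written toy, not an LALR automaton and not what lalr1.cc generates. The class `push_sum_parser`, the `token` and `status` enums, and the grammar (numbers separated by `+`) are all made up for this sketch. What would be locals of a pull-style parse function are members here, because the caller feeds one token per call.

```cpp
// Hypothetical token kinds and status codes for a push-style interface.
enum class token { number, plus, end };
enum class status { more, accept, error };

class push_sum_parser
{
public:
  // Feed one token.  Returns `more` until the input is complete.
  status push (token t, int value = 0)
  {
    switch (t)
      {
      case token::number:
        if (!expect_number_)
          return status::error;
        sum_ += value;
        expect_number_ = false;
        return status::more;
      case token::plus:
        if (expect_number_)
          return status::error;
        expect_number_ = true;
        return status::more;
      case token::end:
        return expect_number_ ? status::error : status::accept;
      }
    return status::error;
  }

  // Sum of the numbers pushed so far.
  int result () const { return sum_; }

private:
  // In a pull parser these could be locals of the parse function; here
  // they must survive between calls to push(), so they are members.
  int sum_ = 0;
  bool expect_number_ = true;
};
```

Pushing `number 1`, `plus`, `number 2`, `end` yields `more`, `more`, `more`, `accept`, after which `result ()` returns 3.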
Note that C++ allows us to consider other options than the C parsers. Since the
C++ parser is an object, maybe we can have push and pull use the same object.
I mean, *maybe*, we can *always* move most local variables into member
variables, even for regular pull-parsing. Of course it's technically possible,
the "maybe" refers to the runtime cost: *if* benchmarks show that it is not a
regression to move to a model with more member variables and fewer local
variables, then the m4 part will be much easier.
Maybe you should start with that: find a means to benchmark two pull parsers:
one off-the-shelf, generated by today's lalr1.cc, and another one where the
local variables that need to be member variables in push-mode are made member
variables.
How does that sound?