Re: dlang: initial changes to run the calc tests on it
From: H. S. Teoh
Subject: Re: dlang: initial changes to run the calc tests on it
Date: Wed, 27 Feb 2019 22:32:54 -0800
User-agent: Mutt/1.10.1 (2018-07-13)
On Tue, Feb 26, 2019 at 06:33:55PM +0100, Akim Demaille wrote:
[...]
> What I did below is quite ugly. In particular, I don't know how to
> write a decent scanner in D. What I did is truly scary, a way to
> force C code (with gets and ungetc) into my zero knowledge of D. What
> is the right way to do the following?
[...]
ungetc is a truly nasty hack of an API in C; is it really necessary to
support that? D supports a range API that lets you query the front of a
range (in this case, a stream of chars) without moving the current
position of the stream. So ungetc really shouldn't be necessary unless
it's an inextricable part of the Bison-generated parser.
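To make the peek-versus-consume point concrete, here is a minimal sketch, not taken from the patch under discussion. The helper name `readDigits` is hypothetical; it assumes plain ASCII digits and does no overflow checking. It only calls `popFront` after `front` has confirmed the character belongs to the number, so nothing ever has to be pushed back:

```d
import std.range.primitives : empty, front, popFront;

// Hypothetical helper (not part of Bison or Phobos): read a run of ASCII
// digits from any input range of characters and return their value.
// The first non-digit is only inspected via `front`, never consumed.
int readDigits(R)(ref R input)
{
    import std.ascii : isDigit;

    int value = 0;
    while (!input.empty && isDigit(input.front)) // peek, do not advance
    {
        value = value * 10 + cast(int)(input.front - '0');
        input.popFront(); // advance only once the digit is accepted
    }
    return value;
}

unittest
{
    string s = "42+1";
    assert(readDigits(s) == 42);
    assert(s.front == '+'); // the '+' is still at the front of the range
}
```

Compiling with `-unittest -main` should be enough to run the check.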
What I'd do is to templatize CalcLexer on an arbitrary input range of
chars, and leave the specifics of binding to a File (or whatever else,
like a string in a unittest) to the caller. And I wouldn't bother with
using class inheritance at all, since I can't envision we'd ever need to
swap in multiple lexers to the same parser. So something like this:
-----snip-----
import std.range.primitives;

// Convenience function to instantiate CalcLexer (so that you don't have
// to name the range type explicitly).
auto calcLexer(R)(R range)
    if (isInputRange!R && is(ElementType!R : dchar))
{
    return CalcLexer!R(range);
}

struct CalcLexer(R)
    if (isInputRange!R && is(ElementType!R : dchar))
{
    private R input;
    private YYSemanticType semanticVal_;

    @property YYSemanticType semanticVal()
    {
        return semanticVal_;
    }

    int yylex()
    {
        import std.ascii : isDigit;
        import std.uni : isWhite;

        // Skip initial spaces
        while (!input.empty && isWhite(input.front))
            input.popFront;

        // Handle EOF
        if (input.empty)
            return YYTokenType.EOF;

        // Numbers. Only ASCII digits start a number here: parse!int
        // cannot consume a leading '.' or non-ASCII numerals.
        assert(!input.empty);
        if (isDigit(input.front))
        {
            import std.conv : parse;
            semanticVal_.ival = input.parse!int;
            return YYTokenType.NUM;
        }

        // Individual characters
        auto ch = input.front;
        input.popFront;
        return ch;
    }
}
-----snip-----
On the caller's side, you'll need to somehow get a range of characters
out of a File, for the sake of the example. I'd do something like this:
import std.algorithm : map, joiner;
import std.stdio;
import std.utf : byDchar;

File inputFile = stdin; // for example
auto lexer = inputFile
    .byChunk(1024)   // avoid making a syscall roundtrip per char
    .map!(chunk => cast(char[]) chunk) // because byChunk returns ubyte[]
    .joiner          // combine chunks into a single virtual range of char
    .byDchar         // UTF-8 decode (optional)
    .calcLexer;      // instantiate CalcLexer object
... // pass `lexer` to the Bison parser
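The same chunk-joining pipeline can be tried without a File. The sketch below is only an illustration: `decodeChunks` is a hypothetical name, and `std.range.chunks` over an in-memory buffer stands in for `File.byChunk`. It uses ASCII-only input; I have not checked how a multi-byte UTF-8 sequence split across a chunk boundary would behave.

```d
import std.algorithm : equal, joiner, map;
import std.range : chunks;
import std.string : representation;
import std.utf : byDchar;

// Hypothetical stand-in for the File pipeline: `chunks` plays the role
// of File.byChunk, splitting an in-memory buffer into fixed-size pieces.
auto decodeChunks(const(ubyte)[] data, size_t chunkSize)
{
    return data
        .chunks(chunkSize)                        // like byChunk(chunkSize)
        .map!(chunk => cast(const(char)[]) chunk) // reinterpret bytes as chars
        .joiner                                   // one flat range of characters
        .byDchar;                                 // present the elements as dchar
}

unittest
{
    auto text = "1 + 2 * 3\n"; // ASCII only, see the caveat above
    // Chunking into 4-byte pieces and joining again gives back the text.
    assert(decodeChunks(text.representation, 4).equal(text));
}
```

The result satisfies the same constraint as the lexer's template parameter (an input range whose elements convert to dchar), so it could be handed to calcLexer in a test instead of stdin.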
T
--
Life begins when you can spend your spare time programming instead of watching
television. -- Cal Keegan