chicken-users

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg


From: Alex Shinn
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Tue, 18 Mar 2008 11:29:13 +0900
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (darwin)

>>>>> "Graham" == Graham Fawcett <address@hidden> writes:

    Graham> On Mon, Mar 17, 2008 at 11:22 AM, Kon Lovett <address@hidden> wrote:

    Graham> The Factor language borrowed from Larceny a
    Graham> clever mechanism for representing Unicode
    Graham> strings efficiently. Perhaps such a system is
    Graham> feasible for Chicken, and might eliminate some
    Graham> of these issues (at the cost of distancing our
    Graham> string type a bit more from C char arrays):

    Graham> http://factor-language.blogspot.com/2008_01_01_archive.html

    Graham> "The new representation is quite clever, and
    Graham> comes from Larceny Scheme. The idea is that
    Graham> strings are ASCII strings, but have an extra
    Graham> slot pointing to an 'auxiliary vector'.

This only adds new issues, and solves none of the old ones.
The representation itself is interesting, though it may in
fact be a pessimisation in many cases (utf8 is about the
fastest approach for parsing and regex matching, which are
the string operations where speed is the biggest issue to
begin with).

The problems we're having aren't even about string
representation though, they're about the semantics of the
string operations themselves.  Are the string indices byte
positions or character positions?  Different libraries
disagree.
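
To make the ambiguity concrete, here is a minimal sketch in Python
(not Chicken code; the sample string and the helper name
char_to_byte_offset are made up for illustration).  It shows the same
search returning different indices depending on whether positions are
counted in characters or in UTF-8 bytes:

```python
# Hypothetical sample text: the two accented letters each take
# two bytes in UTF-8, so byte and character positions diverge.
text = "na\u00efve caf\u00e9"
data = text.encode("utf-8")

# The same search, answered in two different index spaces.
char_index = text.index("c")    # counted in characters
byte_index = data.index(b"c")   # counted in UTF-8 bytes


def char_to_byte_offset(s: str, i: int) -> int:
    """Convert a character index in s to the matching UTF-8 byte offset."""
    return len(s[:i].encode("utf-8"))


print(char_index, byte_index, char_to_byte_offset(text, char_index))
```

An index obtained from a library that counts one way and handed to a
library that counts the other way silently points at the wrong place
as soon as the text contains a non-ASCII character, whatever the
underlying string representation is.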

-- 
Alex



