Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
From: Graham Fawcett
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Mon, 17 Mar 2008 23:01:53 -0400
On Mon, Mar 17, 2008 at 10:29 PM, Alex Shinn <address@hidden> wrote:
> >>>>> "Graham" == Graham Fawcett <address@hidden> writes:
>
> Graham> On Mon, Mar 17, 2008 at 11:22 AM, Kon Lovett <address@hidden> wrote:
>
> Graham> The Factor language borrowed from Larceny a
> Graham> clever mechanism for representing Unicode
> Graham> strings efficiently. Perhaps such a system is
> Graham> feasible for Chicken, and might eliminate some
> Graham> of these issues (at the cost of distancing our
> Graham> string type a bit more from C char arrays):
[snip]
> This only adds new issues, and solves none of the old ones.
> The representation itself is interesting, though it may in
> fact be a pessimisation in many cases (utf8 is about the
> fastest approach for parsing and regex matching, which are
> the string operations where speed is the biggest issue to
> begin with).
Fair enough.
Here's another thought. It seems to me that if we were to represent
strings as composite values, e.g. a two-slot record whose first slot
is an encoding (the symbol 'utf8, or #f for 'byte' encoding), and
whose second slot contains the string data, then the various string
functions could dispatch on the type, and there would be no need to
monkey-patch core string functions to get the desired semantics. A
proper protocol for handling string encodings could be designed, utf8
being one of those encodings.
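To make the idea concrete, here is a minimal sketch of such a tagged string and a length function that dispatches on its encoding. It is written in Python rather than Scheme purely for illustration, and every name in it (EncodedString, string_length, the _LENGTH table) is hypothetical; nothing here is an existing Chicken or utf8 egg API. None stands in for #f.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EncodedString:
    """Hypothetical two-slot string record: an encoding tag plus raw data."""
    encoding: Optional[str]  # "utf8", or None for plain byte encoding (the #f case)
    data: bytes


def _utf8_length(data: bytes) -> int:
    # Count code points: every byte that is not a UTF-8 continuation
    # byte (bit pattern 10xxxxxx) starts a new code point.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Dispatch table keyed on the encoding slot. Supporting another encoding
# would mean adding an entry here, not patching string_length itself.
_LENGTH = {
    None: len,
    "utf8": _utf8_length,
}


def string_length(s: EncodedString) -> int:
    """Generic length: looks up the implementation for the string's encoding."""
    try:
        impl = _LENGTH[s.encoding]
    except KeyError:
        raise ValueError(f"unsupported encoding: {s.encoding!r}")
    return impl(s.data)


raw = "naïve".encode("utf-8")  # 6 bytes, 5 code points
print(string_length(EncodedString("utf8", raw)))
print(string_length(EncodedString(None, raw)))
```

The same bytes give a different length depending on the tag, which is the point: the semantics travel with the value instead of being swapped in globally.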
I don't imagine the dispatch overhead would be significant in any but
the tightest inner loops, in which case one could resort to
fully-specified functions (e.g. byte-string-length or
utf8-string-length).
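As a rough illustration of that escape hatch, the sketch below calls an encoding-specific function directly inside a loop, so no per-call lookup on the encoding takes place. It is again Python used only for illustration; byte_string_length and utf8_string_length mirror the names suggested above, while total_utf8_length is a hypothetical caller, and the code assumes the caller already knows every chunk is utf8 data.

```python
from typing import Iterable


def byte_string_length(data: bytes) -> int:
    # Fully specified: the length of a byte string is its byte count.
    return len(data)


def utf8_string_length(data: bytes) -> int:
    # Fully specified: count the bytes that are not UTF-8 continuation
    # bytes (10xxxxxx), i.e. the number of code points.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def total_utf8_length(chunks: Iterable[bytes]) -> int:
    # Hypothetical inner loop. The encoding is assumed to be utf8 for
    # every chunk, so the specific function is called directly and the
    # encoding tag is never inspected.
    total = 0
    for chunk in chunks:
        total += utf8_string_length(chunk)
    return total


print(total_utf8_length(["é".encode("utf-8"), b"abc"]))
```

Whether skipping the dispatch is measurably faster would have to be benchmarked in the real implementation; the sketch only shows the shape of the API.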
Graham