Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
From: Graham Fawcett
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Mon, 17 Mar 2008 11:33:08 -0400
On Mon, Mar 17, 2008 at 11:22 AM, Kon Lovett <address@hidden> wrote:
> Summary: I want a byte-string API. I want string integrations. I want
> global UTF8 strings.
The Factor language borrowed from Larceny a clever mechanism for
representing Unicode strings efficiently. Perhaps such a system is
feasible for Chicken, and might eliminate some of these issues (at the
cost of distancing our string type a bit more from C char arrays):
http://factor-language.blogspot.com/2008_01_01_archive.html
"The new representation is quite clever, and comes from Larceny
Scheme. The idea is that strings are ASCII strings, but have an extra
slot pointing to an 'auxiliary vector'. If no auxiliary vector is set,
the nth character of the string is just the nth byte. If an auxiliary
vector is set, then the nth character has the nth byte as the least
significant 8 bits, and the most significant 13 bits come from the nth
double-byte in the auxiliary vector. Storing a non-ASCII character
into the string creates an auxiliary vector if necessary. This reduces
space usage for ASCII strings, it can represent every Unicode code
point, and for strings with high code points in them, it still uses
less space than the other alternative, UTF-32."
So, a byte string would simply be a string with a null auxiliary vector.
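
To make the quoted scheme concrete, here is a rough sketch in Python (not
Chicken, and not taken from the Factor or Larceny sources). The class name
AuxString and its methods are made up for illustration; the only thing
borrowed from the description is the layout: one byte per character, plus
a lazily created vector of 16-bit cells holding the upper 13 bits.

```python
from array import array


class AuxString:
    """Hypothetical fixed-length string: a byte per character plus an
    optional auxiliary vector, as described in the quoted post."""

    def __init__(self, length):
        self.low = bytearray(length)  # least significant 8 bits of each code point
        self.aux = None               # upper 13 bits; None until first needed

    @classmethod
    def from_str(cls, text):
        s = cls(len(text))
        for i, ch in enumerate(text):
            s.set(i, ch)
        return s

    def get(self, n):
        code = self.low[n]
        if self.aux is not None:
            # Combine the low byte with the bits kept in the auxiliary vector.
            code |= self.aux[n] << 8
        return chr(code)

    def set(self, n, ch):
        code = ord(ch)
        if code > 0x7F and self.aux is None:
            # First non-ASCII store: allocate one 16-bit cell per character.
            self.aux = array("H", [0]) * len(self.low)
        self.low[n] = code & 0xFF
        if self.aux is not None:
            self.aux[n] = code >> 8  # at most 0x10FF, so it fits in 13 bits

    def is_byte_string(self):
        # A byte string is just a string with no auxiliary vector.
        return self.aux is None

    def __str__(self):
        return "".join(self.get(i) for i in range(len(self.low)))


s = AuxString.from_str("hello")
print(s.is_byte_string())   # no auxiliary vector yet
s.set(1, "\u00e9")          # storing a non-ASCII character allocates it
print(s.is_byte_string(), str(s))
```

In this sketch a string never drops its auxiliary vector once it has one;
whether a real implementation should shrink back is a separate question.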
Graham