Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
From: Alex Shinn
Date: Tue, 18 Mar 2008 20:24:32 +0900
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (darwin)

>>>>> "Peter" == Peter Bex <address@hidden> writes:
Peter> On Tue, Mar 18, 2008 at 11:41:08AM +0900, Alex Shinn wrote:
>> >>>>> "Kon" == Kon Lovett <address@hidden> writes:
>>
Kon> Summary: I want a byte-string API. I want string
Kon> integrations. I want global UTF8 strings.
>>
>> The only way this can happen is to push the UTF8
>> handling into the core of Chicken itself.
>>
>> However it would be contrary to Chicken's goal of
>> keeping a minimal core with extensions built on top.
Peter> How much bigger would this make the core? I
Peter> really doubt it would have much of an impact, and
Peter> it would sure make lots of things a lot simpler.

It's hard to say, but it would probably be about the size of
the utf8-lolevel egg (49k on my machine), since the rest is
redefinitions of existing procedures that wouldn't be any
more complex.

I'm not saying I recommend this, I'm just pointing out what
would need to happen for Kon's global utf8 semantics.

Although utf8 wins in many areas compared to other Unicode
representations, it's still more complex than ASCII.

If someone did seriously want to move this into the core and
Felix allowed it, you would want to do it in four phases:

  1) provide a full byte-string-level API (mostly
     BYTE-STRING-REF and related accessors); byte-strings
     are exactly the same objects as utf8-strings, we just
     use different procedures on them

  2) locate any modules that treat strings as byte-strings
     and update them to use the new byte-string API

  3) replace the core string operations with utf8 versions

  4) replace SRFI-14 with the Unicode version (this requires
     the iset egg to be moved into the core Chicken
     distribution, though it needn't be loaded by default)
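
To make phase 1 concrete, here is a minimal sketch of "same
object, different procedures". It is written in Python purely
for illustration (the real thing would be Scheme in the
Chicken core), and every name in it is hypothetical: only
BYTE-STRING-REF is mentioned above, and byte_string_length,
utf8_string_length and utf8_string_ref are assumptions made
for the example.

```python
# Illustrative sketch only: Python stands in for Scheme, and all
# procedure names are hypothetical, not an existing Chicken API.

def byte_string_ref(buf: bytes, i: int) -> int:
    """Byte-level access: constant time, returns the raw byte at offset i."""
    return buf[i]

def byte_string_length(buf: bytes) -> int:
    """Length in bytes."""
    return len(buf)

def utf8_string_length(buf: bytes) -> int:
    """Length in code points: count bytes that are not continuation bytes."""
    return sum(1 for b in buf if (b & 0xC0) != 0x80)

def utf8_string_ref(buf: bytes, k: int) -> str:
    """Return the k-th code point by scanning from the start (linear time)."""
    count = -1
    for start, b in enumerate(buf):
        if (b & 0xC0) != 0x80:          # lead byte of a new code point
            count += 1
            if count == k:
                end = start + 1
                # Extend over the continuation bytes (10xxxxxx) that follow.
                while end < len(buf) and (buf[end] & 0xC0) == 0x80:
                    end += 1
                return buf[start:end].decode("utf-8")
    raise IndexError(k)

# One buffer, two views: U+00EF takes two bytes in utf8.
s = "na\u00efve".encode("utf-8")
```

Here byte_string_length(s) is 6 while utf8_string_length(s)
is 5, and byte_string_ref(s, 2) returns the lead byte 0xC3
where utf8_string_ref(s, 2) returns the whole character.
Code that silently assumes the two views agree is what phase
2 has to find.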

SRFI-13 makes very, very heavy use of string indices, and
finding the Nth character of a utf8 string means scanning
from the start, so that idiom happens to be slow with utf8
strings (which is a different thing from saying utf8 is
slow). It would then be best to provide a string-cursor-based
string library and encourage its use instead of SRFI-13.
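
A rough sketch of the cursor idea, again in Python only for
illustration. The names cursor_next, cursor_ref and
chars_by_cursor are hypothetical and not taken from any
existing library. A cursor is just a byte offset that always
sits on the first byte of a character, so stepping to the
next character costs a few byte comparisons instead of a
rescan from the start of the string.

```python
# Illustrative sketch only: hypothetical names, Python standing in
# for Scheme. A cursor is a byte offset pointing at a lead byte.

def cursor_next(buf: bytes, cur: int) -> int:
    """Advance the cursor to the start of the next code point."""
    cur += 1
    # Skip continuation bytes, which all have the form 10xxxxxx.
    while cur < len(buf) and (buf[cur] & 0xC0) == 0x80:
        cur += 1
    return cur

def cursor_ref(buf: bytes, cur: int) -> str:
    """Decode the single code point that starts at the cursor."""
    return buf[cur:cursor_next(buf, cur)].decode("utf-8")

def chars_by_cursor(buf: bytes) -> list:
    """Visit every character once; total work is linear in len(buf)."""
    out = []
    cur = 0
    while cur < len(buf):
        out.append(cursor_ref(buf, cur))
        cur = cursor_next(buf, cur)
    return out

text = "na\u00efve \u65e5\u672c"
chars = chars_by_cursor(text.encode("utf-8"))
```

The same traversal written with character indices would
rescan from the beginning on every step, which is where the
quadratic cost of the index idiom comes from.
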
--
Alex