
Re: [Chicken-users] ditching syntax-case modules for the utf8 egg


From: Alex Shinn
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Tue, 11 Mar 2008 17:05:27 +0900
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.50 (darwin)

>>>>> "Robin" == Robin Lee Powell <address@hidden> writes:

    Robin> Replying to very old mail.  :)

Very old, but still relevant, and I was actually about to
bring this up again myself.  I have the new code ready to
check in (complete with Unicode 5.0 updates), but want to
reconfirm.

    Robin> On Thu, Jun 28, 2007 at 12:25:54PM +0900, Alex Shinn wrote:

    >> I'm considering changing the utf8 egg to no longer
    >> use syntax-case modules, so that it would work like
    >> the numbers egg.
    [...]

    Robin> Perhaps I'm missing something, but I *really*
    Robin> don't like the idea of having to manage my own
    Robin> personal versions of spiffy and all of its
    Robin> dependencies just to get utf8 support.  That
    Robin> seems really sub-optimal.

I'm not entirely sure why you think spiffy would need two
versions.

First, are you referring to the syntax-case version of utf8
or the new version that would globally redefine string
operations?

Second, does spiffy itself do any operations on strings
where it would matter whether the strings were utf8 or not?
I somehow doubt this.  I could easily see individual
extensions or servlets needing this behavior, but in either
the old or the new approach those extensions could just use
utf8 themselves, without any need to change spiffy.

I think possibly the utf8 string representation may be
confusing some people though.  Basically, when using the
utf8 egg, all Chicken strings are considered to be utf8
encoded.  However, code compiled with the default
integrations will treat them essentially as byte-vectors,
ignoring utf8 semantics.  For parsing URIs, or handling
pathnames, or most common tasks that spiffy would do,
*either* assumption works.
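
As a rough illustration of why either assumption works for
delimiter-based parsing, here is a small sketch in Python
rather than Chicken.  Python bytes stand in for the
byte-vector view and Python str for the utf8-aware view; the
path and variable names are hypothetical and come from
neither spiffy nor the utf8 egg.

```python
# Illustration only: bytes model code compiled with byte-oriented string
# operations, str models code that sees utf8 semantics.

path = "/docs/日本語/index.html"   # hypothetical request path
raw = path.encode("utf-8")         # the single shared representation

# The two views disagree about length...
byte_len = len(raw)    # counts bytes
char_len = len(path)   # counts characters

# ...but splitting on an ASCII delimiter yields the same pieces either way,
# because UTF-8 never reuses ASCII byte values inside a multibyte sequence.
byte_parts = [part.decode("utf-8") for part in raw.split(b"/")]
char_parts = path.split("/")

print(byte_len, char_len)
print(byte_parts == char_parts)
```

The views only diverge for operations that count or index
characters, such as length or substring by position.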

Individual servlets could then be compiled without the
default integrations, and they would see the utf8 semantics
if the utf8 egg were used.  Both types of libraries can
co-exist, and they would all share the same string
representation.

This is similar to how the numbers egg works - some code is
compiled with (declare fixnum), assuming all numbers are
small integers, and some code references the global bindings
for arithmetic operations, which can be extended by eggs
like numbers.  Subtle errors can result: code may work with
the numbers egg but break without it (or vice versa),
because without the egg large numbers become inexact, while
with it they stay exact (as bignums).  These can be avoided
manually.  Similar problems may arise when using utf8
globally, but it's convenient and some people want it - it
lets you turn on Unicode support optionally and globally for
all of Chicken, while still allowing optimizations in
individual eggs.
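
The exact/inexact pitfall can be sketched the same way, again
in Python as a stand-in: float models arithmetic without the
numbers egg, and Python's arbitrary-precision int models
arithmetic with it.  This is an analogy under those
assumptions, not Chicken's actual behavior.

```python
# Illustration only: float stands in for large numbers turning inexact,
# int stands in for bignum-backed exact arithmetic.

big = 2 ** 53   # above this, a double cannot represent every integer

exact_sum = big + 1               # exact integer arithmetic
inexact_sum = float(big) + 1.0    # the added 1 is lost to rounding

# The same comparison gives different answers under the two regimes.
exact_changed = exact_sum != big
inexact_changed = inexact_sum != float(big)

print(exact_changed, inexact_changed)
```

Code that relies on the first comparison being true works
under exact arithmetic and silently misbehaves under inexact
arithmetic.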

But I want to ask again, do people want this, and is it OK
to break compatibility in the current utf8 egg?  Or should
we possibly wait to see about the new module system?

-- 
Alex



