Re: [Chicken-users] UTF-8 support in eggs


From: Mario Domenech Goulart
Subject: Re: [Chicken-users] UTF-8 support in eggs
Date: Tue, 08 Jul 2014 14:00:13 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

Hi Alex,

On Tue, 8 Jul 2014 12:42:21 +0900 Alex Shinn <address@hidden> wrote:

> On Tue, Jul 8, 2014 at 5:58 AM, Mario Domenech Goulart <address@hidden> wrote:
>
>     I want to use some eggs and I need them to handle UTF-8. By
>     "handle UTF-8" I mean "treat strings as UTF-8", so that
>     
>     (string (string-ref "ç" 0)) => "ç"
>     
>     for example.
>     
>     CHICKEN's string-related procedures "accept" UTF-8 strings, but it
>     doesn't mean they will correctly handle them.
>
> It also doesn't necessarily mean they will mishandle them.
> It might help the discussion if we had a list of eggs which
> are known to break on UTF-8 inputs.

Right.  I don't know an easy way to obtain that list.  Maybe by grepping
the henrietta-cache for encoding-sensitive operations like string-ref,
string-set!, etc., and then checking whether the eggs that match already
use utf8?
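
A minimal sketch of such a check, written in portable Scheme.  It is not
an existing tool: the names sensitive-ops, collect-symbols, read-forms
and suspect-ops are hypothetical, and the contents of sensitive-ops
beyond string-ref and string-set! are assumptions.  It reads a source
file with `read', so files that rely on reader extensions may not parse,
and it treats any occurrence of the symbol utf8 as "already uses utf8".

```scheme
;; Assumed watch list; only string-ref and string-set! come from the
;; discussion, the rest are illustrative guesses.
(define sensitive-ops
  '(string-ref string-set! string-length substring))

;; Walk DATUM (a form returned by `read') and add to ACC every symbol
;; for which (keep? symbol) is true, skipping duplicates.
(define (collect-symbols datum keep? acc)
  (cond ((pair? datum)
         (collect-symbols (cdr datum) keep?
                          (collect-symbols (car datum) keep? acc)))
        ((and (symbol? datum) (keep? datum) (not (memq datum acc)))
         (cons datum acc))
        (else acc)))

;; Read every top-level form from FILE into a list.
(define (read-forms file)
  (call-with-input-file file
    (lambda (port)
      (let loop ((form (read port)) (forms '()))
        (if (eof-object? form)
            (reverse forms)
            (loop (read port) (cons form forms)))))))

;; Return the encoding-sensitive operations mentioned in FILE, or the
;; empty list if the file mentions the symbol utf8 anywhere.
(define (suspect-ops file)
  (let ((forms (read-forms file)))
    (if (pair? (collect-symbols forms (lambda (s) (eq? s 'utf8)) '()))
        '()
        (collect-symbols forms (lambda (s) (memq s sensitive-ops)) '()))))
```

Calling (suspect-ops "path/to/some-egg.scm") on each source file in the
cache would give a first, rough list of candidates to inspect by hand.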

>     I need UTF-8 support in some eggs that currently don't handle
>     UTF-8.  Assuming we won't have proper UTF-8 support in the core
>     anytime soon, what's the best way to approach this? Here are some
>     options I thought of (I should say up front that none sounds good to me):
>     
>     1. Have <egg> and <egg>-utf8 variants. Or, more generally, <egg>
>        and <egg>-<encoding> variants. That would turn our coop into a
>        disgusting mess and would be a nightmare for egg authors.
>     
>     2. Make eggs install <egg> and <egg>-<encoding> modules. So, you
>        can (use <egg>) or (use <egg>-<encoding>) depending on your
>        needs.
>     
>     3. Manually forking and patching eggs on the user end.
>
> 4. Make affected eggs functors on the set of basic string operations.

Wouldn't 4 be one way of implementing 2?
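
To make the relation between options 2 and 4 concrete, here is a minimal
sketch using CHICKEN's functor syntax.  All names in it
(initials-functor, byte-string-ops, str-ref, char->str, first-char) are
hypothetical; it only shows the shape of the idea, not how any existing
egg is written.

```scheme
;; The "egg" is a functor parameterised over a module S that must
;; export the basic string operations the egg depends on.
(functor (initials-functor (S (str-ref char->str))) (first-char)
  (import scheme S)
  ;; Return the first character of STR as a one-character string,
  ;; using whatever notion of "character" S provides.
  (define (first-char str)
    (char->str (str-ref str 0))))

;; Byte-oriented instantiation built on the core procedures.
(module byte-string-ops (str-ref char->str)
  (import scheme)
  (define (str-ref s i) (string-ref s i))
  (define (char->str c) (string c)))

;; Instantiating the functor yields an ordinary module.
(module initials = (initials-functor byte-string-ops))

;; A UTF-8 variant would be a second module with the same two exports,
;; defined with the utf8 egg's versions of string-ref and string:
;; (module initials-utf8 = (initials-functor utf8-string-ops))
```

After (import initials), first-char is available as usual.  An egg built
this way could install both instantiations, which is what option 2
describes from the user's point of view.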

> The same approaches also apply to eggs needing the full numeric tower,
> though with UTF-8 there's less chance of breakage when mixing eggs
> which do and don't use the utf8 egg.

Yeah, the numeric tower is indeed another can of worms.

Best wishes.
Mario
-- 
http://parenteses.org/mario


