Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
From: John Cowan
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Tue, 18 Mar 2008 16:53:22 -0400
User-agent: Mutt/1.5.13 (2006-08-11)
Tobia Conforto scripsit:
> Let's see... ASCII is valid UTF-8, so all ASCII external
> representations wouldn't need any encoding or decoding work.
True. However, pure ASCII is less common than people believe, as
indicated by the 59K Google hits for "8-bit ASCII".
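A minimal sketch of both points, in Python purely for illustration (the
helper names is_pure_ascii and is_valid_utf8 are hypothetical, not part
of the utf8 egg): true 7-bit ASCII decodes identically as ASCII and as
UTF-8, while a typical "8-bit ASCII" byte such as 0xE9 (e-acute in
Latin-1) is neither ASCII nor valid UTF-8.

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# True ASCII: needs no conversion to be treated as UTF-8.
ascii_bytes = b"hello"

# "8-bit ASCII": here Latin-1, where e-acute is the single byte 0xE9.
latin1_bytes = "caf\u00e9".encode("latin-1")
```

Here is_valid_utf8(latin1_bytes) is false because 0xE9 announces a
three-byte UTF-8 sequence that never arrives, so such input has to be
transcoded rather than passed through.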
> Most recent formats and protocols require or strongly recommend UTF-8
> (see XML etc.) so those wouldn't need any encoding/decoding either.
Well, there's an awful lot of content on the Internet and on local hard
disks that is neither true ASCII nor UTF-8. In particular, UTF-16 is
the usual representation of Unicode on Windows, and various non-Unicode
character sets are the usual representation of text on Windows, and
consequently on the Web too. UTF-8 is something of an oddity there.
> As far as internal representations covering all Unicode go, UTF-8
> looks like the one incurring in the less overhead, in the general case.
> Not to mention the less work on the developer side, as we already have
> the utf8 egg!
I'm fine with using UTF-8 as our internal representation.
> Unicode/UTF8-aware string operations will perform a correct
> replacement and insert the two extra bytes, if the source string
> really is plain ASCII. If the source string (or just the part near
> the change) is not correct UTF-8 or ASCII to begin with, they will
> raise an error.
You're right.
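The behaviour described in the quoted paragraph can be sketched as
follows, again in Python for illustration only. The function
utf8_set_char is a hypothetical stand-in, not the utf8 egg's API, and it
simplifies by validating the whole buffer rather than just the part near
the change.

```python
def utf8_set_char(buf: bytes, index: int, ch: str) -> bytes:
    """Replace the character at position `index` (counted in characters,
    not bytes) and return the re-encoded UTF-8 buffer."""
    # Strict decoding raises UnicodeDecodeError if buf is not valid UTF-8.
    chars = list(buf.decode("utf-8"))
    chars[index] = ch
    return "".join(chars).encode("utf-8")


before = b"cost: E5"                        # plain ASCII, 8 bytes
after = utf8_set_char(before, 6, "\u20ac")  # replace 'E' with the euro sign
```

The euro sign U+20AC takes three bytes in UTF-8, so replacing a
one-byte ASCII character with it grows the buffer by two bytes. Calling
utf8_set_char on b"caf\xe9" raises UnicodeDecodeError instead of
silently producing a corrupt string.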
--
Overhead, without any fuss, the stars were going out.
--Arthur C. Clarke, "The Nine Billion Names of God"
John Cowan <address@hidden>