
Re: [Chicken-users] ditching syntax-case modules for the utf8 egg


From: John Cowan
Subject: Re: [Chicken-users] ditching syntax-case modules for the utf8 egg
Date: Tue, 18 Mar 2008 14:47:18 -0400
User-agent: Mutt/1.5.13 (2006-08-11)

Tobia Conforto scripsit:

> This discussion has convinced me that from a *practical* point of  
> view, it makes a lot of sense to use the same underlying object for  
> both kinds of operation, instead of copying over the contents every  
> time you want to switch between the two views (as I suppose it happens  
> for example in Java, with strings and byte arrays.)

If we all lived in a UTF-8/LF world exclusively, then that would be fine.
As it is, many of us are not in that world at all, and few of us are in it
exclusively.  So in practice it is necessary to convert between internal
and external encodings anyhow, which involves copying in the general case.
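
As a rough illustration of why conversion implies copying, here is a minimal
sketch in Python (chosen only for illustration; the thread itself concerns
Chicken Scheme). It assumes an external Latin-1/CRLF source and a UTF-8/LF
internal form; the function name is hypothetical.

```python
def external_to_internal(raw: bytes, encoding: str = "latin-1") -> bytes:
    """Convert externally encoded text with CRLF line endings to UTF-8/LF."""
    text = raw.decode(encoding)        # copy: external bytes -> code points
    text = text.replace("\r\n", "\n")  # copy: normalise line endings
    return text.encode("utf-8")        # copy: code points -> UTF-8 bytes


external = "caf\u00e9\r\nna\u00efve\r\n".encode("latin-1")
internal = external_to_internal(external)
```

The input buffer cannot simply be reinterpreted: both the byte sequences for
the non-ASCII characters and the line endings differ between the two forms.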

> Having the string API operate on UTF-8 characters and having a new API  
> to operate on bytes, *both on the same underlying string objects*,  
> will let us have the cake and eat it too, at the expense of changing  
> the meaning of the string API for all existing applications.

Not the *meaning* of it, just the big-O.  "A Lisp programmer is someone
who knows the value of everything and the cost of nothing."
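
To make the change in cost concrete, here is a hedged sketch in Python
(an illustration only, not the utf8 egg's implementation) of indexing by
character into a UTF-8 buffer. The result is the same as indexing a
fixed-width string, but finding it requires a scan from the start, so the
operation is O(n) rather than O(1). The name utf8_char_at is hypothetical,
and the input is assumed to be well-formed UTF-8.

```python
def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the index-th character of a well-formed UTF-8 buffer.

    Scans from the beginning, counting bytes that start a character,
    so the cost grows with the position of the character.
    """
    if index < 0:
        raise IndexError(index)
    seen = -1      # index of the most recent character start found
    start = None   # byte offset where the wanted character begins
    for pos, byte in enumerate(buf):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a character starts here
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return buf[start:].decode("utf-8")  # wanted character is the last one


sample = "a\u4e00b".encode("utf-8")  # 5 bytes, 3 characters
second = utf8_char_at(sample, 1)
```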

> The dynamic nature of Scheme suggests that it will all work
> seamlessly, until someone tries to call a (now Unicode-aware)
> string-length on a string whose UTF-8 structure had been corrupted
> with byte-level operations. At which point a runtime error will
> kindly signal the situation ;-)
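
A small sketch of the scenario described above, written in Python for
illustration. It assumes a character-aware length that decodes strictly;
whether a given Scheme implementation raises an error at this point is an
assumption, and utf8_length is a hypothetical name.

```python
def utf8_length(buf: bytes) -> int:
    """Count characters; strict decoding raises on malformed UTF-8."""
    return len(buf.decode("utf-8"))


buf = bytearray("h\u4e00o".encode("utf-8"))  # 'h', a 3-byte character, 'o'
length_before = utf8_length(buf)

buf[2] = ord("x")  # byte-level write into the middle of the 3-byte character

try:
    utf8_length(buf)
    corrupted_detected = False
except UnicodeDecodeError:
    corrupted_detected = True
```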

In the reverse case, though, the contents of the string will be silently
corrupted, as when I change the nth character of an ASCII string to
#\U4E00 and the n+1th and n+2th characters are destroyed.
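
The silent case can be sketched as follows, in Python for illustration only.
The sketch assumes a naive in-place "set character" that writes the UTF-8
encoding over the existing bytes without resizing the buffer; the function
name is hypothetical and does not correspond to any real string API.

```python
def set_char_in_place(buf: bytearray, index: int, ch: str) -> None:
    """Overwrite bytes starting at offset `index` with ch's UTF-8 encoding.

    The buffer is not resized, so a multi-byte character overwrites the
    bytes that follow the target position.
    """
    encoded = ch.encode("utf-8")
    if index < 0 or index + len(encoded) > len(buf):
        raise IndexError("encoded character does not fit in the buffer")
    buf[index:index + len(encoded)] = encoded  # same-length slice: size unchanged


buf = bytearray(b"hello")
set_char_in_place(buf, 1, "\u4e00")  # U+4E00 needs three bytes in UTF-8
result = buf.decode("utf-8")
```

No error is raised here: the buffer is still valid UTF-8, but the two
characters after the target are gone and the character count has dropped
from five to three.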

-- 
Andrew Watt on Microsoft:                       John Cowan
Never in the field of human computing           address@hidden
has so much been paid by so many                http://www.ccil.org/~cowan
to so few! (pace Winston Churchill)



