chicken-users

Re: [Chicken-users] problems string-trimming on UTF8


From: Kristian Lein-Mathisen
Subject: Re: [Chicken-users] problems string-trimming on UTF8
Date: Mon, 30 Jan 2017 09:33:09 +0100


Oh, I see! I thought utf8 would 'patch up' the existing procedures. Now I have a fix, thanks Peter!

K.

On Fri, Jan 27, 2017 at 2:50 PM, Peter Bex <address@hidden> wrote:
On Fri, Jan 27, 2017 at 02:36:55PM +0100, Kristian Lein-Mathisen wrote:
> Oh no, what happened?
>
> $ csi -R utf8 -R srfi-13 -p '(string-trim-right "Zazà")'
> Zaz�
>
>
> utf8 doesn't seem to do it! But utf8, at least, gets the string-length
> right:
>
> $ csi -R srfi-13 -p '(string-length "Zazà")'
> 5
> $ csi -R utf8 -R srfi-13 -p '(string-length "Zazà")'
> 4
>
> It took me a while to figure out what was going on. These are the bytes of
> Zazà:
>
> $ printf 'Zazà' | xxd
> 00000000: 5a61 7ac3 a0                             Zaz..

The srfi-13 module is pretty FUBAR: it and srfi-14 both assume
latin1.  That's why the utf8 egg ships with a utf8-srfi-13
module:

$ csi -R utf8 -R utf8-srfi-13 -p '(string-trim-right "Zazà")'
Zazà
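
For anyone who wants to see the failure at the byte level, here is a rough sketch in Python (not CHICKEN) of what a latin1-minded trim presumably does to this string. It assumes the whitespace set includes byte 0xA0, the Latin-1 no-break space, as SRFI 14 lists for char-set:whitespace; whether srfi-13 takes exactly this path internally is an assumption, and latin1_trim_right is a made-up name for illustration only.

```python
# Hypothetical helper: trim trailing "whitespace" byte by byte, the way a
# latin1-only implementation would. 0xA0 (no-break space) is assumed to be
# in the whitespace set, per SRFI 14's Latin-1 definition.
LATIN1_WHITESPACE = b" \t\n\r\x0b\x0c\xa0"


def latin1_trim_right(data: bytes) -> bytes:
    """Strip trailing bytes that latin1 would treat as whitespace."""
    return data.rstrip(LATIN1_WHITESPACE)


if __name__ == "__main__":
    raw = "Zazà".encode("utf-8")          # same 5 bytes as the xxd dump: 5a 61 7a c3 a0
    print(len(raw), len(raw.decode("utf-8")))  # byte count vs. character count

    trimmed = latin1_trim_right(raw)
    print(trimmed.hex())                  # the trailing a0 is gone, c3 is left dangling
    print(trimmed.decode("utf-8", errors="replace"))  # lone c3 is not valid UTF-8
```

The second byte of the encoded à happens to be 0xA0, so a byte-wise trim eats half of the character and leaves an invalid sequence behind, which would explain the replacement character in the output above.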

Cheers,
Peter

