Re: [Chicken-users] problems string-trimming on UTF8
From: Peter Bex
Subject: Re: [Chicken-users] problems string-trimming on UTF8
Date: Fri, 27 Jan 2017 14:50:04 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

On Fri, Jan 27, 2017 at 02:36:55PM +0100, Kristian Lein-Mathisen wrote:
> Oh no, what happened?
>
> $ csi -R utf8 -R srfi-13 -p '(string-trim-right "Zazà")'
> Zaz�
>
>
> utf8 doesn't seem to do it! But utf8, at least, gets the string-length
> right:
>
> $ csi -R srfi-13 -p '(string-length "Zazà")'
> 5
> $ csi -R utf8 -R srfi-13 -p '(string-length "Zazà")'
> 4
>
> It took me a while to figure out what was going on. These are the bytes of
> Zazà:
>
> $ printf 'Zazà' | xxd
> 00000000: 5a61 7ac3 a0 Zaz..
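
The srfi-13 module is pretty FUBAR: both it and srfi-14 assume Latin-1.
In UTF-8, "à" is the two bytes c3 a0, and a0 is the no-break space in
Latin-1, so string-trim-right presumably treats it as trailing
whitespace and strips it, leaving a dangling c3. That's why the utf8
egg ships with a utf8-srfi-13 module:
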
$ csi -R utf8 -R utf8-srfi-13 -p '(string-trim-right "Zazà")'
Zazà
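
A rough shell sketch of what probably goes wrong at the byte level. It
does not use CHICKEN at all: it only mimics a right-trim that treats
the Latin-1 whitespace bytes (09-0d, 20 and a0) as trimmable, which is
an assumption about what srfi-13 does by default. The helper name
trim_right_latin1_hex is made up for this illustration.

```shell
# Hypothetical helper: right-trim a hex-encoded byte string, treating the
# Latin-1 whitespace bytes 09-0d, 20 and a0 (no-break space) as trimmable.
trim_right_latin1_hex() {
  h=$1
  while :; do
    case $h in
      # The last two hex digits are the last byte; drop it if it is whitespace.
      *a0|*20|*09|*0a|*0b|*0c|*0d) h=${h%??} ;;
      *) break ;;
    esac
  done
  printf '%s\n' "$h"
}

# UTF-8 bytes of "Zazà", written with octal escapes so the script itself
# stays ASCII: \303\240 is c3 a0, the encoding of "à".
full=$(printf 'Zaz\303\240' | od -An -tx1 | tr -d ' \n')

echo "$full"                    # 5a617ac3a0
trim_right_latin1_hex "$full"   # 5a617ac3
```

The trimmed result ends in a lone c3, a UTF-8 lead byte with no
continuation byte after it, which a terminal renders as the replacement
character seen in the "Zaz�" output above.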
Cheers,
Peter