Dear CHICKEN mailing list,
I encountered a strange issue with string-trim-right and a UTF-8 string:
$ csi -R srfi-13 -p '(string-trim "Zazà")'
$ csi -R srfi-13 -p '(string-trim-right "Zazà")'
$ csi -R utf8 -R srfi-13 -p '(string-trim-right "Zazà")'
Loading utf8 doesn't seem to fix it! But utf8, at least, gets the string-length right:
$ csi -R srfi-13 -p '(string-length "Zazà")'
5
$ csi -R utf8 -R srfi-13 -p '(string-length "Zazà")'
It took me a while to figure out what was going on. These are the bytes of Zazà:
00000000: 5a61 7ac3 a0 Zaz..
So it seems like string-trim-right just looks at the last byte, \xa0, which on its own is a non-breaking space (in Latin-1), and then drops it. It should be looking at the last UTF-8 code point instead.
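To sanity-check that reading outside of CHICKEN, here is a small Python sketch. It does not call srfi-13 or the utf8 egg. It only simulates a right-trim that treats every byte as its own Latin-1 character, which is my assumption about what is happening, not something I have confirmed in the srfi-13 source. The variable names are made up for illustration.

```python
# Sketch only: simulates a byte-oriented right-trim.
# This is NOT srfi-13's implementation, just a model of the suspected behaviour.

s = "Zazà"
raw = s.encode("utf-8")                      # the 5 bytes shown in the hexdump
print(" ".join(f"{b:02x}" for b in raw))     # 5a 61 7a c3 a0

# Assumption: without utf8, each byte is seen as one (Latin-1) character.
byte_view = raw.decode("latin-1")
last = byte_view[-1]
print(hex(ord(last)), last.isspace())        # 0xa0 True  (U+00A0, no-break space)

# Trimming whitespace in that byte view removes the trailing 0xa0 byte
# and leaves a dangling 0xc3 lead byte behind.
trimmed = byte_view.rstrip().encode("latin-1")
print(trimmed)                               # b'Zaz\xc3'

# A code-point-aware trim sees "à" as the last character and removes nothing.
print(s.rstrip() == s)                       # True
```

If the same thing happens in string-trim-right, the result would be a string ending in a lone 0xc3 byte, which is no longer valid UTF-8.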
I don't know if this is a known bug or if I've come across something undiscovered. I suppose the fix belongs in the utf8 egg.
Thanks!
K.