Re: [3.0] UTF-8 and ${#var} or ${var: -1}
From: Stephane Chazelas
Subject: Re: [3.0] UTF-8 and ${#var} or ${var: -1}
Date: Thu, 2 Sep 2004 14:19:57 +0100
User-agent: Mutt/1.5.6i
On Thu, Sep 02, 2004 at 01:25:00PM +0100, Tim Waugh wrote:
[...]
> +#define MBSLEN(s,n) (((s) && (s)[0]) ? ((s)[1] ? ((s)[2] ?
> mbstowcs(NULL,s,n) : 2) : 1) : 0)
[...]
That doesn't work for strings shorter than 3 bytes, because the
optimisation above (returning the byte count without calling
mbstowcs) can't be applied to multibyte strings:
$ a=$'\303\251' LC_ALL=fr.UTF-8 truss -t '' -u '*:mbs*' ./bash -c 'echo ${#a}'
2
$ a=$'a\303\251' LC_ALL=fr.UTF-8 truss -t '' -u '*:mbs*' ./bash -c 'echo ${#a}'
-> libc:mbstowcs(0x0, 0xc6278, 0x3, 0x7efefeff)
<- libc:mbstowcs() = 2
2
(in the first case, mbstowcs is not called).
Fix: just disable the optimisation.
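
A minimal sketch of what disabling the optimisation could look like.
MBSLEN is copied from the quoted patch; MBSLEN_NOOPT and
compare_lengths are hypothetical names used only for illustration, not
taken from the bash sources. It assumes a POSIX mbstowcs, which
returns the number of wide characters when the destination is NULL.

```c
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>

/* Macro from the quoted patch: strings of 0, 1 or 2 bytes never reach
   mbstowcs(), so the result is the byte count. */
#define MBSLEN(s,n) (((s) && (s)[0]) ? ((s)[1] ? ((s)[2] ? mbstowcs(NULL,s,n) : 2) : 1) : 0)

/* Hypothetical variant with the shortcut removed: every non-empty
   string is measured by mbstowcs() in the current locale. */
#define MBSLEN_NOOPT(s,n) (((s) && (s)[0]) ? mbstowcs(NULL,s,n) : 0)

/* Print both results for one string so the two can be compared.
   mbstowcs() yields (size_t)-1 if the bytes are invalid in the
   current locale. */
void compare_lengths(const char *s)
{
    size_t with_opt = MBSLEN(s, 0);
    size_t without_opt = MBSLEN_NOOPT(s, 0);

    printf("with shortcut: %zu, always mbstowcs: %zu\n",
           with_opt, without_opt);
}
```

Calling setlocale(LC_ALL, "") under a UTF-8 locale and then
compare_lengths("\303\251") should show the mismatch from the truss
trace above: the shortcut reports 2, while the mbstowcs path counts a
single character.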
--
Stephane