Re: bug with case conversion of UTF-8 characters
From: Stephane Chazelas
Subject: Re: bug with case conversion of UTF-8 characters
Date: Mon, 2 Oct 2017 17:49:46 +0100
User-agent: Mutt/1.5.24 (2015-08-30)
2015-01-22 14:43:00 +0000, Stephane Chazelas:
[...]
> Bash Version: 4.3
> Patch Level: 30
> Release Status: release
>
> (Debian unstable amd64)
>
> $ LC_ALL=tr_TR.UTF-8 bash -c 'typeset -l a; a=İ; echo $a' | hd
> 00000000 69 b0 0a |i..|
> 00000003
[...]
Hi. While that particular bug seems to have been fixed in 4.4,
it looks like there's still a problem in Turkish locales,
where the uppercase of i is İ and the lowercase of I is ı:
$ X=AEIOU LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X,,}"'
aeIou
$ X=aeiou LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X^^}"'
AEiOU
The same issue occurs with typeset -l/u.
By comparison, awk gives the expected results:
$ X=aeiou LC_ALL=tr_TR.UTF-8 awk 'BEGIN{print toupper(ENVIRON["X"])}'
AEİOU
$ X=AEIOU LC_ALL=tr_TR.UTF-8 awk 'BEGIN{print tolower(ENVIRON["X"])}'
aeıou
The reverse mappings (İ to i, ı to I) are OK in bash:
$ X=AEİOU LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X,,}"'
aeiou
$ X=aeıou LC_ALL=tr_TR.UTF-8 bash -c 'echo "${X^^}"'
AEIOU
nocasematch seems to be OK as well.
$ bash --version
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
(on Debian).
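Until this is fixed, one possible stopgap is to delegate the conversion to awk, which honoured the Turkish locale in the runs above. This is only a sketch: the helper names to_lower and to_upper are made up for illustration, and it assumes the awk on the system is locale-aware like the one used above (some awk implementations may not be).

```shell
# Hypothetical helpers (names are not from the original report).
# They hand case conversion to awk instead of using bash's
# ${X,,} / ${X^^}, which mishandle I and i in tr_TR.UTF-8.

to_lower() {
    # Pass the string through the environment, as in the awk tests above.
    X=$1 awk 'BEGIN { print tolower(ENVIRON["X"]) }'
}

to_upper() {
    X=$1 awk 'BEGIN { print toupper(ENVIRON["X"]) }'
}

# Possible usage, assuming the tr_TR.UTF-8 locale is installed:
#   (LC_ALL=tr_TR.UTF-8; export LC_ALL; to_lower AEIOU)
```

The string is passed via the environment rather than on awk's command line (-v), so that backslashes in it are not treated as escape sequences.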
--
Stephane