bug#15199: UTF-16 surrogate pair handling in grep -i option
From: Paolo Bonzini
Subject: bug#15199: UTF-16 surrogate pair handling in grep -i option
Date: Tue, 27 Aug 2013 17:53:25 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130805 Thunderbird/17.0.8
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 20/08/2013 17:11, Corinna Vinschen wrote:
> That's what I did when I started to write this patch, but then I
> decided against it for the following reason:
>
> The implementation of mbrtowc, wcrtomb and towlower using UTF-16
> wchar_t works *only* in the Cygwin/Newlib-provided functions in
> exactly the way used in this patch. I'm not aware that any other
> platform provides an equivalent implementation, even if wchar_t is
> 2 bytes. Thus, the assumption that the code works in all cases in
> which sizeof (wchar_t) == 2, is wrong. It would, for instance,
> not work with the Windows implementation of wcrtomb, AFAIK.
Right, MSVCRT is exactly what I was thinking about.
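For readers following along, here is a minimal sketch of the arithmetic involved when wchar_t holds UTF-16 code units and a character outside the BMP arrives as two units. It is not taken from the patch under discussion; the helper names (is_high_surrogate, is_low_surrogate, combine_surrogates, split_code_point) are hypothetical and only illustrate what code has to do before and after a case conversion on such a character.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of grep or the patch. */

/* True if u is a UTF-16 high (leading) surrogate, U+D800..U+DBFF. */
static int is_high_surrogate(uint16_t u)
{
  return u >= 0xD800 && u <= 0xDBFF;
}

/* True if u is a UTF-16 low (trailing) surrogate, U+DC00..U+DFFF. */
static int is_low_surrogate(uint16_t u)
{
  return u >= 0xDC00 && u <= 0xDFFF;
}

/* Combine a valid surrogate pair into one code point (U+10000..U+10FFFF). */
static uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
  assert(is_high_surrogate(hi) && is_low_surrogate(lo));
  return 0x10000u
         + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}

/* Split a supplementary-plane code point back into a surrogate pair. */
static void split_code_point(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
  assert(cp >= 0x10000u && cp <= 0x10FFFFu);
  cp -= 0x10000u;
  *hi = (uint16_t)(0xD800 + (cp >> 10));
  *lo = (uint16_t)(0xDC00 + (cp & 0x3FF));
}
```

As an example, U+10400 (DESERET CAPITAL LETTER LONG I) is the pair 0xD801 0xDC00, and its lowercase counterpart U+10428 is 0xD801 0xDC28. Whether a given libc's mbrtowc, wcrtomb and towlower cooperate with this scheme is exactly the platform-specific question raised in the quoted text.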
> I'm not strongly opposed to changing this, but IMHO, to be on the
> safe side, this code should only be activated on a case by case
> basis, so only for Cygwin for now. Same with a potential fix to
> the regex compiler, for which I have no idea how to do it, yet :(
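A case-by-case activation along those lines could look roughly like the sketch below. The macro name USE_UTF16_SURROGATES and the function utf16_surrogates_enabled are hypothetical, made up for illustration; only __CYGWIN__ is a real predefined macro. The point is that the code path is keyed on the platform rather than on sizeof (wchar_t) == 2.

```c
/* Hypothetical feature macro; the name is an assumption, not from the patch.
   Enable the surrogate-pair code path only on platforms known to implement
   mbrtowc/wcrtomb/towlower in the required way. */
#if defined __CYGWIN__
# define USE_UTF16_SURROGATES 1
#else
# define USE_UTF16_SURROGATES 0
#endif

/* Runtime-visible view of the compile-time decision, handy for tests. */
static int utf16_surrogates_enabled(void)
{
  return USE_UTF16_SURROGATES;
}
```

Further platforms would be added to the #if condition one at a time, once their wide-character functions are confirmed to behave the same way.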
Feel free to bug me on IRC if I can be of any help.
Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
iQIcBAEBAgAGBQJSHMt1AAoJEBvWZb6bTYbySD8P/3vFn0FEGTQDpoHkUK0aysNH
ShyBFQ9AywNr0WYCWc+lg7uq9NpjNnonXtawOvoq+RYCNCqPJ16/fjqDe2bFGrR6
qifSuVQioK2D//r1Y7FfMANn1lzFfiBrhIpDBT/bLP/1i57VjbivZACgFdLnyTTN
olN9uNIl0EebVLkMdiF363DWP7ZmJh8pfi8C1cW0VeTT77kgYTRppFaQfuY9K1SA
2bQj8hzKqyzJkXkHTPow5cvby9moZ/wKSjjduYXxpNNRvn9KGY67E7nv/s/FDxHq
R6KzttHCCWVprlHCE2laykQY4sawpkMkEMoIYWjXIyuw6q7/DiLPxY3AnwE8PMLR
u0Vv1SDLbvCiCx+FZgCrChP3lXojKqi1QNyYdcwgBLracYNw4Z5ASatol7yYKJJW
IozVn4iWkp4sK/lZlOmWykNdNzA9iLTTrw4BHdCxBBxtSl0/jjaTCzXp6QcVXYhe
2Ey6RHikOkF3Gn01CuaAvqv06oJYFnBROw+zimb4lZH0TgEyQxaxmlkutF2UKwLs
HYEx/GJtwLjpExEjdpNG8ZD6wZ3+TO2oBVat1zZHq8AsJy58RK6I0P7Iwy4T7kDu
yO+8eLxLkJ2dFphW1WHULl+AR46GE7sG1kz3rZvGI6Rj5UDhCdCkXK6G4nmPwnDE
NNzyQOieb3Q9EWyrsy1g
=LJSZ
-----END PGP SIGNATURE-----