Re: tr is handling bytes not characters
From: Eric Blake
Subject: Re: tr is handling bytes not characters
Date: Thu, 05 Feb 2009 05:55:52 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.19) Gecko/20081209 Thunderbird/2.0.0.19 Mnenhy/0.7.5.666
According to Nick Demou on 2/5/2009 4:20 AM:
> And now about the bug report. It's about "tr". I realized that tr was
> mostly failing when working on utf-8 input.
Thanks for the report. It is a known problem that coreutils does not yet
properly support multi-byte characters (this includes UTF-8). No one has
yet contributed a patch that supports them efficiently without penalizing
the maintenance or performance of the single-byte code paths, while still
being useful across the wide range of coreutils programs that need it (tr
is not the only one affected). Several distros have written add-on patches
that attempt to provide multibyte support, but none of those patches has
been incorporated upstream.
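
To make the byte-versus-character distinction concrete, here is a minimal
Python sketch. It is not coreutils code; the function names
translate_bytewise and translate_charwise are hypothetical, and the model
assumes both sets encode to the same number of bytes. It only imitates what
a translation table indexed by byte does to UTF-8 input.

```python
def translate_bytewise(text: str, set1: str, set2: str) -> str:
    """Simplified model of a byte-oriented translator (an assumption,
    not the actual tr implementation): the UTF-8 bytes of set1 are
    mapped one-to-one onto the UTF-8 bytes of set2."""
    b1 = set1.encode("utf-8")
    b2 = set2.encode("utf-8")
    if len(b1) != len(b2):
        raise ValueError("this sketch needs sets of equal byte length")
    table = bytes.maketrans(b1, b2)
    translated = text.encode("utf-8").translate(table)
    # errors="replace" keeps the sketch from raising if the byte-wise
    # mapping happens to produce an invalid UTF-8 sequence.
    return translated.decode("utf-8", errors="replace")


def translate_charwise(text: str, set1: str, set2: str) -> str:
    """Character-aware translation: whole code points are mapped."""
    return text.translate(str.maketrans(set1, set2))


# U+00E4 (a with diaeresis) encodes as C3 A4; U+00E9 (e with acute) as C3 A9.
# U+00A4 (currency sign) encodes as C2 A4 and shares the byte A4.
sample = "\u00e4\u00a4"

# Byte-wise: A4 -> A9 everywhere, so the currency sign (C2 A4) turns
# into C2 A9, which is U+00A9 (copyright sign), although it was never
# listed in set1.
bytewise = translate_bytewise(sample, "\u00e4", "\u00e9")

# Character-wise: only U+00E4 is replaced; U+00A4 is left alone.
charwise = translate_charwise(sample, "\u00e4", "\u00e9")
```

In this model the requested replacement appears to work, yet an unrelated
character that merely shares a trailing byte is altered as well, which is
the kind of failure a multi-byte-aware implementation has to avoid.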
--
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden