Re: tr is handling bytes not characters
From: Nick Demou
Subject: Re: tr is handling bytes not characters
Date: Fri, 6 Feb 2009 15:22:25 +0200

On Thu, Feb 5, 2009 at 2:55 PM, Eric Blake <address@hidden> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> According to Nick Demou on 2/5/2009 4:20 AM:
>> And now about the bug report. It's about "tr". I realized that tr was
>> mostly failing when working on utf-8 input.
>
> Thanks for the report. It is a known problem that coreutils does not yet
> properly support multi-byte characters (this includes UTF-8), because no
> one has yet contributed a patch that efficiently supports this without
> penalizing maintenance or performance of single-byte code paths, while
> still useful across the wide range of coreutils that need it

Thanks for the info, Eric. I was almost sure this would be the case. In
fact, I don't consider this the main topic of my bug report. The
main topic for me is the documentation: the man and info pages don't
make it clear that UTF-8 is not supported. I believe that others after
me will spend a lot of time just to realize that "it's just a missing
feature". Do you have any thoughts regarding my suggestions on the
documentation?
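
For anyone who finds this thread later, here is a rough sketch of what I
mean by "bytes not characters". It is only a model written in Python, not
tr's actual code: the function names are made up for illustration, and I am
assuming the second set gets padded by repeating its last element when it
is shorter than the first. It compares translating the Greek letter alpha
to "a" one byte at a time against doing it one character at a time.

```python
# Illustrative model only; this is NOT the coreutils implementation.
# tr_bytewise and tr_charwise are hypothetical names used for this sketch.

def tr_bytewise(data: bytes, set1: bytes, set2: bytes) -> bytes:
    """Translate byte by byte, ignoring character boundaries."""
    # Assumption: a short set2 is padded by repeating its last byte.
    set2 = (set2 + set2[-1:] * (len(set1) - len(set2)))[:len(set1)]
    return data.translate(bytes.maketrans(set1, set2))

def tr_charwise(text: str, set1: str, set2: str) -> str:
    """Translate character by character (what a UTF-8 user expects)."""
    return text.translate(str.maketrans(set1, set2))

alpha_beta = "\u03b1\u03b2"            # Greek alpha followed by beta
raw = alpha_beta.encode("utf-8")       # 4 bytes: CE B1 CE B2

# Byte-wise: both bytes of alpha map to "a", and so does the shared
# lead byte of beta, leaving a stray continuation byte behind.
broken = tr_bytewise(raw, "\u03b1".encode("utf-8"), b"a")

# Character-wise: only alpha is replaced; beta is left alone.
expected = tr_charwise(alpha_beta, "\u03b1", "a")

try:
    broken.decode("utf-8")
    still_valid_utf8 = True
except UnicodeDecodeError:
    still_valid_utf8 = False
```

In this model the byte-wise result is no longer valid UTF-8, which is the
kind of surprise I think the documentation should warn about.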
> --
> Don't work too hard, make some time for fun as well!
:) REALLY good advice Eric.
--
"The software is licensed, not sold" -- MICROSOFT LICENSE TERMS