
From: Assaf Gordon
Subject: Re: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr
Date: Wed, 17 Jan 2018 17:38:34 -0700
User-agent: NeoMutt/20170113 (1.7.2)

Hello,

On Wed, Jan 17, 2018 at 02:53:21PM -0800, Eric Fischer wrote:
> * My tr will not remove bytes from the middle of characters
> [...]
> is arguably an error in the test, because POSIX specifies that octal
> escapes represent characters, not bytes.

Please see the previous discussion here:
https://lists.gnu.org/r/coreutils/2017-09/msg00026.html

There it was decided that backward compatibility (and preventing regressions)
is more important than POSIX's requirements.
Specifying octal values in the SETs always reverts to unibyte processing.
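
To illustrate why byte-wise deletion can split a character, here is a small
Python sketch. It is not the coreutils implementation; the helper names
(delete_bytes, delete_chars, is_valid_utf8) are hypothetical and only mimic
what a unibyte "tr -d '\303'" does to UTF-8 input.

```python
# Illustration only: byte-wise vs. character-wise deletion on UTF-8 text.
# The helper names are hypothetical and are not part of coreutils.

def delete_bytes(data: bytes, octal_escape: str) -> bytes:
    """Delete every occurrence of a single byte given as octal digits, e.g. '303'."""
    target = int(octal_escape, 8)
    return bytes(b for b in data if b != target)

def delete_chars(text: str, char: str) -> str:
    """Delete every occurrence of a whole character."""
    return text.replace(char, "")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

sample = "café".encode("utf-8")        # 'é' is encoded as the bytes 0xC3 0xA9
unibyte = delete_bytes(sample, "303")  # octal 303 is 0xC3, the lead byte of 'é'
multibyte = delete_chars("café", "é")  # removes the whole character

print(unibyte, is_valid_utf8(unibyte))
print(multibyte)
```

The byte-wise result keeps a lone continuation byte and is no longer valid
UTF-8, which is the behaviour the existing test relies on.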

> * Linux and MacOS disagree about whether nonbreaking space is a space or a
> graphic character
[...]
> is a portability problem
> that should probably be solved by the use of a different character in the
> test.

That is a good point. The current tests have comments about the behaviour
under Ubuntu with glibc 2.23. It would be beneficial to clearly document the
behaviour under Mac OS X (and other BSDs) as well.

Also relevant: http://jkorpela.fi/chars/spaces.html .
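
As a neutral reference point when documenting per-platform results, the
following Python sketch prints how the Unicode database and Python's own
predicates classify U+00A0. This is an assumption-laden aid only: it does not
call the C library, so it says nothing about what glibc or Mac OS X report for
this character; that still needs to be checked on each system. The describe()
helper is hypothetical.

```python
# Illustration only: Unicode / Python view of NO-BREAK SPACE.
# This does not query the platform C library's character classes.
import unicodedata

NBSP = "\u00a0"

def describe(ch: str) -> dict:
    """Collect the Unicode name, general category and Python predicates for ch."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "python_isspace": ch.isspace(),
        "python_isprintable": ch.isprintable(),
    }

info = describe(NBSP)
for key, value in info.items():
    print(f"{key}: {value}")
```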

regards,
 - assaf


