coreutils
multibyte support (round 3)


From: Assaf Gordon
Subject: multibyte support (round 3)
Date: Mon, 19 Sep 2016 02:11:29 -0400

Hello,

Updated patch attached.

Improvements since last time
(http://lists.gnu.org/archive/html/coreutils/2016-09/msg00011.html):

1. 'multibyte' and 'mbbuffer' are now in gl/ and behave more like gnulib modules.
Tests cover all items mentioned in Markus Kuhn's UTF-8 decoder page
(https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt).

2. Cygwin/UTF-16 surrogates are handled transparently in 'mbbuffer'.
Applications under Cygwin see 'ucs4_t' values and don't need to worry about
surrogates (although wcwidth() will still present some problems). Tests ensure
that parsing under Cygwin behaves the same as on other systems.

3. 'cut' supports multibyte '-c' and '-n -b' (but not multibyte '-d' yet).
Some tests included.
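
To illustrate item 2, here is a minimal sketch (in Python, not taken from the
patch) of what folding UTF-16 surrogate pairs into single UCS-4 values means.
The function name 'combine_surrogates' is hypothetical, and passing unpaired
surrogates through unchanged is an assumption; 'mbbuffer' may well treat them
differently.

```python
def combine_surrogates(units):
    """Fold a list of UTF-16 code units into UCS-4 code points."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 decoding: 10 bits from each half, plus 0x10000.
            out.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP character, or an unpaired surrogate (assumption: keep as-is).
            out.append(u)
            i += 1
    return out


if __name__ == "__main__":
    # U+1F600 is encoded in UTF-16 as the pair D83D DE00.
    print([hex(cp) for cp in combine_surrogates([0x41, 0xD83D, 0xDE00])])
```

With this kind of folding done once in the buffer layer, callers only ever
deal with whole code points.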

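For item 3, a rough sketch (again Python, not the patch's code) of how '-b',
'-c' and '-n -b' differ on a single low-high range over UTF-8 input. The
helper names are hypothetical, and the '-n' adjustment follows the POSIX
description of that option; whether the patch behaves exactly this way is an
assumption.

```python
def char_spans(data):
    """1-based inclusive (start, end) byte positions of each UTF-8 character."""
    spans, pos = [], 1
    for ch in data.decode("utf-8"):  # assumes valid UTF-8 input
        n = len(ch.encode("utf-8"))
        spans.append((pos, pos + n - 1))
        pos += n
    return spans


def cut_b(data, low, high):
    """'-b low-high': raw bytes, may split a character."""
    return data[low - 1:high]


def cut_c(data, low, high):
    """'-c low-high': positions count characters, not bytes."""
    return data.decode("utf-8")[low - 1:high].encode("utf-8")


def cut_nb(data, low, high):
    """'-n -b low-high': byte positions, adjusted so no character is split."""
    high = min(high, len(data))
    for start, end in char_spans(data):
        if start <= low <= end:
            low = start          # move low back to the character's first byte
        if start <= high <= end and high != end:
            high = start - 1     # move high back to the previous character's end
    if high == 0 or low > high:
        return b""               # the range is dropped entirely
    return data[low - 1:high]


if __name__ == "__main__":
    line = "aéb".encode("utf-8")  # bytes: 61 c3 a9 62
    print(cut_b(line, 1, 2), cut_c(line, 1, 2), cut_nb(line, 1, 2))
```
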

Comments welcome,
 - assaf


Attachment: multibyte-2016-09-19.patch.xz
Description: Binary data

