bug#23234: unexpected results with charset handling in GNU grep 2.23

From:	Paul Eggert
Subject:	bug#23234: unexpected results with charset handling in GNU grep 2.23
Date:	Wed, 6 Apr 2016 18:28:33 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.1

On 04/06/2016 02:04 PM, Eric Blake wrote:

> POSIX ... says that LC_ALL=C is _required_ to treat all 256 byte values as valid characters

Although that was the intent of POSIX, it's not what the current standard says, and it's not what many popular platforms do. Problematic platforms include Fedora 23, where mbrtowc reports an encoding error in the C locale when given a byte outside the range 0-127. This affects many programs other than 'grep'.
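To check how a given platform behaves, something like the following sketch could be used. It feeds each of the 256 byte values to mbrtowc in the C locale and counts how many are rejected. The helper names byte_is_valid_char and count_rejected_bytes_in_c_locale are hypothetical, made up for this illustration; only setlocale and mbrtowc are standard calls.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if mbrtowc accepts the single byte b
   as a complete character in the current locale, 0 otherwise. */
int byte_is_valid_char(unsigned char b)
{
    mbstate_t st;
    wchar_t wc;
    char buf[1];
    size_t n;

    buf[0] = (char) b;
    memset(&st, 0, sizeof st);          /* start from the initial shift state */
    n = mbrtowc(&wc, buf, 1, &st);

    /* (size_t) -1 is an encoding error; (size_t) -2 is an incomplete sequence.
       A return of 0 (the null byte) or 1 counts as a valid character. */
    if (n == (size_t) -1 || n == (size_t) -2)
        return 0;
    return 1;
}

/* Hypothetical helper: switches to the C locale and counts the byte
   values in 0..255 that mbrtowc refuses to decode. */
int count_rejected_bytes_in_c_locale(void)
{
    int b;
    int rejected = 0;

    setlocale(LC_ALL, "C");
    for (b = 0; b < 256; b++)
        if (!byte_is_valid_char((unsigned char) b))
            rejected++;
    return rejected;
}
```

On a platform that behaves as described above, the count would presumably cover the bytes outside 0-127; on a platform that treats all 256 byte values as characters it should be zero.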

This bug in the standard is intended to be fixed in a future version of POSIX (see <http://austingroupbugs.net/view.php?id=663#c2738>). I suppose glibc and eventually Fedora will be fixed to conform to the new standard in due course.

Perhaps grep should work around this problem on systems like Fedora 23 where the underlying C library does not conform to the next version of POSIX. It sounds like a new gnulib module or two might do the trick. This should fix the problems that Björn mentions.
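One possible shape for such a workaround is sketched below. This is only an assumption about how it could look, not an existing gnulib interface: the name lenient_mbrtowc is hypothetical, and the fallback of mapping a rejected byte to its own unsigned value is an illustrative choice. It is meant for use in the C locale only, with n > 0 and non-null pwc and ps.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wrapper (not a gnulib API): decode one character from
   s[0..n-1] like mbrtowc, but never fail with an encoding error.
   Assumes n > 0 and that pwc and ps are non-null. */
size_t lenient_mbrtowc(wchar_t *pwc, const char *s, size_t n, mbstate_t *ps)
{
    size_t r = mbrtowc(pwc, s, n, ps);

    if (r == (size_t) -1) {
        /* The conversion state is undefined after an encoding error,
           so reset it to the initial state before continuing. */
        memset(ps, 0, sizeof *ps);
        /* Illustrative fallback: treat the offending byte as a
           one-byte character whose value is the byte itself. */
        *pwc = (wchar_t) (unsigned char) s[0];
        return 1;
    }
    return r;
}
```

With a wrapper along these lines, a caller scanning a buffer in the C locale would always advance by at least one byte per character, whether or not the underlying mbrtowc accepts bytes outside 0-127.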

In the meantime grep -a is the way to go. Yes, it's not portable to non-GNU grep, but there is no portable solution given the above-mentioned POSIX problems, so a GNU-grep-only workaround is all one can reasonably ask for.
