
bug#16586: bug#17245: GREP BUG: grep -P and binary files


From: Paul Eggert
Subject: bug#16586: bug#17245: GREP BUG: grep -P and binary files
Date: Wed, 23 Apr 2014 22:39:10 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

Jim Meyering wrote:
> anyone using grep -P to search data that is even a tiny bit
> inconsistent with their UTF-8 locale will now get an exit status of
> 2 rather than the matches they used to get.

Yes, I don't like that either, but <http://bugs.exim.org/1468> says libpcre intends to have undefined behavior here. If so, it wouldn't help to wait until the next libpcre release, which may well have a serious bug of this form in a different area, one that's not easy to test for.

Perhaps somebody should modify grep -P to discard input lines containing non-UTF-8 data instead of presenting them to libpcre. That way, it would be safe for grep -P to use PCRE_NO_UTF8_CHECK. Although grep -P should report an error and exit with status 2 if it discards input due to encoding errors, it can also report matches in lines that do not contain encoding errors, so that users can see both the error messages and the matches.
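
To make the proposal concrete, here is a rough sketch of the control flow in Python rather than in grep's C. It is only an illustration under stated assumptions: Python's `re` module stands in for libpcre, the strict UTF-8 decoder stands in for whatever validity check grep would use, and the function names (`is_valid_utf8`, `grep_skipping_invalid`) are hypothetical, not anything in grep's source.

```python
import re


def is_valid_utf8(line: bytes) -> bool:
    """Return True if the line decodes as strict UTF-8 (stand-in validity check)."""
    try:
        line.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def grep_skipping_invalid(pattern: str, lines):
    """Match pattern against each line, skipping lines with encoding errors.

    Returns (matches, skipped_line_numbers, exit_status). The exit status
    follows grep's convention: 0 = match found, 1 = no match, 2 = error.
    Any skipped line forces status 2, but matches are still returned.
    """
    regex = re.compile(pattern)  # stand-in for a libpcre-compiled pattern
    matches = []
    skipped = []
    for lineno, raw in enumerate(lines, start=1):
        if not is_valid_utf8(raw):
            # Never hand invalid data to the matcher; just record the line.
            skipped.append(lineno)
            continue
        # The line is known to be valid here, which is the point at which
        # the matcher could safely skip its own UTF-8 check.
        if regex.search(raw.decode("utf-8")):
            matches.append(raw)
    if skipped:
        status = 2
    elif matches:
        status = 0
    else:
        status = 1
    return matches, skipped, status


if __name__ == "__main__":
    data = [b"abc\n", b"a\xffc\n", b"xyz\n"]
    found, bad, status = grep_skipping_invalid("a.c", data)
    for line in found:
        print(line.decode("utf-8"), end="")
    for lineno in bad:
        print(f"line {lineno}: invalid UTF-8, skipped")
    print("exit status:", status)
```

The key property is that validation happens once, before matching, so the matcher only ever sees well-formed input, while the caller still gets both the matches and the list of discarded lines.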