bug-grep
bug#17245: GREP BUG: grep -P and binary files


From: Jim Meyering
Subject: bug#17245: GREP BUG: grep -P and binary files
Date: Sun, 13 Apr 2014 16:17:25 -0700

On Sun, Apr 13, 2014 at 12:13 PM, Jim Meyering <address@hidden> wrote:
> On Fri, Apr 11, 2014 at 4:47 PM, damon <address@hidden> wrote:
>> Hi there -
>>
>> I recently noticed a bug after upgrading grep and have tracked it
>> through a few versions now.
>>
>> I was using grep -P (PCRE grep) in some scripts to grep through
>> a directory of files, and the process would keep aborting with a
>> segmentation fault.
>>
>> The last known good version is grep-2.14.  Every version after that has
>> failed in a slightly different way, making me think this could be a bug
>> in grep, not in pcre.
>>
>> I tried compiling greps 2.14 through 2.18 against the latest pcre
>> library, pcre-8.33.  Here's what happens when I try each version against
>> a random binary file, attached to this message as test-image.png.  This
>> file was just one of many that caused the errors, though not every
>> binary file does.
>>
>> Below are some results demonstrating what's going wrong.  Note that all
>> of these seem to work fine with regular grep or with grep -E.  Please
>> let me know what else I can do to help track this down!
>>
>> # grep-2.14/src/grep -P '\[.?max' test-image.png
>> (works, does not match)
> ...
>> # grep-2.18/src/grep -P '\[.?max' test-image.png
>> Segmentation fault
>>
>> # grep-2.18/src/grep -P '.?ma' test-image.png
>> Segmentation fault
>>
>> # grep-2.18/src/grep -P '.?m' test-image.png
>> Binary file test-image.png matches
>
> Thank you for the bug report.
> That is due to a bug in libpcre.  I've confirmed that it is still
> triggered even when using the latest grep.git linked with
> the latest from pcre.git (latest commit has "Final tidies for
> 8.35 release." as the subject).  I built grep as usual, and
> then ran this:
>
>   rm src/grep; make LIB_PCRE=$PWD/../pcre/.libs/libpcre.a
>
> Confirm that grep is not using a shared libpcre (this must print nothing):
>
>   ldd src/grep|grep pcre
>
> That presumes I had already built the latest pcre/ in ../pcre.
> Then, run this to test it with a non-UTF8 locale, and it is
> error-free, correctly finding no match:
>
>   LC_ALL=ja_JP.eucJP valgrind src/grep -P '\[.?max' test-image.png
>
> Repeat using a UTF8 locale, and you see that valgrind reports
> numerous buffer overrun and heap-use-after-free errors:
>
>   LC_ALL=en_US.utf8 valgrind src/grep -P '\[.?max' test-image.png
>
> Here is an equivalent but much smaller test case:
>
>   $ printf 'a\201b\r'|LC_ALL=en_US.utf8 valgrind src/grep -P 'a.?XXb'
>
> That segfaults.  Interestingly, if I replace each X with a ".",
> grep gets into an infinite loop within libpcre's match function.

FYI, I'm pushing the attached patch, to add a test for this.
It fails with the latest pcre from git (8.35), but passes with Debian
unstable's libpcre3 8.31-3:

Attachment: k.txt
Description: Text document
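
The attachment's contents are not reproduced in this archive.  Purely as
an illustration, and not as the patch itself, a check along these lines
could be sketched from the small test case quoted above.  The function
name check_pcre_binary, the GREP variable, the 10-second limit and the
skip code 77 are all assumptions made for this sketch:

```shell
#!/bin/sh
# Hypothetical sketch only; NOT the attached k.txt test.
# GREP may point at a freshly built binary, e.g. GREP=src/grep.
GREP=${GREP:-grep}

check_pcre_binary() {
  # Skip (status 77, an assumed convention) when grep lacks -P support.
  echo a | "$GREP" -P a >/dev/null 2>&1 || return 77

  # Feed the invalid-UTF-8 input from the report to grep -P in a UTF-8
  # locale.  timeout guards against the infinite loop mentioned above.
  printf 'a\201b\r' |
    LC_ALL=en_US.utf8 timeout 10 "$GREP" -P 'a.?XXb' >/dev/null 2>&1
  status=$?

  # 0 = match, 1 = no match, 2 = diagnosed error: all acceptable here.
  # 124 = timed out, 128 and above = killed by a signal (e.g. SIGSEGV).
  [ "$status" -le 2 ]
}
```

Calling check_pcre_binary returns 0 when grep neither crashes nor hangs,
77 when -P is unavailable, and 1 otherwise.  If the en_US.utf8 locale is
not installed, grep would likely fall back to the C locale and the check
would not exercise the UTF-8 code path at all.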

