bug-grep

bug#23185: GNU grep matching discrepancy between -a/--text and not.


From: Paul Eggert
Subject: bug#23185: GNU grep matching discrepancy between -a/--text and not.
Date: Tue, 5 Apr 2016 23:56:22 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

Thanks for pointing out the seeming inconsistency. The documentation mentions the issue but is perhaps not clear enough, so I installed the attached patch.

The input file contains NUL bytes and so is treated as binary data, and the grep documentation (section "File and Directory Selection", option "--binary-files") says "When processing binary data, ‘grep’ may treat non-text bytes as line terminators". This behavior was added to GNU grep in release 2.21 (2014), partly for performance reasons.

There are two instances in riddles.he of a space followed by a NUL byte, so

  grep -P '[ \t]\r?$' riddles.he

finds a match when the $ matches just before the NUL byte.
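The effect can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not grep's actual implementation: the function name matching_lines, the flag nul_ends_line, and the SAMPLE bytes are all made up for illustration, and Python's re module stands in for PCRE.

```python
import re

# The pattern from the message: a space or tab, an optional CR, then end of line.
TRAILING_BLANK = re.compile(rb"[ \t]\r?$")

def matching_lines(data: bytes, nul_ends_line: bool) -> list:
    """Return the lines of `data` that end in a blank.

    nul_ends_line=True models binary-data handling, where a NUL byte
    may act as a line terminator. nul_ends_line=False models -a/--text,
    where only a newline ends a line. (Hypothetical helper, not grep code.)
    """
    if nul_ends_line:
        data = data.replace(b"\0", b"\n")
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final newline
    return [line for line in lines if TRAILING_BLANK.search(line)]

# Illustrative input: a space followed by a NUL byte, as in the report.
SAMPLE = b"foo \0bar\nbaz\n"

print(matching_lines(SAMPLE, nul_ends_line=True))   # [b'foo ']
print(matching_lines(SAMPLE, nul_ends_line=False))  # []
```

With NUL treated as a terminator, "foo " becomes a line of its own and the $ matches right after the space. With only newline as a terminator, the space sits in the middle of the line and nothing matches.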

-a is one way to get the behavior you evidently expected. Another (perhaps better) way is -z. The command:

  grep -zP '[ \t]\r?\n' riddles.he

outputs nothing and exits with status 1.
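A similar sketch shows why the -z form avoids the false match. Under -z the NUL byte is the record terminator and newline is ordinary data, so the pattern has to name the newline explicitly. As before, this is a model under assumptions, not grep's code: matching_records and the CLEAN and DIRTY byte strings are hypothetical, and Python's re module stands in for PCRE.

```python
import re

# The -z pattern from the message: a blank, an optional CR, then a literal newline.
NEWLINE_AFTER_BLANK = re.compile(rb"[ \t]\r?\n")

def matching_records(data: bytes) -> list:
    """Return the NUL-terminated records that contain a blank before a newline.

    Models -z, where records end at NUL and newline is just another byte.
    (Hypothetical helper, not grep code.)
    """
    records = data.split(b"\0")
    if records and records[-1] == b"":
        records.pop()  # drop the empty piece after a final NUL
    return [rec for rec in records if NEWLINE_AFTER_BLANK.search(rec)]

# A space before a NUL, but no blank before any newline.
CLEAN = b"foo \0bar\nbaz\n"
# Same, plus a genuine trailing space before a newline.
DIRTY = b"foo \0bar \nbaz\n"

print(matching_records(CLEAN))  # []
print(matching_records(DIRTY))  # [b'bar \nbaz\n']
```

An empty result corresponds to grep printing nothing and exiting with status 1. Only a blank that really precedes a newline is reported, regardless of where the NUL bytes fall.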

Attachment: 0001-Give-another-example-of-binary-file-processing.patch
Description: Text Data

