bug#17640: grep with -m reads the entire input
From: Jim Meyering
Subject: bug#17640: grep with -m reads the entire input
Date: Fri, 30 May 2014 09:34:55 -0700
On Fri, May 30, 2014 at 8:58 AM, Jim Meyering <address@hidden> wrote:
> On Fri, May 30, 2014 at 8:56 AM, Jim Meyering <address@hidden> wrote:
>> On Thu, May 29, 2014 at 10:45 PM, Marc Aldorasi <address@hidden> wrote:
>>> With grep 2.18, the -m option would cause grep to stop reading input
>>> after printing the requested number of matching lines. With version
>>> 2.19, grep reads the entire input before exiting. Interestingly, grep
>>> does not read the entire input if the -c or -C0 options are added in
>>> addition to -m, and also when using -l or -q instead of -m. I believe
>>> this is caused by commit 5122195.
>>
>> Thanks a lot for the report. Just in time.
>> I confirm that it's a bug introduced in 2.19.
>> To test, run "seq 1000000 > million", then
>> "strace -e read grep 0 million" first using grep-2.18
>> (shows just a few read syscalls), and then with 2.19,
>> which shows grep reading the entire million-line file.
>
> Correction: to reproduce, you'll have to insert -m1 in that grep command.
>
>> Here's an incomplete patch. Obviously there's a lot more
>> to be added, including NEWS and a nontrivial test. This
>> was introduced by commit v2.18-140-g6f07900
This bears some explanation. I've attached a more complete patch
(albeit still hastily composed, so I'll wait a few hours
in case there's feedback).
Prior to grep-2.19, with --max-count=N, this first disjunct would
be true after the Nth match, because pending would be 0:

    if ((!outleft && !pending) || (nlines && done_on_match))
      goto finish_grep;

However, a seemingly unrelated change affected how "pending" is set:

    pending = out_quiet ? 0 : out_after;

We used to ensure that "out_after" was non-negative, because
default_context was always non-negative:

    if (out_after < 0)
      out_after = default_context;

But the recent context-related change invalidated that assumption:

    -  default_context = 0;
    +  default_context = -1;
Here's the patch:
0001-grep-fix-max-count-N-m-N-to-stop-reading-after-Nth-m.txt
Description: Text document