bug#22655: grep -Pz '^' now fails!

bug-grep



From:	Paul Eggert
Subject:	bug#22655: grep -Pz '^' now fails!
Date:	Fri, 18 Nov 2016 15:37:16 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

Stephane Chazelas wrote:

2016-11-18 08:48:04 -0800, Paul Eggert:

Stephane Chazelas wrote:

Why would it make it slower? AFAICT, PCRE_MULTILINE *adds*
some overhead.


As I understand it, PCRE_MULTILINE lets 'grep' apply a pattern to an
entire buffer that contains many lines, and this lets PCRE
efficiently find the first match in the whole buffer. If grep
doesn't use PCRE_MULTILINE, grep would have to apply the pattern to
each line separately, which could be significantly slower.

[...]

That might have been the case a long time ago, as I remember
some discussion about it as it explained some wrong information
in the documentation, but as far as gdb and I can tell, grep
2.26 at least calls pcre_exec for every line of the input with
grep -P.

Although that was true starting with commit a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5 (2014-09-16).

If it didn't

echo test | grep -P '\n$'

would match.

No, because grep omits the trailing newline in that particular input. And for this example:


printf 'test\n\n' | grep -P '\n$'

grep passes "test\n" to jit_exec, determines that jit_exec returns a match that crosses a line boundary, and rejects the match.
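
The two cases above can be modeled roughly as follows. This is only a sketch in Python, with the re module standing in for PCRE (their '$' semantics are close but not identical). matching_lines is a hypothetical helper, not code from grep, and it assumes newline-terminated lines (no -z).

```python
import re


def matching_lines(buf, pattern):
    """Simplified model: search the whole buffer in multiline mode and
    keep only matches that stay within a single line."""
    # Model of "grep omits the trailing newline": drop one final newline.
    if buf.endswith("\n"):
        buf = buf[:-1]

    rx = re.compile(pattern, re.MULTILINE)
    lines = []
    last_start = None
    for m in rx.finditer(buf):
        # A match whose text contains a newline crosses a line boundary.
        # This sketch simply skips it; the real code is more involved.
        if "\n" in m.group(0):
            continue
        # Recover the line that contains the start of the match.
        start = buf.rfind("\n", 0, m.start()) + 1
        end = buf.find("\n", m.start())
        if end == -1:
            end = len(buf)
        # Report each line at most once, in input order.
        if start != last_start:
            lines.append(buf[start:end])
            last_start = start
    return lines


if __name__ == "__main__":
    # Models: echo test | grep -P '\n$'
    print(matching_lines("test\n", r"\n$"))
    # Models: printf 'test\n\n' | grep -P '\n$'
    print(matching_lines("test\n\n", r"\n$"))
```

Under this model the first input has no newline left to match once the trailing one is dropped, and the second yields a match on "\n" that is rejected because it spans a line boundary, so both calls return an empty list.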
