--- Begin Message ---
Subject: Building Grep with PCRE support
Date: Thu, 2 Apr 2020 19:17:35 -0400
Hi Everyone,
I'm working on OS X 10.12. When I build Grep 3.3 from the source tarball, I see:
checking for PCRE... no
checking for pcre_compile... no
configure: WARNING: GNU grep will be built without pcre support.
PCRE2 is installed in /usr/local. PKG_CONFIG_PATH is set and points to
/usr/local/lib/pkgconfig.
Configure _lacks_ a --with-pcre-prefix=... option. Attempting to use it produces:
configure: WARNING: unrecognized options: --with-pcre-prefix
Would someone know how to build Grep with PCRE support?
Thanks in advance.
--- End Message ---
--- Begin Message ---
Subject: Re: bug#47264: [PATCH v2] pcre: migrate to pcre2
Date: Sun, 14 Nov 2021 12:45:29 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1
On 11/9/21 02:58, Carlo Marcelo Arenas Belón wrote:
> Sadly, hadn't been able to generate a release,
Does this mean you're having trouble running 'make dist'? If so, what's
the trouble?
> it seems to be ready for some broader testing, specially if the
> attached patch is applied on top of a 10.37 release (tested that way
> in OpenBSD i386)
OK, thanks, I installed it into the Savannah master copy of GNU grep,
except that I didn't rename m4/pcre.m4 to m4/pcre2.m4, or rename the
macros to use PCRE2. This made the change easier to audit. Revised patch
0001 attached.
Also, I followed up with several related patches (also attached as
0002-0012). Please take a look at them and let us know of any problems.
In the attached patch "grep: prefer signed integers" I followed the
usual grep approach of preferring signed to unsigned integers (e.g.,
idx_t to size_t) when either will do; this lets us debug better with
-fsanitize=undefined to catch integer overflow.
One issue I discovered: PCRE2_EXTRA_MATCH_WORD (which is used by
pcre2grep -w) is incompatible with 'grep -w'. For example, 'echo a%%a |
grep -Pw %%' outputs nothing, whereas 'echo a%%a | pcre2grep -w %%'
outputs 'a%%a'. I think the GNU grep behavior (which is the same as with
'grep -w', either on Linux or OpenBSD) is more intuitive here: do you
happen to know why PCRE behaves the way it does? Is that worth a PCRE2
bug report? Anyway, the attached patches avoid using
PCRE2_EXTRA_MATCH_WORD for that reason.
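
The difference can be sketched outside of grep. The following Python
snippet is only an illustration, not code from either program. It
assumes that 'grep -w' behaves like wrapping the pattern in negative
lookarounds for word characters, and that PCRE2_EXTRA_MATCH_WORD behaves
like wrapping it in \b assertions. The function names are hypothetical.

```python
import re


def grep_w_style(pattern: str, line: str) -> bool:
    """Assumed 'grep -w' semantics: the match may not touch a word
    character on either side (line start/end also qualify)."""
    return re.search(r"(?<!\w)(?:" + pattern + r")(?!\w)", line) is not None


def boundary_style(pattern: str, line: str) -> bool:
    """Assumed PCRE2_EXTRA_MATCH_WORD semantics: the pattern is wrapped
    in \\b word-boundary assertions."""
    return re.search(r"\b(?:" + pattern + r")\b", line) is not None


if __name__ == "__main__":
    for line in ("a%%a", "x %% y"):
        print(repr(line), grep_w_style("%%", line), boundary_style("%%", line))
```

Under these assumptions the two approaches disagree in both directions
for a pattern made of non-word characters: "a%%a" is rejected by the
lookaround form but accepted by the \b form, while "x %% y" is accepted
by the lookaround form but rejected by the \b form, because \b needs a
word character on one side of the boundary.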
> * no more version restrictions (should work with >~10.20)
I tested with 10.20 and found one more glitch (it doesn't have
PCRE2_SIZE_MAX), which is fixed by the attached patch "grep: port to
PCRE2 10.20".
> Pending:
> * what to do with the current support of \C (enabled for now)
Let's open another bug report for that; I'm still a bit fuzzy about what
the pros and cons are.
> * merge of non critical bugfix in #51710[1]
I plan to follow up in that bug report.
Marking this bug as done. Thanks again for working on this.
Attachments (Description: Text Data):
0001-grep-migrate-to-pcre2.patch
0002-maint-minor-rewording-and-reindenting.patch
0003-grep-Don-t-limit-jitstack_max-to-INT_MAX.patch
0004-grep-improve-pcre2_get_error_message-comments.patch
0005-grep-speed-up-fix-bad-UTF8-check-with-P.patch
0006-grep-prefer-signed-integers.patch
0007-grep-use-PCRE2_EXTRA_MATCH_LINE.patch
0008-grep-simplify-JIT-setup.patch
0009-grep-improve-memory-exhaustion-checking-with-P.patch
0010-grep-use-ximalloc-not-xcalloc.patch
0011-grep-fix-minor-P-memory-leak.patch
0012-grep-port-to-PCRE2-10.20.patch
--- End Message ---