emacs-bug-tracker
[debbugs-tracker] bug#15758: closed (grep 2.15 calls abort() on larger s


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#15758: closed (grep 2.15 calls abort() on larger searches with -P)
Date: Fri, 13 Dec 2013 18:34:02 +0000

Your message dated Fri, 13 Dec 2013 10:33:35 -0800
with message-id <address@hidden>
and subject line Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
has caused the debbugs.gnu.org bug report #15758,
regarding grep 2.15 calls abort() on larger searches with -P
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
15758: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15758
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: grep 2.15 calls abort() on larger searches with -P
Date: Wed, 30 Oct 2013 12:42:35 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

A user reported a regression with grep 2.15 which is reasonably easy to
reproduce with an invocation such as: ``grep -Pr foo''. The root cause
is that pcre_exec returns an unhandled error (PCRE_ERROR_BADUTF8)
causing grep to call abort().
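
To make the failure mode concrete, here is a minimal sketch of how an
error code with no case of its own ends up on the aborting path. The
enum values are stand-ins for the constants in <pcre.h>, and classify()
and the outcome type are hypothetical names used only for illustration;
this is not grep's actual Pexecute code.

```c
/* Stand-in values for illustration; the real constants live in <pcre.h>. */
enum
{
  SKETCH_ERROR_NOMATCH = -1,
  SKETCH_ERROR_NOMEMORY = -6,
  SKETCH_ERROR_BADUTF8 = -10
};

/* Hypothetical outcome type; grep's Pexecute does not return this. */
enum outcome
{
  OUTCOME_MATCH,
  OUTCOME_NO_MATCH,
  OUTCOME_OUT_OF_MEMORY,
  OUTCOME_ABORT
};

/* Classify a matcher return code the way a switch with an aborting
   default would.  A non-negative code is treated as a match.  */
enum outcome
classify (int e)
{
  if (e >= 0)
    return OUTCOME_MATCH;

  switch (e)
    {
    case SKETCH_ERROR_NOMATCH:
      return OUTCOME_NO_MATCH;

    case SKETCH_ERROR_NOMEMORY:
      return OUTCOME_OUT_OF_MEMORY;

    default:
      /* Any code without its own case lands here.  A real program
         would call abort() at this point; the sketch only reports it.  */
      return OUTCOME_ABORT;
    }
}
```

In this sketch, classify (SKETCH_ERROR_BADUTF8) falls through to the
default branch, which is the situation described above.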

I bisected the breakage to commit 67436786c110bbb565 (and verified that
it still exists at git HEAD), which essentially introduces UTF-8
validation of the input data. On a large enough file hierarchy, I
suppose it's inevitable that invalid UTF-8 data is encountered. I was
able to fix this with the inline diff which follows:

  diff --git a/src/pcresearch.c b/src/pcresearch.c
  index ad5999d..ce55ab3 100644
  --- a/src/pcresearch.c
  +++ b/src/pcresearch.c
  @@ -176,6 +176,9 @@ Pexecute (char const *buf, size_t size, size_t *match_size,
         switch (e)
           {
           case PCRE_ERROR_NOMATCH:
  +#ifdef HAVE_LANGINFO_CODESET
  +        case PCRE_ERROR_BADUTF8:
  +#endif
             return -1;

           case PCRE_ERROR_NOMEMORY:

I don't know if this is considered to be a correct fix, but I offer it
as a starting point for a discussion.
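
As an illustration of why a recursive search over arbitrary files is
likely to hit invalid UTF-8, the following sketch checks a buffer for
well-formedness. is_valid_utf8() is a hypothetical helper written for
this note; it is not PCRE's validator, and PCRE's exact rules may
differ in detail.

```c
#include <stddef.h>

/* Return 1 if buf[0..len) is well-formed UTF-8, else 0.
   Hypothetical helper for illustration; not PCRE's validator.  */
int
is_valid_utf8 (const unsigned char *buf, size_t len)
{
  size_t i = 0;

  while (i < len)
    {
      unsigned char c = buf[i];
      size_t need;        /* continuation bytes expected after c */
      unsigned long cp;   /* code point decoded so far */
      unsigned long min;  /* smallest code point valid for this length */

      if (c < 0x80)
        { need = 0; cp = c; min = 0; }
      else if ((c & 0xE0) == 0xC0)
        { need = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { need = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { need = 3; cp = c & 0x07; min = 0x10000; }
      else
        return 0;         /* stray continuation or invalid lead byte */

      if (need > len - i - 1)
        return 0;         /* sequence truncated at end of buffer */

      for (size_t k = 1; k <= need; k++)
        {
          unsigned char t = buf[i + k];
          if ((t & 0xC0) != 0x80)
            return 0;     /* not a 10xxxxxx continuation byte */
          cp = (cp << 6) | (t & 0x3F);
        }

      if (cp < min)
        return 0;         /* overlong encoding */
      if (cp > 0x10FFFF)
        return 0;         /* beyond the Unicode range */
      if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;         /* UTF-16 surrogate */

      i += need + 1;
    }
  return 1;
}
```

Plain ASCII such as "foo" passes, while byte pairs like 0xFF 0xFE,
which are common in binary files, fail on the first byte.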

Cheers,
Dave

P.S. Please CC me on replies as I am not subscribed to the list.



--- End Message ---
--- Begin Message ---
Subject: Re: bug#15758: grep 2.15 calls abort() on larger searches with -P
Date: Fri, 13 Dec 2013 10:33:35 -0800

On Tue, Nov 26, 2013 at 6:30 AM, Santiago <address@hidden> wrote:
> This bug was also reported in Debian ( http://bugs.debian.org/730472 ).
>
> Taking a look at it, I think the most suitable solution for the moment
> is to flag PCRE_NO_UTF8_CHECK instead of PCRE_UTF8, so
> PCRE does not check whether inputs are valid UTF-8. Resulting behavior is
> similar to pre-grep-2.15. (See 15758-PCRE-no-check-UTF8.patch)

Thanks for the suggested patches and report.  Your first patch is
almost right.  The problem is that we cannot remove the PCRE_UTF8 flag.
If we did that, it would disable UTF-8, reverting an older fix.
See tests/pcre-utf8 for examples, or run this:

  printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'
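
For anyone decoding the printf argument above: the three octal bytes
342 202 254 are the UTF-8 encoding of U+20AC (EURO SIGN), a currency
symbol, which is why \p{S} can only match it when PCRE treats the
input as UTF-8. The sketch below does the decoding by hand;
decode_utf8_3() is a hypothetical helper, not part of grep or PCRE.

```c
#include <assert.h>

/* Decode one three-byte UTF-8 sequence into a code point.
   Hypothetical helper; the caller must pass a well-formed sequence.  */
unsigned long
decode_utf8_3 (unsigned char b0, unsigned char b1, unsigned char b2)
{
  assert ((b0 & 0xF0) == 0xE0);   /* lead byte has the form 1110xxxx */
  assert ((b1 & 0xC0) == 0x80);   /* continuation bytes are 10xxxxxx */
  assert ((b2 & 0xC0) == 0x80);

  /* Concatenate the 4 + 6 + 6 payload bits.  */
  return ((unsigned long) (b0 & 0x0F) << 12)
         | ((unsigned long) (b1 & 0x3F) << 6)
         | (unsigned long) (b2 & 0x3F);
}
```

Calling decode_utf8_3 (0342, 0202, 0254) yields 0x20AC.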

I've added a commit log, improved a related test and attached
a slightly different patch, but left you as the "Author".
I'll wait for an explicit ACK before pushing it.

With that, there is no need to handle PCRE_ERROR_BADUTF8
because that should not happen.


--- End Message ---
