bug#18806: grep -rP getline crashes prematurely (without displaying all

bug-grep

From:	Paul Eggert
Subject:	bug#18806: grep -rP getline crashes prematurely (without displaying all results) on invalid UTF-8 input with LC_ALL=en_US.UTF-8
Date:	Sat, 25 Oct 2014 16:11:33 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

Jim Meyering wrote:

after your change,
our pcre-invalid-utf8-input hangs. That happens because the following
infloops (stuck in pcre_exec) on a CentOS6 system:

   printf 'j\202j\nj\nk\202\n' > in; LC_ALL=en_US.utf8 src/grep -P 'k$' in

That binary was linked with the libpcre from this package:

   pcre-7.8-4.el6.x86_64

I'm getting a failure in pcre-invalid-utf8-input both before and after the change, with CentOS 6.5 and pcre-7.8-6.el6.x86_64. In my case the failures are segmentation violations; perhaps 7.8-4 has a different failure mode, or perhaps there's some other minor change to your platform that causes libpcre to infloop. Either way, this appears to be a PCRE bug that grep can't be expected to work around.


Does the attached patch cause the test to fail reliably for you, instead of 
looping?
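
For comparing the two failure modes (infloop versus segmentation violation), the reproducer quoted above can be wrapped in timeout(1) and its exit status classified. This is only a sketch, not part of the attached patch: the helper names classify_status and run_repro and the 10-second limit are assumptions made for illustration, and it assumes GNU coreutils timeout and mktemp are available.

```shell
#!/bin/sh
# Hypothetical helpers (not from grep's test suite) for telling a hang
# apart from a crash when running the reproducer from this thread.

# Map an exit status to a short label.
# 0/1/2 are grep's documented statuses; 124 is what GNU timeout returns
# when the time limit expires; values above 128 mean death by a signal.
classify_status() {
  case $1 in
    0)   echo match ;;
    1)   echo no-match ;;
    2)   echo error ;;
    124) echo hang ;;
    *)
      if [ "$1" -gt 128 ]; then
        echo "signal-$(( $1 - 128 ))"
      else
        echo "unknown-$1"
      fi
      ;;
  esac
}

# Run the reproducer against the given grep binary (default: grep in PATH)
# with an arbitrary 10-second limit, then print the classification.
run_repro() {
  grep_bin=${1:-grep}
  tmp=$(mktemp) || return 1
  printf 'j\202j\nj\nk\202\n' > "$tmp"
  LC_ALL=en_US.utf8 timeout 10 "$grep_bin" -P 'k$' "$tmp" >/dev/null 2>&1
  status=$?
  rm -f "$tmp"
  classify_status "$status"
}
```

After sourcing the file, `run_repro src/grep` prints a single label. On Linux a segmentation violation would show up as signal-11, and an infloop as hang.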

By the way, I'm not sure why tests distinguish between require_en_utf8_locale_ and require_compiled_in_MB_support; the latter requires the former, and there's no point requiring the former unless we also require the latter.

pcre.diff
Description: Text document
