bug-grep
bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales


From: Jim Meyering
Subject: bug#18454: Improve performance when -P (PCRE) is used in UTF-8 locales
Date: Fri, 19 Sep 2014 09:06:47 -0700

On Thu, Sep 18, 2014 at 12:36 PM, Jim Meyering <address@hidden> wrote:
> It looks like most of the difference is the result of
> commit cd36abd46c5e0768606979ea75a51732062f5624,
> "grep: treat a file as binary if its prefix contains encoding errors",

Hi Paul,

I found that the above commit induces a large performance hit.
Over 50x in this example:

  seq 99999999 > k
  LC_ALL=C diff -u \
    <(PATH=.bin/2.20-31:$PATH env time -f %e grep asdf k 2>&1) \
    <(PATH=.bin/2.20-32:$PATH env time -f %e grep asdf k 2>&1)
  ...
  -0.21
  +11.47

The problem is that the new function is processing all of
the input, not just a prefix.
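
To illustrate the distinction, here is a minimal sketch (Python, not grep's
actual C code) of a check that reads only a bounded prefix, so its cost does
not grow with file size. The function name prefix_has_encoding_error and the
32 KiB limit are assumptions made for this example, not values taken from
grep.

```python
import codecs

# Assumed prefix size for illustration only; not grep's real buffer size.
PREFIX_BYTES = 32 * 1024


def prefix_has_encoding_error(path, limit=PREFIX_BYTES):
    """Return True if the first `limit` bytes of `path` are not valid UTF-8."""
    with open(path, "rb") as f:
        # Read at most `limit` bytes; the rest of the file is never touched.
        prefix = f.read(limit)
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multibyte character that was cut off
        # at the prefix boundary instead of reporting it as an error.
        decoder.decode(prefix, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

With a check shaped like this, an invalid byte that lies beyond the prefix is
simply not seen, which is the trade-off the commit title describes. Scanning
every buffer of a large file instead would make the cost proportional to the
input size, which would be consistent with the slowdown measured above.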




