[debbugs-tracker] bug#22103: closed ([PATCH] grep: improve performance f

emacs-bug-tracker

From:	GNU bug Tracking System
Subject:	[debbugs-tracker] bug#22103: closed ([PATCH] grep: improve performance for grep -P in UTF-8)
Date:	Fri, 08 Jan 2016 13:47:02 +0000

Your message dated Fri, 08 Jan 2016 22:46:33 +0900
with message-id <address@hidden>
and subject line Re: bug#20526: grep BUG: text file is detected as binary
has caused the debbugs.gnu.org bug report #22103,
regarding [PATCH] grep: improve performance for grep -P in UTF-8
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
22103: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22103
GNU Bug Tracking System
Contact address@hidden with problems

--- Begin Message ---
Subject: [PATCH] grep: improve performance for grep -P in UTF-8
Date: Mon, 07 Dec 2015 08:01:23 +0900

After grep -P finds its first match, the TEXTBIN_UNKNOWN optimizations
are no longer used.  Therefore, if grep -P finds a match early in the
input, it is very slow in UTF-8.

  $ time -p grep -P ^1$ <(seq 999999)
  1
  real 14.55
  user 13.77
  sys 1.12

grep -Pa does not use the TEXTBIN_UNKNOWN optimizations either.
Therefore, it is also very slow in UTF-8.

  $ time -p grep -Pa a <(seq 999999)
  real 14.53
  user 13.65
  sys 1.35

This change defers leaving the TEXTBIN_UNKNOWN optimizations until
grep -P finds a binary character.

The change gives a speedup of more than 10x:

  $ time -p src/grep -P ^1$ <(seq 999999)
  1
  real 0.97
  user 0.79
  sys 0.24

  $ time -p src/grep -Pa a <(seq 999999)
  real 0.98
  user 0.23
  sys 0.99
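
The deferral described above can be pictured with a small model.  This
is only a sketch in Python, not grep's C code: the function names, the
chunking by line, and the test for "binary" data (a NUL byte or invalid
UTF-8) are all assumptions made for illustration.  Only the idea comes
from the message: the fast path is kept until binary data is seen,
instead of being dropped at the first match.

```python
def looks_binary(chunk: bytes) -> bool:
    """Assumed definition of binary data: a NUL byte or invalid UTF-8."""
    if b"\x00" in chunk:
        return True
    try:
        chunk.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def fast_path_chunks(chunks, is_match, defer_until_binary):
    """Count the chunks searched while the fast path is still in use.

    defer_until_binary=False models the behaviour reported as slow:
    the fast path is dropped right after the first match.
    defer_until_binary=True models the proposed behaviour: the fast
    path is dropped only once a binary chunk is encountered.
    """
    fast = True
    count = 0
    for chunk in chunks:
        if fast and looks_binary(chunk):
            fast = False          # binary data ends the fast path in both models
        if fast:
            count += 1
        if fast and not defer_until_binary and is_match(chunk):
            fast = False          # old model: first match ends the fast path
    return count


lines = [b"1\n", b"2\n", b"3\n", b"4\n"]
matches_one = lambda chunk: chunk == b"1\n"

old = fast_path_chunks(lines, matches_one, defer_until_binary=False)
new = fast_path_chunks(lines, matches_one, defer_until_binary=True)
print(old, new)
```

With an early match on the first line, the old model searches only that
line on the fast path, while the deferred model keeps the fast path for
every line, which is the situation the seq 999999 timings exercise.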

BTW, this change conflicts with the proposal in bug#22028.

0001-grep-improve-performance-for-grep-P-in-UTF-8.patch
Description: Text document

--- End Message ---

--- Begin Message ---
Subject: Re: bug#20526: grep BUG: text file is detected as binary
Date: Fri, 08 Jan 2016 22:46:33 +0900

On Wed, 6 Jan 2016 09:57:46 -0800
Paul Eggert <address@hidden> wrote:

> On 01/06/2016 12:32 AM, Paul Eggert wrote:
> > I installed the attached patch, which fixed this performance bug for me. 
> Whoops! I forgot to 'git add src/search.h' before committing. We also need 
> the attached followup patch, which I installed.

Great!  Thanks.  Many issues, including the output of invalid
sequences, are fixed by your patches.  bug#22103 is also fixed by
them, so I am closing it.

--- End Message ---
