bug-grep

bug#21486: [PATCH 3/3] dfa: cache transition from a state with dot expre


From: Norihiro Tanaka
Subject: bug#21486: [PATCH 3/3] dfa: cache transition from a state with dot expression in non-UTF8 multibyte locales
Date: Tue, 15 Sep 2015 22:46:59 +0900

For many patterns, matching in non-UTF8 multibyte locales is now as fast
as in single-byte locales, thanks to many earlier improvements.  However,
patterns that begin with one or more periods are still slow.  This change
improves performance for them.

Compare the run times of these commands before and after this change
(on an i5-4570 CPU @ 3.20GHz running rawhide (~fedora 22), compiled
with gcc 5.1.1 20150618):
yes "$(printf 'a%38db\n' 0)" | head -1000000 >in

env LC_ALL=ja_JP.eucJP time -p src/grep '.a.b' in
  Before: 2.33
   After: 0.69

env LC_ALL=ja_JP.eucJP time -p src/grep ^..............................$ in
  Before: 2.35
  After : 0.45

env LC_ALL=ja_JP.eucJP time -p src/grep .......................................... in
  Before: 19.10
  After :  0.55
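
For readers who want to inspect the benchmark input before timing anything, the following is a minimal sketch that builds a much smaller sample of it. The file name in.sample and the 1000-line size are assumptions made for illustration, and awk stands in for the original yes | head pipeline only to keep the sketch independent of pipe handling. Note that %38d pads with spaces, so each line is 'a', 37 spaces, '0', 'b'.

```shell
# One benchmark line: 'a', then 0 right-aligned in a 38-column field, then 'b'.
line=$(printf 'a%38db' 0)

# Write 1000 copies to a sample file (hypothetical name; the original
# command writes 1000000 lines to a file called "in").
awk -v l="$line" 'BEGIN { for (i = 0; i < 1000; i++) print l }' >in.sample

# Report the line length and the number of lines written.
len=${#line}
count=$(wc -l <in.sample | tr -d ' ')
echo "line length: $len, lines: $count"
```

Running grep on such a sample with LC_ALL set to a non-UTF8 multibyte locale exercises the same code path as the full benchmark, only with timings too small to compare.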

Note that the two patches in bug#21266 need to be applied before this
one.

Attachment: 0003-dfa-cache-transition-from-a-state-with-dot-expressio.patch
Description: Text document

