bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#19173: [PATCH] dfa: speed-up for long pattern


From: Jim Meyering
Subject: bug#19173: [PATCH] dfa: speed-up for long pattern
Date: Sat, 4 Jul 2015 20:40:05 -0700

On Mon, Nov 24, 2014 at 5:15 PM, Norihiro Tanaka <address@hidden> wrote:
> DFA trys to find a long sequence of characters that must appear in any
> line containing the r.e. in dfamust()  However, if a pattern is long, it
> is very slow, as it processes all characters step by step.  This change
> makes a string concatenated some normal characters process at a time.
>
> Following test case is posted in bug#15191.  It speeds-up more than 60x.
>
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15191#5
>
> << before changed >>
>   $ time -p grep -f regex.re input_lines.txt
>   real 1.28
>   user 1.27
>   sys 0.00
>
> << after changed >>
>   $ time -p grep -f regex.re input_lines.txt
>   real 0.02
>   user 0.01
>   sys 0.00

Thank you for that patch.
I have rebased it and made some small improvements:
I combined an if+do loop into a single for-loop and moved
some declarations "down". I constructed a reproducer that
does not require two large inputs to demonstrate the
performance improvement.

I've also begun to reword the commit log, but am out of time
for this evening, so will post this here, for now.

I am still trying to convince myself that this is a strict
improvement, i.e., that the O(N^2) strstr calls avoided
by this change served no purpose.

Attachment: 0001-dfa-speed-up-handling-of-long-pattern.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]