bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#17013: [PATCH] grep: optimization by using the Galil rule for Boyer-


From: Norihiro Tanaka
Subject: bug#17013: [PATCH] grep: optimization by using the Galil rule for Boyer-Moore algorithm in KWSet
Date: Fri, 14 Mar 2014 21:21:21 +0900

Package: grep
Tags: patch

The Boyer-Moore algorithm runs in O(m n) in the worst case,
 which perhaps it may be much slower than the DFA.

The Galil rule enables to change O(m n) into O(n) for its case without
overheads and/or slow-down for other cases by avoiding to compare more
than once for a position in the text.  This patch implements it.

I prepare following string, which makes a worst case for Boyer-Moore
algorithm, to measure the performance.

    yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 > ../k

I run the test with the patch (best-of-5 trials):

    env LC_ALL=C time -p src/grep kjjjjjjjjjjjjjjjjjjj k
        real 0.70       user 0.32       sys 0.38

Back out that commit (temporarily), recompile, and rerun the experiment:

    env LC_ALL=C time -p src/grep kjjjjjjjjjjjjjjjjjjj k
        real 3.97       user 3.56       sys 0.40

Norihiro

Attachment: grep.patch
Description: Binary data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]