bug#23752: [PATCH] grep: try fgrep matcher for case insensitive matching
From: Norihiro Tanaka
Subject: bug#23752: [PATCH] grep: try fgrep matcher for case insensitive matching by grep -F in multibyte locale
Date: Sun, 12 Jun 2016 18:47:58 +0900
In grep 2.19 and later, grep -F uses the grep matcher for case-insensitive
matching in multibyte locales. However, this performs poorly for a
long pattern because a DFA has to be built.
With this patch, in a multibyte locale, if the pattern consists only of
single-byte characters, all of their case counterparts are also single-byte
characters, and the pattern contains no invalid sequences, grep -F uses
the fgrep matcher, the same as in a single-byte locale.
This partially fixes bug#21763 and bug#22239.
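The condition described above can be illustrated with a rough sketch. This is not the code from the attached patch; the function name fgrep_icase_safe is hypothetical. It assumes that "counterparts" can be approximated by towlower and towupper, which may miss characters that fold to the same letter without being its upper or lower case form, so a real check would likely need to be stricter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not from the patch): return true if every
   character in PAT[0..LEN) is a single byte in the current locale,
   its towlower/towupper counterparts also encode as a single byte,
   and PAT contains no invalid or incomplete sequences.  */
bool
fgrep_icase_safe (char const *pat, size_t len)
{
  mbstate_t in_state;
  memset (&in_state, 0, sizeof in_state);
  assert (pat != NULL || len == 0);

  for (size_t i = 0; i < len; )
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, pat + i, len - i, &in_state);

      /* (size_t) -1: invalid sequence; (size_t) -2: incomplete one.  */
      if (n == (size_t) -1 || n == (size_t) -2)
        return false;
      if (n == 0)
        n = 1;                  /* a NUL byte occupies one byte */
      if (n != 1)
        return false;           /* the character itself is multibyte */

      /* Approximation: only the lower and upper case forms are checked.  */
      wint_t counterpart[2] = { towlower (wc), towupper (wc) };
      for (int k = 0; k < 2; k++)
        {
          char buf[MB_LEN_MAX];
          mbstate_t out_state;
          memset (&out_state, 0, sizeof out_state);
          if (wcrtomb (buf, (wchar_t) counterpart[k], &out_state) != 1)
            return false;       /* counterpart is not a single byte */
        }

      i += n;
    }
  return true;
}
```

A caller would be expected to run setlocale (LC_ALL, "") first and fall back to the grep matcher whenever this returns false. An ASCII-only pattern file like the one in the benchmark below passes this check, while a pattern containing a multibyte character does not.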
$ seq -f '%g bottles of beer on the wall' 1 600 >pat
$ tr a-z A-Z <pat >in
(before)
$ time -p env LC_ALL=C src/grep -Fivf pat in
real 0.08
user 0.03
sys 0.03
$ time -p env LC_ALL=ja_JP.eucjp src/grep -Fivf pat in
real 104.84
user 93.32
sys 3.28
(after)
$ time -p env LC_ALL=C src/grep -Fivf pat in
real 0.09
user 0.03
sys 0.04
$ time -p env LC_ALL=ja_JP.eucjp src/grep -Fivf pat in
real 0.08
user 0.03
sys 0.03
If a pattern has any multibyte character, grep -F is still slow.
$ printf '\xb3\xa4\n' >>pat
$ time -p env LC_ALL=ja_JP.eucjp src/grep -Fivf pat in
real 103.38
user 93.81
sys 2.46
0001-grep-try-fgrep-matcher-for-case-insensitive-matching.patch
Description: Text document