bug-grep

From: Norihiro TANAKA
Subject: bug#16919: [PATCH] fix mismatch between dfa and regex for treatment of titlecase
Date: Wed, 05 Mar 2014 23:11:40 +0900

Hi Paul,

Thanks for the thorough investigation.  I now understand that we cannot
say in general whether the DFA's behavior or regex's is right.

I have tested the behavior of several regex engines.  Interestingly,
most of them give results that differ from the others, so nobody can
tell which one is right.

--
GNU grep (DFA):

$ env LANG=en_US.utf8 ./test.sh "src/grep -i" 2>/dev/null | nl -ba
     1   c7 87 | c7 89
     2   c7 87 | c7 88 | c7 89
     3   c7 87 | c7 89
     4   49 | 69
     5   49 | 69
     6   69 | c4 b0
     7   49 | c4 b1

GNU grep (regex):

$ env LANG=en_US.utf8 ./test.sh "src/grep -i" '\(\)\1' 2>/dev/null | nl -ba
     1   c7 87 | c7 88 | c7 89
     2   c7 87 | c7 88 | c7 89
     3   c7 87 | c7 88 | c7 89
     4   49 | 69 | c4 b1
     5   49 | 69 | c4 b1
     6   c4 b0
     7   49 | 69 | c4 b1

pcregrep:

$ env LANG=en_US.utf8 ./test.sh "pcregrep -iu" 2>/dev/null | nl -ba
     1   c7 87 | c7 88 | c7 89
     2   c7 87 | c7 88 | c7 89
     3   c7 87 | c7 88 | c7 89
     4   49 | 69
     5   49 | 69
     6   c4 b0
     7   c4 b1

Solaris grep (xpg4):

$ env LANG=ja_JP.UTF-8 ./test.sh  "/usr/xpg4/bin/grep -i" 2>/dev/null | nl -ba
     1           c7 87 | c7 89
     2           c7 88
     3           c7 87 | c7 89
     4           49 | 69
     5           49 | 69
     6           error
     7           error

HP-UX grep:

$ env LANG=en_US.utf8 ./test.sh  "/bin/grep -i" 2>/dev/null | nl -ba
     1              c7    87     |    c7    88     |    c7    89
     2              c7    87     |    c7    88     |    c7    89
     3              c7    87     |    c7    88     |    c7    89
     4              49     |    69
     5              49     |    69
     6              c4    b0
     7              c4    b1
--
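
For readers who want to decode the hex bytes in the listings above, here is a minimal Python sketch. It is not the attached test.sh, whose contents are not shown here. It only assumes that the bytes are UTF-8 and reports the Unicode name, general category, and Python's own upper/lower mappings for each sequence. The names `SEQUENCES` and `describe` are made up for this illustration.

```python
import unicodedata

# Hex byte sequences as they appear in the result listings (assumed UTF-8).
SEQUENCES = ["c7 87", "c7 88", "c7 89", "49", "69", "c4 b0", "c4 b1"]


def describe(hex_bytes):
    """Decode one UTF-8 hex sequence and report its Unicode properties."""
    ch = bytes.fromhex(hex_bytes).decode("utf-8")
    return {
        "bytes": hex_bytes,
        "codepoint": "U+%04X" % ord(ch),
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # Lu, Ll or Lt
        "upper": ch.upper(),  # Python's mapping, not any grep's
        "lower": ch.lower(),
    }


if __name__ == "__main__":
    for seq in SEQUENCES:
        info = describe(seq)
        print("%-6s %s %s %s" % (info["bytes"], info["codepoint"],
                                 info["category"], info["name"]))
```

Under these assumptions, c7 88 decodes to U+01C8, whose general category is Lt (titlecase) rather than Lu or Ll, and c4 b0 and c4 b1 are the dotted capital I and the dotless small i. These are the characters on which the engines above disagree.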

Norihiro

Attachment: test.sh
Description: Binary data

