bug#16927: [PATCH] grep: avoid to add same character to a bracket expression
From: Norihiro Tanaka
Subject: bug#16927: [PATCH] grep: avoid to add same character to a bracket expression
Date: Mon, 03 Mar 2014 22:13:00 +0900
Package: grep
Tags: patch
The patch avoids adding the same character more than once to a bracket
expression in trivial_case_ignore. This may generate a shorter token
sequence in multibyte locales.
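
As a rough illustration of the idea (a Python model, not the code in the patch), the bracket expression members can be collected while skipping any case variant that is already present. The helper name `case_variants` and its `dedupe` flag are hypothetical, and Python's `str.lower`/`str.upper` only stand in for whatever case conversion trivial_case_ignore performs.

```python
def case_variants(ch, dedupe=True):
    """Return the characters to place in a bracket expression for ch.

    Hypothetical helper: models the character itself plus its lowercase
    and uppercase forms. With dedupe=False, duplicates are kept, which
    models the behavior described as "before the patch".
    """
    # Assumption: single-character case mappings only; characters whose
    # upper() expands to several characters are outside this sketch.
    candidates = [ch, ch.lower(), ch.upper()]
    if not dedupe:
        return candidates

    members = []
    for c in candidates:
        # Skip a variant that is already a member of the set.
        if c not in members:
            members.append(c)
    return members


if __name__ == "__main__":
    small_a = "\uff41"  # FULLWIDTH LATIN SMALL LETTER A, UTF-8: ef bd 81
    print(len(case_variants(small_a, dedupe=False)))  # three members
    print(len(case_variants(small_a)))                # two members
```

For a character whose lowercase form is itself, the unfiltered list contains that character twice, which is the duplicate the patch removes.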
For example, FULLWIDTH LATIN SMALL LETTER A (ef bd 81) is transformed
as shown below, because multibyte characters in a CSET are expanded into
OR expressions in the DFA.
Before the patch:
[AAa] (where each character is fullwidth)
EF BD CAT 81 CAT EF BD CAT 81 CAT OR EF BC CAT A1 CAT OR
After the patch:
[Aa] (where each character is fullwidth)
EF BD CAT 81 CAT EF BC CAT A1 CAT OR
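
The two token sequences above can be reproduced with a small Python sketch. It is only a model of the expansion described here, not the DFA code: `postfix_tokens` is a hypothetical helper that concatenates the UTF-8 bytes of each member with CAT and joins the members with OR, in postfix order.

```python
def postfix_tokens(members):
    """Expand bracket expression members into a postfix token string.

    Hypothetical helper: each member becomes its UTF-8 bytes joined by
    CAT, and successive members are joined by OR.
    """
    tokens = []
    for i, ch in enumerate(members):
        for j, byte in enumerate(ch.encode("utf-8")):
            tokens.append(f"{byte:02X}")
            if j > 0:
                # Concatenate this byte with the bytes emitted so far.
                tokens.append("CAT")
        if i > 0:
            # Alternate this member with the members emitted so far.
            tokens.append("OR")
    return " ".join(tokens)


if __name__ == "__main__":
    small_a = "\uff41"    # FULLWIDTH LATIN SMALL LETTER A, ef bd 81
    capital_a = "\uff21"  # FULLWIDTH LATIN CAPITAL LETTER A, ef bc a1
    print(postfix_tokens([small_a, small_a, capital_a]))  # before the patch
    print(postfix_tokens([small_a, capital_a]))           # after the patch
```

Dropping the duplicate member saves one three-byte sequence, its two CAT tokens, and one OR, so the sequence shrinks from 17 tokens to 11.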
patch.txt
Description: Binary data