bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x)

From:	Pádraig Brady
Subject:	bug#16232: [PATCH] grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
Date:	Sat, 11 Jan 2014 11:33:47 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 01/11/2014 05:40 AM, Jim Meyering wrote:
> On Fri, Jan 10, 2014 at 8:52 PM, Jim Meyering <address@hidden> wrote:
>>> I wonder might this faster path be restricted to a safer but very common 
>>> input subset of:
>>>
>>> (MB_CUR_MAX == 1 || (in_utf8 && *c < 0x80))
>>
>> That sounds like a good approach.
>> Now I need another test case, to demonstrate that the current code can
>> cause trouble.
> 
> Hmm... after thinking about this for a while and actually trying to
> break the current code (did not find a way to demonstrate a regression),
> I have concluded that the current approach is no worse than the prior
> one of matching a case-mapped regexp vs. each case-mapped input line.
> 
> That's not to say that it's perfect, of course.
> The "LATIN SMALL LETTER J WITH CARON, COMBINING DOT BELOW" example
> from gnulib's test-ulc-casecmp.c is a great example: this matches:
> 
>     printf '\x6A\xCC\x8C\xCC\xA3\n'|src/grep -i "$(printf '\x6A\xCC\x8C\xCC\xA3')"
> 
> but this does not, yet probably should:
> 
>     printf '\xC7\xB0\xCC\xA3\n'|src/grep -i "$(printf '\x6A\xCC\x8C\xCC\xA3')"
> 
> Can you see a way to demonstrate a regression?

Oh right, the existing code doesn't handle these cases either.
Fair enough, I don't see a regression then.
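
To see why the second quoted command is the harder case, it helps to decode
the two byte sequences into code points. What follows is a minimal C sketch,
not code from grep or gnulib: `decode_utf8` and `decode_all` are hypothetical
helpers, and the decoder skips overlong-form and surrogate checks for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at S, with N
   bytes available.  Store the code point in *CP and return the number
   of bytes consumed, or 0 on malformed or truncated input.
   Sketch only: overlong forms and surrogates are not rejected.  */
static size_t
decode_utf8 (const unsigned char *s, size_t n, uint32_t *cp)
{
  assert (s != NULL && cp != NULL);
  if (n == 0)
    return 0;

  uint32_t c = s[0];
  size_t len;
  if (c < 0x80)
    {
      *cp = c;                  /* ASCII: a single byte */
      return 1;
    }
  else if ((c & 0xE0) == 0xC0)
    { len = 2; c &= 0x1F; }
  else if ((c & 0xF0) == 0xE0)
    { len = 3; c &= 0x0F; }
  else if ((c & 0xF8) == 0xF0)
    { len = 4; c &= 0x07; }
  else
    return 0;                   /* stray continuation or invalid lead byte */

  if (n < len)
    return 0;
  for (size_t i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      c = (c << 6) | (s[i] & 0x3F);
    }
  *cp = c;
  return len;
}

/* Hypothetical helper: decode the whole buffer into OUT (capacity CAP).
   Return the number of code points, or (size_t) -1 on error.  */
static size_t
decode_all (const unsigned char *s, size_t n, uint32_t *out, size_t cap)
{
  size_t count = 0;
  while (n > 0)
    {
      uint32_t cp;
      size_t used = decode_utf8 (s, n, &cp);
      if (used == 0 || count == cap)
        return (size_t) -1;
      out[count++] = cp;
      s += used;
      n -= used;
    }
  return count;
}
```

Decoding `\x6A\xCC\x8C\xCC\xA3` gives U+006A U+030C U+0323, while
`\xC7\xB0\xCC\xA3` gives U+01F0 U+0323. U+01F0 is the precomposed LATIN SMALL
LETTER J WITH CARON, whose canonical decomposition is U+006A U+030C, so the
two inputs are canonically equivalent yet differ byte for byte.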

+1

Pádraig.
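
For reference, the restriction proposed in the quoted text,
`(MB_CUR_MAX == 1 || (in_utf8 && *c < 0x80))`, could be sketched in C roughly
as below. This is only an illustration under stated assumptions, not the
patch itself: `fast_path_ok` and `line_fast_path_ok` are hypothetical names,
the locale properties are passed in as parameters (in real code the first
would presumably come from `MB_CUR_MAX` in `<stdlib.h>`), and the thread does
not say whether the check would apply to the pattern, the input, or both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring the quoted condition: byte C is safe
   for the fast path if the locale is single-byte, or if the locale is
   UTF-8 and C is an ASCII byte.  */
static bool
fast_path_ok (size_t mb_cur_max, bool in_utf8, unsigned char c)
{
  return mb_cur_max == 1 || (in_utf8 && c < 0x80);
}

/* Hypothetical helper: true if every one of the N bytes in BUF
   satisfies the condition above.  An empty buffer trivially qualifies.  */
static bool
line_fast_path_ok (size_t mb_cur_max, bool in_utf8,
                   const unsigned char *buf, size_t n)
{
  assert (buf != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (!fast_path_ok (mb_cur_max, in_utf8, buf[i]))
      return false;
  return true;
}
```

Under this sketch, an all-ASCII line in a UTF-8 locale qualifies, whereas a
line containing the bytes `\x6A\xCC\x8C` does not, because `\xCC` and `\x8C`
are not below 0x80.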
