Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8

bug-grep

From:	Paul Eggert
Subject:	Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8
Date:	Fri, 15 Jun 2012 09:41:31 -0700
User-agent:	Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20120430 Thunderbird/12.0.1

Ah, OK, thanks, I see what you're saying now.  Clearly grep does not
behave the way you're asking for.  I don't think POSIX requires that
behavior either.  The relevant part of POSIX says that for -i

  not only the character, but also its case counterpart (if any),
  shall be matched

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_02

which indicates that the POSIX folks did not consider examples such as
Greek sigma, where a single letter can have multiple case
counterparts.
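
To make the sigma case concrete, here is a small Python sketch. It is
illustrative only and is not how GNU grep is implemented; the helper
names single_counterpart_match and casefold_match are hypothetical.
It contrasts a "character or its one case counterpart" rule with a
comparison based on Unicode case folding, under which capital sigma,
small sigma, and final small sigma all fold to the same character.

```python
# Illustrative sketch only; not taken from grep's source.
SIGMA_UPPER = "\u03a3"  # GREEK CAPITAL LETTER SIGMA
SIGMA_LOWER = "\u03c3"  # GREEK SMALL LETTER SIGMA
SIGMA_FINAL = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA


def single_counterpart_match(pattern_char: str, text_char: str) -> bool:
    """Hypothetical helper: accept the pattern character itself or the
    single result of its simple upper/lower mapping."""
    candidates = {pattern_char, pattern_char.upper(), pattern_char.lower()}
    return text_char in candidates


def casefold_match(pattern_char: str, text_char: str) -> bool:
    """Hypothetical helper: accept any two characters whose Unicode
    case foldings are equal."""
    return pattern_char.casefold() == text_char.casefold()


if __name__ == "__main__":
    # Both small forms uppercase to the same capital letter.
    print(SIGMA_LOWER.upper() == SIGMA_UPPER)
    print(SIGMA_FINAL.upper() == SIGMA_UPPER)
    # The one-counterpart rule does not relate the two small forms.
    print(single_counterpart_match(SIGMA_LOWER, SIGMA_FINAL))
    # Case folding treats all three forms as equivalent.
    print(casefold_match(SIGMA_LOWER, SIGMA_FINAL))
    print(casefold_match(SIGMA_UPPER, SIGMA_FINAL))
```

Under the one-counterpart rule, a pattern containing small sigma does
not match final small sigma even though both have the same uppercase
form, which is the kind of example the POSIX wording does not address.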

Even if POSIX doesn't specify the behavior, it might be nice for
GNU grep to do it anyway, if someone could take the time to
implement it without undue performance loss.  You're welcome to
file a bug report on the Savannah bug tracker about this.
<http://savannah.gnu.org/bugs/?group=grep>
