bug-grep

[bug #19637] \w and [[:alnum:]] not equivalent in multibyte locale


From: Benno Schulenberg
Subject: [bug #19637] \w and [[:alnum:]] not equivalent in multibyte locale
Date: Fri, 20 Apr 2007 12:14:27 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.11) Gecko/20070327 Firefox/1.5.0.11

Follow-up Comment #1, bug #19637 (project grep):

Confirmed.  So this is two bugs:

1) The man page should not say that \w equals [[:alnum:]], because \w also
matches the underscore.
2) \w does not match accented characters in a UTF-8 locale.


$ export LC_ALL=nl_NL
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
 ee
 ëë
 __
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
 ee
 ëë

[Switched Konsole's encoding from ISO-8859-1 to UTF-8 and retyped the lines
instead of recalling them from history.]

$ export LC_ALL=nl_NL.utf8
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
 ee
 __
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
 ee
 ëë
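
The first point can be checked without any multibyte input. The following is
only a minimal sketch, assuming a grep on PATH that supports the \w extension
(GNU grep does); the variable names are illustrative and not part of the
report. It uses ASCII input in the C locale to show that \w behaves like
[[:alnum:]_] rather than [[:alnum:]]:

```shell
# Sketch: ASCII-only input in the C locale, so no multibyte handling is involved.
export LC_ALL=C
input=$(printf ' ee\n __\n')

# \w matches letters, digits and the underscore, so both lines are selected.
w_out=$(printf '%s\n' "$input" | grep -E '\w')

# [[:alnum:]] alone does not match the line that contains only underscores.
alnum_out=$(printf '%s\n' "$input" | grep -E '[[:alnum:]]')

# Adding the underscore to the bracket expression selects both lines again.
alnum_us_out=$(printf '%s\n' "$input" | grep -E '[[:alnum:]_]')

printf '%s\n' "--- \\w:" "$w_out" "--- [[:alnum:]]:" "$alnum_out" \
              "--- [[:alnum:]_]:" "$alnum_us_out"
```

The accented-character case (the second point) depends on the installed
locales and the grep build, so it is left out of this sketch.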


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?19637>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




