[bug #19637] \w and [[:alnum:]] not equivalent in multibyte locale
From: Benno Schulenberg
Subject: [bug #19637] \w and [[:alnum:]] not equivalent in multibyte locale
Date: Fri, 20 Apr 2007 12:14:27 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.11) Gecko/20070327 Firefox/1.5.0.11
Follow-up Comment #1, bug #19637 (project grep):
Confirmed. These are actually two bugs:
1) The man page should not say that \w is equivalent to [[:alnum:]], since \w
also matches the underscore.
2) \w does not match accented characters in a UTF-8 locale.
$ export LC_ALL=nl_NL
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
ee
ëë
__
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
ee
ëë
[switch the Konsole's encoding from iso-8859-1 to utf8 and retype the lines
instead of recalling history]
$ export LC_ALL=nl_NL.utf8
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
ee
__
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
ee
ëë
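
Since \w is meant to be [[:alnum:]] plus the underscore, the bracket
expression [[:alnum:]_] can be used to check whether \w is affected in a
given locale. The following is only a sketch: the helper name
same_word_matches is made up for illustration, and it assumes a GNU grep
on the PATH rather than the src/grep from the build tree used above.

```shell
# Hypothetical helper (not part of grep): succeed when \w and [[:alnum:]_]
# select the same lines from the text passed as $1, in the current locale.
# Assumes a GNU grep on the PATH, since \w is a GNU extension.
same_word_matches() {
    # printf is used instead of `echo -e` so the input does not depend on
    # the shell; `|| true` keeps a no-match result from counting as an error.
    swm_w=$(printf '%s\n' "$1" | grep -E '\w' || true)
    swm_class=$(printf '%s\n' "$1" | grep -E '[[:alnum:]_]' || true)
    [ "$swm_w" = "$swm_class" ]
}

# Same three sample lines as in the transcript above.
sample=$(printf ' ee\n ëë\n __\n')

if same_word_matches "$sample"; then
    echo "\\w and [[:alnum:]_] agree in this locale"
else
    echo "\\w and [[:alnum:]_] differ in this locale"
fi
```

Until \w is fixed, spelling out [[:alnum:]_] in the pattern should work as
a replacement in a UTF-8 locale, given that [[:alnum:]] matches the accented
line in the transcript above.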
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?19637>