[bug #28275] Ranges like [a-z] incorrectly match in UTF systems
From: Makar
Subject: [bug #28275] Ranges like [a-z] incorrectly match in UTF systems
Date: Sun, 13 Dec 2009 14:06:06 +0000
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; ru-RU; rv:1.9.1.5) Gecko/20091129 Sabayon Firefox/3.5.5
URL:
<http://savannah.gnu.org/bugs/?28275>
Summary: Ranges like [a-z] incorrectly match in UTF systems
Project: grep
Submitted by: tkzv
Submitted on: Sun 13 Dec 2009 14:06:05
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
In a UTF-8 locale, when basic or extended regular expressions are selected,
ranges like [a-z] or [а-я] seem to match many more characters than they should.
Simply enumerating all the characters, e.g. [abcdefghijklmnopqrstuvwxyz] or
[абвгдеёжзийклмнопрстуфхцчшщъыьэюя], works fine.
If Perl regular expressions are selected (-P switch), ranges containing only
ASCII characters, like [a-z], work correctly, but multibyte characters (in both
ranges and enumerations) are interpreted as several 1-byte characters.
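
To illustrate the byte-wise interpretation reported for -P, here is a minimal sketch in Python. It uses Python's re module rather than grep or PCRE, so it only models the effect and says nothing about grep's actual code. The helper names char_matches and byte_matches are hypothetical. The sketch compiles the same [а-я] class once over characters and once over the UTF-8 bytes of the pattern.

```python
import re

# Character view: pattern and text are both str, so [а-я] is the
# code-point range U+0430..U+044F.
char_class = re.compile("[а-я]")

# Byte view: the same pattern encoded as UTF-8 and compiled over bytes.
# "а" is D0 B0 and "я" is D1 8F, so the class degrades to the byte set
# {D0, B0..D1, 8F} instead of a range of Cyrillic letters.
byte_class = re.compile("[а-я]".encode("utf-8"))


def char_matches(text):
    """Return the characters of text matched by the character-level class."""
    return char_class.findall(text)


def byte_matches(text):
    """Return the single bytes of UTF-8 encoded text matched by the byte-level class."""
    return byte_class.findall(text.encode("utf-8"))


if __name__ == "__main__":
    for sample in ("ж", "Ж"):
        print(sample, char_matches(sample), byte_matches(sample))
```

In this model the uppercase letter Ж is rejected by the character-level class, yet its lead byte D0 is accepted by the byte-level class, so a line containing only Ж would still count as a match. Note also that ё (U+0451) lies outside the code-point range а-я, so the range and the explicit enumeration above are not equivalent even when ranges are handled per character.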
_______________________________________________________
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar <=
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/14
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/14
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Paolo Bonzini, 2009/12/14
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Norihirio Tanaka, 2009/12/16
- [bug #28275] Ranges like [a-z] incorrectly match in UTF systems, Makar, 2009/12/21
- [bug #28275] grep -P should use PCRE_UTF8, Paolo Bonzini, 2009/12/22