bug#16581: suggested code simplification in dfa.c
From: Eric Blake
Subject: bug#16581: suggested code simplification in dfa.c
Date: Wed, 29 Jan 2014 06:49:13 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 01/29/2014 06:42 AM, Eric Blake wrote:
> Your hack is great at finding characters that have a case mapping, but
> not necessarily at finding all such characters that map to the same
> result when passed through towlower(towupper(c)).
>
In particular, note that the Java language has formalized
case-insensitive comparison as follows:
http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#equalsIgnoreCase%28java.lang.String%29
Two characters c1 and c2 are considered the same, ignoring case if at
least one of the following is true:
- The two characters are the same (as compared by the == operator).
- Applying the method Character.toUpperCase(char) to each character
  produces the same result.
- Applying the method Character.toLowerCase(char) to each character
  produces the same result.
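As a rough illustration of the three conditions quoted above, here is a
minimal Java sketch. The class and method names (CharIgnoreCase,
sameIgnoringCase) are made up for this example and are not part of the
JDK; only Character.toUpperCase(char) and Character.toLowerCase(char)
are real library calls.

```java
// Hypothetical helper, not JDK code: applies the three documented
// conditions to a single pair of chars.
class CharIgnoreCase {
    static boolean sameIgnoringCase(char c1, char c2) {
        return c1 == c2
            || Character.toUpperCase(c1) == Character.toUpperCase(c2)
            || Character.toLowerCase(c1) == Character.toLowerCase(c2);
    }

    public static void main(String[] args) {
        // Plain ASCII pair: matched by either case conversion.
        System.out.println(sameIgnoringCase('a', 'A'));      // true
        // U+0131 (dotless i) uppercases to 'I', so the uppercase rule matches
        // even though the lowercase forms differ.
        System.out.println(sameIgnoringCase('\u0131', 'I')); // true
        // Unrelated letters match under none of the three rules.
        System.out.println(sameIgnoringCase('a', 'b'));      // false
    }
}
```

The dotless-i pair is the kind of case where checking only one direction
of conversion would miss the match.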
and lower down, compareToIgnoreCase():
Compares two strings lexicographically, ignoring case differences. This
method returns an integer whose sign is that of calling compareTo with
normalized versions of the strings where case differences have been
eliminated by calling
Character.toLowerCase(Character.toUpperCase(character)) on each character.
Note that this method does not take locale into account, and will result
in an unsatisfactory ordering for certain locales. The java.text package
provides collators to allow locale-sensitive ordering.
In particular, the specification was careful to require double-case
conversion, with uppercase first, in order to normalize all
single-character oddities, while still mentioning that true Unicode
collation has even more special cases that can't be decided on a
character-by-character basis.
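
To make the double conversion concrete, here is a small Java sketch of
the uppercase-then-lowercase normalization. CaseNormalize and normalize
are names invented for this example; it only shows the per-character
behavior and does not attempt locale-aware collation.

```java
// Hypothetical helper, not JDK code: normalizes one char with the
// uppercase-first double conversion described in the quoted documentation.
class CaseNormalize {
    static char normalize(char c) {
        return Character.toLowerCase(Character.toUpperCase(c));
    }

    public static void main(String[] args) {
        char dotlessI = '\u0131'; // LATIN SMALL LETTER DOTLESS I
        char longS = '\u017F';    // LATIN SMALL LETTER LONG S

        // Lowercasing alone leaves both characters unchanged, so they
        // never compare equal to 'i' or 's'.
        System.out.println(Character.toLowerCase(dotlessI) == 'i'); // false
        System.out.println(Character.toLowerCase(longS) == 's');    // false

        // Uppercasing first maps them to 'I' and 'S'; lowercasing then
        // yields the ordinary 'i' and 's'.
        System.out.println(normalize(dotlessI) == 'i'); // true
        System.out.println(normalize(longS) == 's');    // true
    }
}
```

Both characters are already lowercase, which is why a single
toLowerCase() pass cannot fold them onto their ASCII counterparts.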
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org