guile-devel

Re: uc_tolower (uc_toupper (x))


From: Mike Gran
Subject: Re: uc_tolower (uc_toupper (x))
Date: Thu, 10 Mar 2011 16:54:41 -0800 (PST)

> From:Mark H Weaver <address@hidden>
> To:address@hidden
> Cc:
> Sent:Thursday, March 10, 2011 3:39 PM
> Subject:uc_tolower (uc_toupper (x))
> 
> I've noticed that srfi-13.c very frequently does:
> 
>   uc_tolower (uc_toupper (x))
> 
> Is there a good reason to do this instead of:
> 
>   uc_tolower (x)

Unicode defines a case folding algorithm, along with a
data table, for case-insensitive comparison.  Setting
things to lowercase is a decent approximation of
case folding.  But doing the upper->lower operation picks
up a few more of the corner cases, like U+03C2 GREEK
SMALL LETTER FINAL SIGMA and U+03C3 GREEK SMALL LETTER SIGMA,
which are the same letter with different representations,
or U+00B5 MICRO SIGN and U+03BC GREEK SMALL LETTER MU,
which are supposed to compare equal when case is ignored.
uc_tolower alone leaves each of those pairs distinct, while
upper->lower maps both members of a pair to the same character.
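
To make the difference concrete, here is a minimal sketch in C.  It does
not call libunistring: toy_toupper and toy_tolower are hypothetical
stand-ins for uc_toupper and uc_tolower, and their tables hold only the
simple case mappings for the sigma and mu/micro code points discussed
here.  Every other code point is passed through unchanged, so this is an
illustration of the idea rather than a usable folding routine.

```c
#include <stddef.h>
#include <stdint.h>

/* One simple (one-to-one) case mapping entry. */
struct case_pair { uint32_t from; uint32_t to; };

/* Simple uppercase mappings, limited to the examples in this message. */
static const struct case_pair toy_upper[] = {
  { 0x00B5, 0x039C },  /* MICRO SIGN            -> GREEK CAPITAL LETTER MU    */
  { 0x03BC, 0x039C },  /* GREEK SMALL LETTER MU -> GREEK CAPITAL LETTER MU    */
  { 0x03C2, 0x03A3 },  /* SMALL FINAL SIGMA     -> GREEK CAPITAL LETTER SIGMA */
  { 0x03C3, 0x03A3 },  /* SMALL SIGMA           -> GREEK CAPITAL LETTER SIGMA */
};

/* Simple lowercase mappings for the two capitals above. */
static const struct case_pair toy_lower[] = {
  { 0x039C, 0x03BC },  /* CAPITAL MU    -> SMALL MU    */
  { 0x03A3, 0x03C3 },  /* CAPITAL SIGMA -> SMALL SIGMA */
};

/* Return the mapping for c, or c itself if the table has no entry. */
static uint32_t lookup (const struct case_pair *table, size_t n, uint32_t c)
{
  for (size_t i = 0; i < n; i++)
    if (table[i].from == c)
      return table[i].to;
  return c;
}

/* Hypothetical stand-in for uc_toupper (toy table only). */
uint32_t toy_toupper (uint32_t c)
{
  return lookup (toy_upper, sizeof toy_upper / sizeof toy_upper[0], c);
}

/* Hypothetical stand-in for uc_tolower (toy table only). */
uint32_t toy_tolower (uint32_t c)
{
  return lookup (toy_lower, sizeof toy_lower / sizeof toy_lower[0], c);
}

/* Approximate folding by lowercasing only. */
uint32_t fold_lower_only (uint32_t c)
{
  return toy_tolower (c);
}

/* Approximate folding by the upper->lower round trip. */
uint32_t fold_upper_lower (uint32_t c)
{
  return toy_tolower (toy_toupper (c));
}
```

With these tables, fold_lower_only leaves U+03C2 and U+00B5 untouched
because they are already lowercase, so they stay different from U+03C3
and U+03BC.  fold_upper_lower sends both sigmas to U+03C3 and both
mu-like characters to U+03BC, which is the behaviour the round trip is
after.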

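Now that we've pulled in all of libunistring, it might
be a good idea to see if it has a complete implementation
of Unicode case folding, because upper->lower is also not
completely correct: it only applies one-to-one character
mappings, so it misses foldings that expand to more than one
character, such as U+00DF LATIN SMALL LETTER SHARP S to "ss".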

-Mike


