bug#16581: suggested code simplification in dfa.c

bug-grep

From:	Aaron Crane
Subject:	bug#16581: suggested code simplification in dfa.c
Date:	Wed, 29 Jan 2014 14:20:10 +0000

Paul Eggert <address@hidden> wrote:
> +/* The following functions exploit the commutativity and associativity of ^,
> +   and the fact that X ^ X is zero.  POSIX requires that C equals
> +   either tolower (C) or toupper (C); if the former, then C ^ tolower (C)
> +   is zero so C ^ xor_other (C) equals toupper (C), and similarly
> +   for the latter.  */
> +
> +/* Return the exclusive-OR of C and C's other case, or zero if C is
> +   not a letter that changes case.  */
> +
> +static wint_t
> +xor_wother (wint_t c)
> +{
> +  return towlower (c) ^ towupper (c);
> +}
[…]
> +      if (case_fold)
>          {
> +          wchar_t xor = xor_wother (wc);
> +          if (xor)
> +            {
> +              addtok_wc (wc ^ xor);
> +              addtok (OR);
> +            }

I don't think this works for the wide-character case. For example, in
a suitable locale, I'd expect U+01C8 LATIN CAPITAL LETTER L WITH SMALL
LETTER J ("Lj", roughly) to be U+01C7 LATIN CAPITAL LETTER LJ ("LJ")
under towupper(), and U+01C9 LATIN SMALL LETTER LJ ("lj") under
towlower(). This matches the behaviour I can observe with a simple
test program under the en_GB.UTF-8 locale on both Linux and Mac OS.

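Since 0x1c7 ^ 0x1c9 == 14, and 0x1c8 ^ 14 == 0x1c6, this means we'd
call addtok_wc(0x1c6). U+01C6 is LATIN SMALL LETTER DZ WITH CARON,
which is not a case variant of U+01C8 at all. The XOR trick relies on
C being equal to either its lowercase or its uppercase form, and that
doesn't hold for a titlecase character like this one.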

-- 
Aaron Crane ** http://aaroncrane.co.uk/
