bug-gnu-emacs

bug#13041: 24.2; diacritic-fold-search


From: martin rudalics
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Sun, 09 Dec 2012 18:52:17 +0100

> OTOH, instead of using an approach of matching only a full match
> like in Chromium, we could do like GEdit and OpenOffice that
> match the whole ligature character in a partial match
> (i.e. to match "ﬀ" when the search string is just "f").

Strictly speaking, they should match the first "f" in "ﬀ".  When matching
"suf" against "suﬀer", the `match-string' would be "suf", with
`match-end' after "ﬀ".  That is, the match length would not increase
when a second "f" is added to the search string.  But I don't know what
`match-string' should return - "suﬀ" or "suff".
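
To make the partial-match case concrete, here is a rough sketch in Python
rather than Emacs Lisp.  It is not how Emacs, GEdit or OpenOffice implement
search; the helper names `fold_with_offsets' and `search' are made up for
illustration, and NFKD decomposition is assumed as the way to expand the
ligature U+FB00 into "ff".

```python
import unicodedata


def fold_with_offsets(text):
    """Expand each character with NFKD and remember where it came from.

    Returns (folded, offsets) where offsets[i] is the index in `text`
    of the original character that produced folded[i].
    """
    folded = []
    offsets = []
    for index, char in enumerate(text):
        for piece in unicodedata.normalize("NFKD", char):
            folded.append(piece)
            offsets.append(index)
    return "".join(folded), offsets


def search(needle, text):
    """Return (start, end) of the first match in the ORIGINAL text, or None.

    The end position is rounded up to a whole original character, so a
    match that stops in the middle of a ligature ends after the ligature.
    """
    needle = unicodedata.normalize("NFKD", needle)
    if not needle:
        return None
    folded, offsets = fold_with_offsets(text)
    pos = folded.find(needle)
    if pos < 0:
        return None
    start = offsets[pos]
    end = offsets[pos + len(needle) - 1] + 1
    return start, end


text = "su\ufb00er"            # "su" + ligature ff + "er"
print(search("suf", text))     # match ends after the ligature
print(search("suff", text))    # same span: the match did not grow
```

Under these assumptions both searches report the same span in the buffer,
which is exactly the ambiguity above: the matched region contains the
ligature, while the search string contains one or two plain "f".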

> Though this has a problem of highlighting the whole character for
> a partial match that looks wrong, but perhaps no one can do better.

We would need a display string "ff" replacing "ﬀ" during highlighting,
and highlight only the first "f" in it.

> Yes, this is what I meant too.  It is surprising but
> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
> defines the equivalence of "ß" and "ss" (lower case "s")
> instead of case-folding.  The following line in CaseFolding.txt:
>
> 00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
>
> maps 00DF (LATIN SMALL LETTER SHARP S) to two characters
> 0073 0073 (LATIN SMALL LETTER S) keeping the lower case.
> Maybe this is a bug in Unicode data?

Maybe it's explained here

  http://www.unicode.org/faq/idn.html

in the answer to

  Q: Why does IDNA2003 map final sigma (ς) to sigma (σ), map eszett (ß)
  to "ss", and delete ZWJ/ZWNJ?

One possible interpretation of this is that mapping "ß" to "SS" would
imply that downcasing "SS" should produce "ß" and this is unwanted.  But
I still wonder whether we are supposed to apply mappings recursively.
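
For what it's worth, the asymmetry can be seen in any environment that
implements the Unicode mappings.  A small sketch in Python (used here only
because its `str.casefold' applies full case folding; the helper names are
hypothetical and this says nothing about what Emacs should do):

```python
SHARP_S = "\u00df"  # LATIN SMALL LETTER SHARP S


def fold(text):
    """Apply full Unicode case folding, as in CaseFolding.txt (status C + F)."""
    return text.casefold()


def same_when_folded(a, b):
    """Compare two strings after folding both sides."""
    return fold(a) == fold(b)


def fold_is_stable(text):
    """Check whether folding an already folded string changes it again."""
    return fold(fold(text)) == fold(text)


print(fold(SHARP_S))                            # folds to lower-case "ss"
print(SHARP_S.upper())                          # upcasing gives "SS"
print("SS".lower())                             # downcasing gives "ss", not sharp s
print(same_when_folded("Stra\u00dfe", "STRASSE"))
print(fold_is_stable(SHARP_S))
```

Folding both sides to "ss" never requires an inverse mapping from "SS" back
to "ß".  For this one character a second folding pass changes nothing, but
that is only a spot check and does not settle the question of recursive
application in general.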

martin





