emacs-devel

From: Eli Zaretskii
Subject: Re: extending case-fold-search to remove nonspacing marks (diacritics etc.)
Date: Fri, 06 Feb 2015 09:35:24 +0200

> Date: Thu, 5 Feb 2015 23:17:42 +0000
> From: Artur Malabarba <address@hidden>
> 
> As for answering your questions:
> 
> >> implementing it for users so it works like `case-fold-search' (you just
> >> set something in Customize and all search commands DWYM) seems much
> >> harder.
> 
> Doing it as part of Emacs is not terribly hard, but it has
> disadvantages. Namely, the case-fold-search machinery only relates one
> character to another character (1 to 1). At least for latin this would
> be enough a lot of the time, e.g. you can use it to relate "á" to "a".
> However, there's another way of writing "á" which takes two
> characters, and this situation can't be handled (AFAIK) by the
> case-fold-search machinery.

This just means you cannot implement that without changes to the C
level.  Changing the C code to lift the one-character restriction is
not very hard.

> The bright side is that I think this two-char way of writing latin
> accents is much less common (not 100% sure though, it's hard to tell
> the difference). The downside is that I know nothing about other
> languages, so maybe using two chars to represent one char is the
> default behavior in some other languages?

It can be more than 2 characters, e.g. in scripts that use diacritics:
there could be more than one diacritic combined with one base character.
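
To make the point concrete, here is a minimal sketch in Python (chosen
only for illustration; it is not how Emacs does this). It uses the
standard unicodedata module to show the two spellings of "á" and one
character that decomposes into a base letter plus two combining marks.
The helper name `decompose` and the sample characters are assumptions
made for this example.

```python
import unicodedata

# Two spellings of "á": precomposed U+00E1, and "a" followed by
# U+0301 COMBINING ACUTE ACCENT.
precomposed = "\u00e1"
combining = "a\u0301"

# Vietnamese "ệ" (U+1EC7) carries two diacritics on one base letter.
two_marks = "\u1ec7"


def decompose(text):
    """Return the canonical decomposition (NFD) of TEXT."""
    return unicodedata.normalize("NFD", text)


# The two spellings of "á" differ as strings until they are decomposed.
same_before = precomposed == combining
same_after = decompose(precomposed) == decompose(combining)

# One user-visible character can expand to three code points.
names = [unicodedata.name(ch) for ch in decompose(two_marks)]
```

Here `names` lists one base letter followed by two combining marks, so
a strictly one-to-one character mapping cannot describe this case.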

And then there are characters to be ignored, like ZWJ and bidi
directional controls.

So I think ad-hoc rules like the above are not going to cut it.  We
must use the decomposed forms, whatever they are, and we should also
consult the character properties to ignore the ignorables.
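
As a rough sketch of that approach (again in Python, for illustration
only, not a proposal for the Emacs implementation): decompose the
text, then drop characters by their properties instead of by ad-hoc
rules. The function name `fold_for_search` is hypothetical. Using the
general categories Mn (nonspacing mark) and Cf (format) as the test
for "ignorable" is an assumption; Cf covers ZWJ and the bidi
directional controls, but it is only an approximation of the Unicode
Default_Ignorable_Code_Point property, which Python's unicodedata does
not expose.

```python
import unicodedata

# Assumption: treat nonspacing marks (Mn) and format characters (Cf,
# which includes ZWJ and the bidi controls) as ignorable for search.
IGNORED_CATEGORIES = ("Mn", "Cf")


def fold_for_search(text):
    """Return TEXT decomposed, stripped of ignorables, and case-folded."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch for ch in decomposed
        if unicodedata.category(ch) not in IGNORED_CATEGORIES
    ]
    return "".join(kept).casefold()


def folded_equal(a, b):
    """Compare two strings after folding both sides the same way."""
    return fold_for_search(a) == fold_for_search(b)
```

With this, `folded_equal("\u00e1", "a\u0301")` holds regardless of
which spelling of "á" the buffer or the search string uses, and a
stray ZWJ or directional control does not break the match.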

> >> Does anyone have suggestions? Maybe some defadvice magic?
> 
> You can use a defadvice around one of the isearch internal functions
> (check out the branch I mentioned) to implement something in elisp.
> And you can redefine the buffer's case-folding table and use that in
> the advice, but that will require that you generate the entire table.

Please don't kludge around the problem.  If it is important enough for
you to solve it, let's solve it as God intended.



