emacs-devel

Re: Character folding in the pretest


From: Paul Eggert
Subject: Re: Character folding in the pretest
Date: Thu, 4 Feb 2016 11:25:41 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 02/04/2016 09:45 AM, Eli Zaretskii wrote:
> We should instead cater to users who search text they _can_ read.

This depends on what one means by "read". I can "read" Swedish in the sense that I know where the word boundaries are and have some idea of how they're pronounced. I can also "read" Belarusian in the sense that I know Cyrillic and a bit of Russian and can follow Belarusian better than Swedish, though I easily get lost. In both cases, I'd prefer Unicode-type case folding even though it's "wrong" to ignore diacritics in the native languages.

Conversely, I can't "read" Hebrew or Chinese or Arabic in the same sense and so don't much care how folding works for those languages. Perhaps some Hebrew-speaking experts want פּ and פ and ף to be treated the same while searching, while other experts do not; it doesn't matter to me.

To help provide context here, most of my reading of non-English text is to support other free projects such as the tz database. That database is mostly English but contains short passages from other languages. I use Emacs for primary database maintenance, but often use other programs to browse the Internet as they're more convenient. I'll cut and paste out of a Firefox browser between a page of interest and Google Translate, for example. Examples of text under Emacs control include "Bahía", "Lịch hai thế kỷ", "中国科技史料", and "Новый счет времени". Most of the searching for this sort of thing in Emacs will involve typing strings like "bahia" and "lich" where I almost always prefer diacritic- and case-folded search.
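
To make concrete the kind of search described above, here is a rough sketch in Python. It is not how Emacs implements character folding; the helper names `fold` and `folded_search` are made up for illustration, and the approach (canonical decomposition, then dropping combining marks, then case folding) is only one plausible approximation of diacritic- and case-folded matching.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a diacritic- and case-folded copy of text (illustrative only)."""
    # Canonical decomposition (NFD) splits a letter such as "í" into the
    # base letter "i" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark (general categories Mn, Mc, Me).
    stripped = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    # casefold() is a more aggressive lower() intended for caseless matching.
    return stripped.casefold()


def folded_search(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack after folding both."""
    return fold(needle) in fold(haystack)


if __name__ == "__main__":
    for needle, haystack in [("bahia", "Bahía"), ("lich", "Lịch hai thế kỷ")]:
        print(needle, haystack, folded_search(needle, haystack))
```

With this sketch, typing "bahia" finds "Bahía" and typing "lich" finds "Lịch hai thế kỷ". It also folds the Cyrillic "й" to "и", since "й" decomposes into "и" plus a combining breve, which is the sort of result a native reader might call "wrong".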


