help-gnu-emacs

Re: File name completion on Mac OS X with German umlauts


From: Piet van Oostrum
Subject: Re: File name completion on Mac OS X with German umlauts
Date: Mon, 17 Mar 2008 15:07:58 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1.92 (darwin)

>>>>> Nikolaj Schumacher <n_schumacher@web.de> (NS) wrote:

>NS> Eli Zaretskii <eliz@gnu.org> wrote:
>>> It is only a ``problem'' if you accept the view that no two files in
>>> the same directory can have names that are pronounced identically.

>NS> No, it's not just that.
>NS> Certainly, you could have files "X" and "Ⅹ" (the Roman numeral).  Even
>NS> if they look the same there is no problem (other than likely user
>NS> confusion) in having both.

>NS> However, the two types of "ü" are the same character, or at least
>NS> functionally equivalent characters.  They should be considered equal.
>NS> But comparing them properly requires normalization
>NS> (cf. http://www.unicode.org/unicode/reports/tr15/).

>NS> OSX does normalization in its file system.  GNU/Linux apparently does not.

Right. In GNU/Linux a filename is just a sequence of bytes without
interpretation; the interpretation is done by the programs. Modern
GNU/Linux systems tend to use UTF-8 as the default interpretation, but if
you mount a filesystem from such a system on another one that has Latin-1
as the preferred encoding, your filenames with non-ASCII characters will
look weird. Likewise, a filename with ü as unnormalized UTF-8 (a single
precomposed character) and another with ü as normalized UTF-8 (u followed
by a combining diaeresis) would be different files, but in an ls listing
the two names would look identical. And I guess the normalized one would
not complete on ü, but it will on u, and the other one just the opposite.
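
To make the byte-level difference concrete, here is a minimal sketch in
Python (not Emacs code, and not something taken from this thread). It
assumes that "unnormalized" corresponds to Unicode form NFC and
"normalized" to NFD, and it uses a hypothetical helper, completes(), that
does the kind of plain prefix match a naive completion would do.

```python
import unicodedata

# The example name from this thread, written with a precomposed ü (U+00FC).
name = "r\u00fcckerstattung"

nfc = unicodedata.normalize("NFC", name)  # ü stays one code point
nfd = unicodedata.normalize("NFD", name)  # ü becomes u + U+0308

print(nfc == nfd)            # False: different code point sequences
print(nfc.encode("utf-8"))   # b'r\xc3\xbcckerstattung'
print(nfd.encode("utf-8"))   # b'ru\xcc\x88ckerstattung'


def completes(prefix, candidate):
    """Hypothetical naive completion test: a plain prefix match."""
    return candidate.startswith(prefix)


print(completes("r\u00fc", nfd))  # False: typed ü does not match u + U+0308
print(completes("ru", nfd))       # True: the decomposed name starts with "ru"
print(completes("r\u00fc", nfc))  # True
print(completes("ru", nfc))       # False
```

On a filesystem that treats names as raw bytes, the two encoded forms
printed above would be two distinct directory entries.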

On Mac OS X, however, the interpretation of filenames as UTF-8 is part
of the filesystem. It only stores the normalized version, even when you
pass the unnormalized one in a system call, so you can't have both in the
filesystem. Emacs, however, only uses the unnormalized version when you
enter characters in the normal way, and therefore the completion fails. For
it to succeed, Emacs would have to do the normalization first (there are OS
functions for this).
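
As a rough illustration of "do the normalization first", the following
Python sketch normalizes both the typed prefix and each candidate name
before comparing them. It is only a sketch under assumptions: the function
name normalized_completions and the sample names are hypothetical, this is
not how Emacs implements completion, and the standard NFD form used here
may differ in details from what the OS X filesystem actually stores.

```python
import unicodedata


def normalized_completions(typed, names, form="NFD"):
    """Return the names that start with `typed` once both are normalized.

    Hypothetical helper; `form` is an assumption about which normalization
    form the filesystem hands back.
    """
    key = unicodedata.normalize(form, typed)
    return [n for n in names
            if unicodedata.normalize(form, n).startswith(key)]


# A directory listing as a decomposing filesystem might return it.
names = ["ru\u0308ckerstattung", "readme.txt"]
typed = "r\u00fc"  # precomposed ü, as entered from the keyboard

print([n for n in names if n.startswith(typed)])  # [] (plain prefix match)
print(len(normalized_completions(typed, names)))  # 1
```

Normalizing both sides to the same form means the comparison no longer
depends on which form the user typed or which form the filesystem stored.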

>NS> Emacs must also be doing some normalization... switch-to-buffer
>NS> completion works on "rückerstattung" after all.  Only `read-file-name'
>NS> doesn't.  Hmm, maybe this /is/ an Emacs bug after all.

No, it doesn't do normalization. For buffers it is the same as for
filenames. But usually you don't have normalized buffer names (except for
those where normalized is the same as unnormalized of course). When you
create a file with name rückerstattung on OS X and open it from a directory
listing (where it shows as rückerstattung) you get a buffer name
rückerstattung. This will not complete from rü.
-- 
Piet van Oostrum <piet@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet@vanoostrum.org

