help-gnu-emacs

Re: UTF-8 in path / filename


From: James Cloos
Subject: Re: UTF-8 in path / filename
Date: Mon, 28 Aug 2006 11:11:14 -0400
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/23.0.0 (gnu/linux)

JimC> Doesn't Apple by default use NFD (Normalization Form Decomposed)
JimC> for filenames?  That would explain the <vowel><box> sequences.

Peter> Yes, that's the correct term for the way file names are
Peter> recorded in HFS+.

So then the problem is narrowed to support for composition.
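
For anyone who wants to see the difference between the two forms
outside of Emacs, here is a minimal sketch in Python (my choice of
tool, not something used in this thread).  It assumes only the standard
unicodedata module; the names nfc_name, nfd_name and code_points are
made up for illustration.  It takes the first three letters of the test
filename in precomposed form and decomposes them:

```python
import unicodedata

# Precomposed (NFC) spelling: a-umlaut, o-umlaut, u-umlaut, then ".txt".
nfc_name = "\u00e4\u00f6\u00fc.txt"

# Decomposed (NFD) spelling: each vowel becomes a base letter followed
# by U+0308 COMBINING DIAERESIS.
nfd_name = unicodedata.normalize("NFD", nfc_name)


def code_points(s):
    """Return the code points of s as a list of 'U+XXXX' strings."""
    return ["U+%04X" % ord(ch) for ch in s]


print(code_points(nfc_name))
print(code_points(nfd_name))
```

The decomposed string is three code points longer, one combining mark
per vowel.  A display that cannot compose shows each mark as a separate
box after its vowel, which would match the <vowel><box> sequences.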

I just gave it a test, running the unicode-2 branch on a Linux box,
using the en_US.UTF-8 locale.

I copied the filename you quoted (äöüæÆÜÖÄ.txt), gave it a prefix to
ease globbing (resulting in /tmp/xxx-äöüæÆÜÖÄ.txt), and ran find-file
on /tmp.  It worked correctly.  (Well, almost: the glyphs composed by
Emacs have twice the height of pre-composed glyphs.  There was a time
when Emacs didn't do that, but it is doing it again, including in
this buffer.  That looks to be specific to the combination of
--enable-font-backend and DejaVu Sans Mono, though.  With other fonts
I do not get visible accents, even though C-u C-x = claims it is
composing.  And without --enable-font-backend I get composed glyphs
which have correct vertical metrics.)

I also tested this:

  :; echo /tmp/xxx-a*

and got the filename, showing that bash treats the code points as
separate characters when globbing.  (Which also means I didn't
actually need the xxx- prefix, since a* will therefore match the
original filename....)
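
The same effect can be reproduced without a filesystem.  The sketch
below is an approximation, not bash itself: it uses Python's
fnmatch.fnmatchcase as a stand-in for shell globbing, and the helper
name matches_glob is hypothetical.  It assumes the filename on disk is
in NFD, as discussed above.

```python
import fnmatch
import unicodedata


def matches_glob(filename, pattern):
    """Case-sensitive, code-point-wise glob match (a stand-in for bash)."""
    return fnmatch.fnmatchcase(filename, pattern)


# The test filename in precomposed (NFC) form, written with escapes.
nfc = "xxx-\u00e4\u00f6\u00fc\u00e6\u00c6\u00dc\u00d6\u00c4.txt"
# The same name as a decomposing filesystem would store it (assumed NFD).
nfd = unicodedata.normalize("NFD", nfc)

print(matches_glob(nfd, "xxx-a*"))  # True: NFD name starts with plain 'a'
print(matches_glob(nfc, "xxx-a*"))  # False: U+00E4 is not 'a'
print(matches_glob(nfd[4:], "a*"))  # True: no prefix needed either
```

So a pattern beginning with a plain ASCII vowel only finds the file
when the name is stored decomposed; with a precomposed name the same
pattern misses it.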

So.  Does C-u C-x = claim to be composing for you?

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 0xED7DAEA6



