Implementing charset-aware X font names [was: Cyrillic vs UTF-8]

emacs-devel

From:	Stephen J. Turnbull
Subject:	Implementing charset-aware X font names [was: Cyrillic vs UTF-8]
Date:	Sat, 26 Apr 2003 12:55:49 +0900
User-agent:	Gnus/5.090016 (Oort Gnus v0.16) XEmacs/21.5 (cabbage)

>>>>> "Simon" == Simon Josefsson <address@hidden> writes:

    PROBLEMS> * Characters from the mule-unicode charsets aren't
    PROBLEMS> displayed under X.

    PROBLEMS> XFree86 4 contains many fonts in iso10646-1 encoding
    PROBLEMS> which have minimal character repertoires (whereas the
    PROBLEMS> encoding is meant to be a reasonable indication of the
    PROBLEMS> repertoire).

*sigh*  "iso10646" is not meant to be an indication of repertoire.
See section 13 of the ISO 10646 standard.  It's intended to fix the
ISO 8859 ambiguity.  There is a deficiency in XFree86, but it's not
that the fonts are incomplete; that's going to happen.  (Note the word
"implicit" in the XLFD standard: it refers to current national
encoding practice at definition time, not to UCSes.)  Why should a
Russian font designer provide Thai glyphs?  And what Thai in her right
mind would prefer those over native-designed fonts (without looking at
them)?

Instead, the font names and properties should provide encoding range
specifications in place of the useless "1" (which in ISO 10646-1 is
not really an encoding specification).  As a first take, I think a
reasonable way to do this would be to specify that for the iso10646
registry the encoding field of an XLFD name should contain a
comma-separated list of Unicode block names, or a comma-separated list
of hex ranges xxxx..yyyy (can't use hyphens for the ranges,
obviously).  As long as the XLFD is otherwise fully qualified (i.e.,
contains 14 hyphens), the block name format allows you to query with
"-*-*-*-*-*-*-*-*-*-*-*-*-iso10646-*CYRILLIC*" and guarantee sane
results.  Mostly "*-iso10646-*CYRILLIC*" should work OK, too.  With
the hex range format, the app has to work harder, querying with
"-*-*-*-*-*-*-*-*-*-*-*-*-iso10646-*" and checking for the ranges it
needs.  IIRC, since the actual font loaded is known to the server, you
could even have multiple such aliases, one for each block.  With
languages using multiple blocks (basically, all of them, since
everybody uses ASCII), you'd just want to be careful to query for the
"rare" blocks first.
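
The two query styles above can be sketched as follows.  This is only
an illustration in Python, not server or Emacs code: a real client
would get its candidate names from Xlib's XListFonts.  The font names,
the block spellings (CYRILLIC, BASIC_LATIN) and all helper names are
hypothetical; in particular, block names would have to be spelled
without hyphens for the field count to stay at 14.

```python
import re

# Hypothetical font names using the two proposed encoding-field formats.
BLOCK_FONT = "-misc-fixed-medium-r-normal--13-120-75-75-c-70-iso10646-BASIC_LATIN,CYRILLIC"
RANGE_FONT = "-misc-fixed-medium-r-normal--13-120-75-75-c-70-iso10646-0020..007E,0400..04FF"


def matches(pattern, font_name):
    """Match an X font-name pattern: '*' is any run, '?' is one character.

    Matching is case-insensitive, as X font-name matching is.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts) + r"\Z", re.IGNORECASE | re.DOTALL)
    return regex.match(font_name) is not None


def encoding_field(xlfd):
    """Return the last (14th) field of a fully qualified XLFD name."""
    if not xlfd.startswith("-") or xlfd.count("-") != 14:
        raise ValueError("not a fully qualified XLFD name: %r" % xlfd)
    return xlfd.split("-")[14]


def parse_ranges(encoding):
    """Parse 'xxxx..yyyy,xxxx..yyyy' into a list of (low, high) integers."""
    ranges = []
    for item in encoding.split(","):
        low, high = item.split("..")
        ranges.append((int(low, 16), int(high, 16)))
    return ranges


def covers(ranges, low, high):
    """True if a single advertised range contains all of low..high."""
    return any(lo <= low and high <= hi for lo, hi in ranges)


def first_font_for(font_names, blocks_rarest_first):
    """Try the rarest block first; return the first font name that matches."""
    for block in blocks_rarest_first:
        pattern = "-*-*-*-*-*-*-*-*-*-*-*-*-iso10646-*%s*" % block
        for name in font_names:
            if matches(pattern, name):
                return name
    return None
```

With the block-name format the pattern does all the work, e.g.
first_font_for([BLOCK_FONT], ["CYRILLIC", "BASIC_LATIN"]).  With the
hex-range format the client lists everything under iso10646 and then
calls covers(parse_ranges(encoding_field(name)), 0x0400, 0x04FF) on
each candidate itself.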

This would also allow Emacs and other smart apps to create virtual
fonts (i.e., in faces) by requesting Ryumin Light for the Han and Kana
blocks and Times-Roman for the Basic Latin and Latin-1 Supplement
blocks, as an alternative to X Font Sets.  (This would be nearly
trivial to implement in XEmacs since we use specifiers to implement
faces, and specifiers already do magic to connect charsets to font
registries.  I suppose it would be more work in GNU Emacs, but I
haven't looked at Emacs's font set code.)
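
As a rough picture of what such a virtual font amounts to, here is a
minimal Python sketch that assigns each character to a font by Unicode
block and splits a string into per-font runs.  It is not how XEmacs
specifiers or GNU Emacs font sets actually work; the table and the
helper names are hypothetical, and "Han" is assumed here to mean just
the CJK Unified Ideographs block.

```python
from itertools import groupby

# (first code point, last code point, font to request) per Unicode block.
BLOCK_FONTS = [
    (0x0000, 0x007F, "Times-Roman"),   # Basic Latin
    (0x0080, 0x00FF, "Times-Roman"),   # Latin-1 Supplement
    (0x3040, 0x309F, "Ryumin Light"),  # Hiragana
    (0x30A0, 0x30FF, "Ryumin Light"),  # Katakana
    (0x4E00, 0x9FFF, "Ryumin Light"),  # CJK Unified Ideographs (assumed "Han")
]


def font_for(ch):
    """Return the font assigned to ch's block, or None if no block matches."""
    code_point = ord(ch)
    for low, high, font in BLOCK_FONTS:
        if low <= code_point <= high:
            return font
    return None


def font_runs(text):
    """Split text into (font, substring) runs of consecutive same-font chars."""
    return [(font, "".join(chars)) for font, chars in groupby(text, key=font_for)]
```

A mixed string such as "Emacs " followed by kanji would come back as
two runs, one per font; characters outside the table get None, which a
real implementation would hand to some fallback.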

Does this look like something reasonable for Emacs (and XEmacs) to
implement on the client side?  If so, I'll play with it a bit (note
that implementing this server-side is simply a matter of editing
fonts.alias) and then put it in play with the X11 and XFree86
people.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
