help-gnu-emacs

Re: More unicode blocks?


From: Kevin Rodgers
Subject: Re: More unicode blocks?
Date: Wed, 28 Sep 2005 12:22:35 -0600
User-agent: Mozilla Thunderbird 0.9 (X11/20041105)

Shaddy Baddah wrote:
> Today, I finally did what I had resolved to do some time ago. I delved
> into emacs's unicode support facilities.
>
> I am a little disappointed, because it has become apparent that the
> unicode character set support is limited to 3 specific blocks of the
> full unicode character set, those being the blocks that start and end at
> the indexes expressed in mule-unicode-0100-24ff, mule-unicode-2500-33ff
> and mule-unicode-e000-ffff.
>
> The blocks that I am interested in are the CJK Unified Ideographs blocks,
> which start at Unicode index 0x4E00. Specifically, the characters that
> are shared with the character set encoded via the Big5 encoding scheme.

Perhaps you should try Emacs 22 (aka CVS Emacs).  Here are some items
from its etc/NEWS file:

---
*** The utf-8/16 coding systems have been enhanced.
By default, untranslatable utf-8 sequences are simply composed into
single quasi-characters.  User option `utf-translate-cjk-mode' (it is
turned on by default) arranges to translate many utf-8 CJK character
sequences into real Emacs characters in a similar way to the Mule-UCS
system.  As this loads a fairly big data on demand, people who are not
interested in CJK characters may want to customize it to nil.
You can augment/amend the CJK translation via hash tables
`ucs-mule-cjk-to-unicode' and `ucs-unicode-to-mule-cjk'.  The utf-8
coding system now also encodes characters from most of Emacs's
one-dimensional internal charsets, specifically the ISO-8859 ones.
The utf-16 coding system is affected similarly.

---
*** A new coding system `euc-tw' has been added for traditional Chinese
in CNS encoding; it accepts both Big 5 and CNS as input; on saving,
Big 5 is then converted to CNS.

---
*** New variable `utf-translate-cjk-unicode-range' controls which
Unicode characters to translate in `utf-translate-cjk-mode'.

---
*** iso-10646-1 (`Unicode') fonts can be used to display any range of
characters encodable by the utf-8 coding system.  Just specify the
fontset appropriately.
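
As a rough illustration of the gap described in the question (not Emacs
code), the Python sketch below checks a code point against the three
ranges implied by the charset names mule-unicode-0100-24ff,
mule-unicode-2500-33ff and mule-unicode-e000-ffff.  The range bounds are
read off those names, which is an assumption, and the helper names
`covering_charset`, `in_cjk_unified` and `big5_round_trip` are
hypothetical.  The CJK Unified Ideographs block is taken as
U+4E00..U+9FFF.

```python
# Ranges inferred from the charset names in the original post (assumption).
MULE_UNICODE_RANGES = {
    "mule-unicode-0100-24ff": (0x0100, 0x24FF),
    "mule-unicode-2500-33ff": (0x2500, 0x33FF),
    "mule-unicode-e000-ffff": (0xE000, 0xFFFF),
}

# Basic CJK Unified Ideographs block in the Unicode standard.
CJK_UNIFIED = (0x4E00, 0x9FFF)


def covering_charset(code_point):
    """Return the name of the range containing code_point, or None."""
    for name, (low, high) in MULE_UNICODE_RANGES.items():
        if low <= code_point <= high:
            return name
    return None


def in_cjk_unified(code_point):
    """True if code_point lies in the basic CJK Unified Ideographs block."""
    low, high = CJK_UNIFIED
    return low <= code_point <= high


def big5_round_trip(text):
    """Encode text as Big5 and decode it again using Python's big5 codec."""
    return text.encode("big5").decode("big5")
```

Under these assumptions, `covering_charset(0x4E00)` returns None: the
start of the CJK block falls in the gap between 0x33FF and 0xE000, even
though a character such as U+4E2D survives `big5_round_trip` unchanged.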

> I have no problems displaying and editing these characters under the
> big5 coding scheme, so they are obviously well supported by emacs (and
> its internal coding scheme, right?).
>
> So, what is the impediment, or perhaps rationale, behind the lack of
> support for the additional unicode blocks at this stage of Emacs
> development?
>
> Is it simply to do with someone having to implement some type of
> character translation tables, or is there/how much more is there to it?

Sorry, I don't know the answers to those questions.

--
Kevin Rodgers




