Re: How to convert char from Emacs-20 internal to UTF-8?
From: Adrian Robert
Date: Tue, 22 Mar 2005 12:30:26 -0500
On Mar 16, 2005, at 12:19 PM, Stefan Monnier wrote:
> I apologize for the "retro" question, but I was wondering if there was an
> easy way to convert a character in the Emacs-20 internal 19-bit encoding
> (from FAST_GLYPH_CHAR(glyph)) to UTF-8 (preferable) or straight Unicode.
> I'd like to do it fully within C if possible, and it needs to be efficient.
I found a way to do this using parts of the C program available at:
http://tclab.kaist.ac.kr/~otfried/Mule/
Basically, it uses a large table to convert from charset/byte1/byte2 to
Unicode, and then converts that to UTF-8. I call SPLIT_NON_ASCII_CHAR() to
get the charset and byte values out of the 19-bit internal representation
stored in the glyph. CCL was not needed, though it might have provided a
more compact way to solve the problem than a 250K table.
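For anyone following along, the last step (Unicode code point to UTF-8) is the easy part and needs no table. Below is a minimal sketch of it in plain C. The function name utf8_encode is my own and is not taken from the program linked above or from the Emacs sources; the charset/byte1/byte2 table lookup is assumed to have already produced the code point and is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name is illustrative, not from Emacs):
   encode one Unicode code point as UTF-8.
   Writes up to 4 bytes into out and returns the number of bytes
   written, or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        /* Surrogate halves are not valid code points on their own. */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The caller supplies a buffer of at least 4 bytes and appends the returned number of bytes to its output. Most CJK characters fall in the 3-byte range, so the per-character cost is a few shifts and masks on top of the table lookup.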
However, I still have an issue: for 2-byte characters, such as Big5 or
JIS Chinese characters, Emacs 20 is giving me two glyphs for each
character, with identical values. Is this because Emacs thinks the font
needs a double-wide horizontal space to render the character?
Thanks,
Adrian