emacs-devel

Re: Q: something like autoload for coding-systems?


From: Kenichi Handa
Subject: Re: Q: something like autoload for coding-systems?
Date: Wed, 14 Nov 2001 15:31:14 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1.30 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

Richard Stallman <address@hidden> writes:
>                 I've just checked how big the interval list
>     will be for GB2312 charset.  I tried to make a vector of the
>     forms [ (FROM-CHAR . TO-CHAR) ... ], and to optimize, if
>     FROM-CHAR == TO-CHAR, put FROM-CHAR in an element instead of a
>     cons.  Provided that each number consumes one word, and each
>     cons consumes three words, the vector roughly consumes 6300
>     words.  It's about 25K-byte.  It is surely smaller than the
>     whole mapping table.
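
The interval vector described in the quoted paragraph can be sketched roughly as follows. This is only an illustration in Python, not the Emacs implementation: the names `build_intervals`, `contains`, and `estimated_words` are hypothetical, a Python tuple stands in for a cons, and the cost model (one word per number, three words per cons) is taken directly from the estimate above.

```python
from bisect import bisect_right


def build_intervals(codes):
    """Collapse code points into a sorted list of entries.

    A run FROM..TO becomes a (FROM, TO) tuple (standing in for a cons);
    a run of length one is stored as the bare number, as in the
    optimization described above.
    """
    entries = []
    start = prev = None
    for c in sorted(set(codes)):
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            entries.append(start if start == prev else (start, prev))
            start = prev = c
    if start is not None:
        entries.append(start if start == prev else (start, prev))
    return entries


def entry_start(entry):
    """First code point covered by an entry."""
    return entry[0] if isinstance(entry, tuple) else entry


def contains(entries, c):
    """Binary-search the entries for code point c."""
    starts = [entry_start(e) for e in entries]  # could be cached
    i = bisect_right(starts, c) - 1
    if i < 0:
        return False
    entry = entries[i]
    if isinstance(entry, tuple):
        return entry[0] <= c <= entry[1]
    return entry == c


def estimated_words(entries, words_per_number=1, words_per_cons=3):
    """Size estimate using the assumed costs from the message."""
    return sum(
        words_per_cons if isinstance(e, tuple) else words_per_number
        for e in entries
    )


if __name__ == "__main__":
    table = build_intervals([1, 2, 3, 7, 10, 11])
    print(table, estimated_words(table), contains(table, 2))
```

With a sorted vector like this, a membership test costs a binary search over the entries rather than a scan.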

> Can these codes fit in 16 bits?

No.

> Depending on the patterns of coverage, other data types such as
> bitmaps and run-length-encodings of bitmaps could provide more dense
> representations, especially if they can be used for just parts of the
> code space.

It uses less space, but isn't it time-consuming to check
whether a specific character is included with such a format?
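
To make the lookup-cost question concrete, here is a small sketch in Python (again only an illustration, with hypothetical names such as `make_bitmap` and `rle_contains`, and not how Emacs stores anything). It assumes a code space of `size` consecutive code points starting at `base`. A plain bitmap answers a membership query with one index and one mask; a run-length-encoded bitmap needs either a scan over the runs or, as assumed here, a precomputed list of cumulative run ends that can be binary-searched.

```python
from bisect import bisect_right
from itertools import accumulate


def make_bitmap(codes, base, size):
    """One bit per code point in [base, base + size)."""
    bits = bytearray((size + 7) // 8)
    for c in codes:
        off = c - base
        bits[off >> 3] |= 1 << (off & 7)
    return bits


def bitmap_contains(bits, base, c):
    """Constant-time test: one byte index and one mask."""
    off = c - base
    if off < 0 or off >= len(bits) * 8:
        return False
    return bool(bits[off >> 3] & (1 << (off & 7)))


def run_lengths(bits, size):
    """Alternating run lengths, starting with a (possibly empty) run of 0s."""
    runs = []
    current, length = 0, 0
    for off in range(size):
        bit = (bits[off >> 3] >> (off & 7)) & 1
        if bit == current:
            length += 1
        else:
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return runs


def rle_contains(ends, base, c):
    """Membership test against cumulative run ends.

    ends[i] is the offset just past run i.  Even-numbered runs are 0s and
    odd-numbered runs are 1s, so the parity of the run index decides.
    """
    off = c - base
    if off < 0 or not ends or off >= ends[-1]:
        return False
    return bisect_right(ends, off) % 2 == 1


if __name__ == "__main__":
    base, size = 0x10, 32
    bits = make_bitmap({0x10, 0x11, 0x12, 0x20}, base, size)
    runs = run_lengths(bits, size)
    ends = list(accumulate(runs))
    print(runs, rle_contains(ends, base, 0x20), bitmap_contains(bits, base, 0x20))
```

Under these assumptions the run-length form costs a binary search per query, comparable to the interval vector, while the plain bitmap is faster but spends a bit on every code point in the range whether or not it is covered.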

> Another idea: instead of one table for each coding system, have a
> common table for characters covered by all the Chinese coding systems,
> and then a table of extra characters for each one.  Maybe this
> would save further space.

No.  I checked it for Big5, GB2312, and CNS.  Fewer than 10%
of Chinese characters belong to all of them, even if we
consider only the BMP of Unicode.
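
A check of this kind could be sketched as below. This is a hypothetical Python illustration, not the procedure actually used: `common_share` is an invented name, each character set is assumed to be available as a set of Unicode code points, and the share is measured against the union of all the sets, which is one possible reading of "all Chinese characters" here. The sample data is a toy and says nothing about the real charsets.

```python
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def common_share(charsets, bmp_only=True):
    """Fraction of characters (relative to the union) found in every set.

    charsets: iterable of collections of Unicode code points.
    bmp_only: if true, ignore code points outside the BMP.
    """
    sets = [set(s) for s in charsets]
    if bmp_only:
        sets = [{c for c in s if c <= BMP_MAX} for s in sets]
    union = set().union(*sets)
    if not union:
        return 0.0
    common = set.intersection(*sets)
    return len(common) / len(union)


if __name__ == "__main__":
    # Toy sets only; real input would be the decoded repertoires.
    a = {1, 2, 3, 0x20000}
    b = {2, 3, 4}
    c = {3, 5}
    print(common_share([a, b, c]))
```

If the share is small, a common table saves little, because most entries would still have to live in the per-coding-system tables.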

---
Ken'ichi HANDA
address@hidden



