emacs-devel

Re: What exactly is chinese-big5?


From: Kenichi Handa
Subject: Re: What exactly is chinese-big5?
Date: Fri, 18 Apr 2008 20:28:08 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/23.0.60 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)

In article <address@hidden>, Eli Zaretskii <address@hidden> writes:

> > In Emacs 22, you can read the written file by utf-8 and
> > search for U+FFFD.

> Is U+FFFD the _only_ character that will be produced for any codepoint
> that is unassigned in the Big5 code space?  That is, if I search for
> U+FFFD, will I find _all_ the places where the original file had
> something not belonging to Big5?

Not exactly.  U+FFFD is the only character that will be
produced for "any character that can't be unified with
Unicode".  Which Big5 characters can be unified with Unicode
is defined in subst-big5.el in Emacs 22 (I don't know which
Big5 version Dave used to make that file) and in
etc/charsets/BIG5.map in Emacs 23.  So, if the dialect of
Big5 is different from what is defined in those files, it's
possible that some character which the file's creator
considers to be Big5 is encoded as U+FFFD.
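
As a rough illustration of the "search for U+FFFD" step, here is a
minimal Python sketch that works outside Emacs.  It assumes the text has
already been read back as UTF-8; the function name find_replacements and
the sample string are made up for this example.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

def find_replacements(text):
    """Return (line, column) pairs, both 1-based, for every U+FFFD in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == REPLACEMENT:
                hits.append((lineno, col))
    return hits

# Hypothetical sample: two characters that could not be unified.
sample = "abc\nde\ufffdf\n\ufffd"
print(find_replacements(sample))  # [(2, 3), (3, 1)]
```

Each hit marks a place where the original character had no Unicode
mapping in the tables used, not necessarily a place where the original
bytes were invalid.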

> Also, assuming that I find one or more invalid characters, is there
> some encoding other than chinese-big5 that I should try, which could
> explain those problematic characters, besides those I mentioned in my
> original message?  This file came from Chinese speaking people, so
> there's little doubt it should include only strings that can be read
> by Chinese speakers.  Therefore, I wonder how come it does not
> translate cleanly into Unicode.  (I cannot ask the people who produced
> the file about these issues, since they seem to be pretty ignorant
> about that: they claimed the file was in UTF-8...)

That file may be GBK, whose code space is similar to but
wider than that of Big5.  However, GBK is supported only in Emacs 23.
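
One way to test a guess like this outside Emacs is to decode the raw
bytes with several candidate codecs and count how many replacement
characters each produces.  The sketch below uses Python's standard
codecs; the candidate list and the function names are assumptions made
for this example, and Python's mapping tables are not necessarily the
same as subst-big5.el or etc/charsets/BIG5.map.

```python
def count_undecodable(data, encoding):
    """Count U+FFFD characters produced when decoding data with encoding."""
    decoded = data.decode(encoding, errors="replace")
    return decoded.count("\ufffd")

def rank_encodings(data, candidates=("big5", "cp950", "big5hkscs",
                                     "gbk", "gb18030")):
    """Return (count, encoding) pairs sorted with the fewest failures first."""
    return sorted((count_undecodable(data, enc), enc) for enc in candidates)

# Hypothetical usage with bytes that really are GBK:
data = "中文".encode("gbk")
for count, enc in rank_encodings(data):
    print(enc, count)
```

A count of zero only shows that every byte sequence is assigned in that
codec.  Because the Big5 and GBK code spaces overlap, a file can decode
without errors under the wrong one and still produce the wrong
characters, so the result needs to be checked by someone who can read
the text.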

---
Kenichi Handa
address@hidden



