Re: Cyrillic vs UTF-8


From: Kenichi Handa
Subject: Re: Cyrillic vs UTF-8
Date: Mon, 28 Apr 2003 18:18:41 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Simon Josefsson <address@hidden> writes:

> Kenichi Handa <address@hidden> writes:
>>  Unfortunately, the current Emacs doesn't have a facility to
>>  detect UTF-8 byte sequence.  So, if we put UTF-8 the higher
>>  priority, all files are detected as UTF-8.  :-(

> I see.  Is this very difficult to solve, or why hasn't it?  The
> algorithm to detect UTF-8 is not that complicated.

Oops, I'm very sorry, I was wrong.  The current Emacs
contains built-in detectors for utf-8 and utf-16 (with BOM).
So giving UTF-8 a higher priority should cause no
problem.
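
To illustrate why a UTF-8 detector can safely sit high in the priority list, here is a minimal Python sketch of a well-formedness check. It is not the Emacs implementation (which is written in C); the function name `looks_like_utf8` is hypothetical, and the byte ranges follow the modern Unicode definition of well-formed UTF-8, which is an assumption rather than what Emacs 21 checks.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte, always valid.
            i += 1
            continue
        # Work out how many continuation bytes must follow, and the
        # allowed range of the first one (lo..hi).
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif 0xE0 <= b <= 0xEF:
            need = 2
            lo = 0xA0 if b == 0xE0 else 0x80  # reject overlong forms
            hi = 0x9F if b == 0xED else 0xBF  # reject surrogates
        elif 0xF0 <= b <= 0xF4:
            need = 3
            lo = 0x90 if b == 0xF0 else 0x80  # reject overlong forms
            hi = 0x8F if b == 0xF4 else 0xBF  # stay within U+10FFFF
        else:
            # Stray continuation byte or a lead byte that is never valid.
            return False
        if i + need >= n:
            return False  # sequence is truncated
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True
```

Text in a legacy 8-bit Cyrillic encoding such as KOI8-R almost never passes this check, because its high bytes rarely happen to form valid lead/continuation sequences.  That is why trying UTF-8 first does not cause every file to be detected as UTF-8.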

Richard Stallman <address@hidden> writes:
>     It seems binary is preferred over utf-8 and utf-16-* in
>     coding-category-list.  This seems extremely conservative.  I guess it
>     means UTF-8 can never be autodetected by default?

> That certainly seems undesirable.  Unless there is a specific reason
> why it needs to be this way, I agree with you that we should raise
> the priority of utf-8 and utf-16.

We can raise the priority of utf-16-le-with-signature and
utf-16-be-with-signature, but can't raise the priority of
utf-16-le, utf-16-be, or utf-16, because without a signature
(BOM) it's impossible to distinguish them from binary data.
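
The difference between the two cases can be sketched in Python.  This is only an illustration, not how Emacs does it; the function name `detect_utf16_signature` is hypothetical.  With a signature, the first two bytes decide the matter; without one, any even-length run of bytes is a candidate, so there is nothing to test.

```python
import codecs
from typing import Optional


def detect_utf16_signature(data: bytes) -> Optional[str]:
    """Return the UTF-16 byte order named by a leading BOM, or None."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    # No signature: the data may be UTF-16 or arbitrary binary,
    # and this check cannot tell which.
    return None
```

For example, `detect_utf16_signature(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le"))` identifies little-endian UTF-16, while the same text without the leading signature yields `None`.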

So, I've just installed these changes.


2003-04-28  Kenichi Handa  <address@hidden>

        * international/mule-cmds.el (reset-language-environment): Raise
        the priority of mule-utf-8, mule-utf-16-be-with-signature and
        mule-utf-16-le-with-signature.

        * international/mule-conf.el: Set coding-category-utf-16-be to
        mule-utf-16-be-with-signature, coding-category-utf-16-le to
        mule-utf-16-le-with-signature.  Raise the priority of
        coding-category-utf-8, coding-category-utf-16-be, and
        coding-category-utf-16-le.

---
Ken'ichi HANDA
address@hidden



