Re: Cyrillic vs UTF-8

emacs-devel

From:	Kenichi Handa
Subject:	Re: Cyrillic vs UTF-8
Date:	Sat, 26 Apr 2003 17:11:06 +0900 (JST)
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Simon Josefsson <address@hidden> writes:
> It seems binary is preferred over utf-8 and utf-16-* in
> coding-category-list.  This seems extremely conservative.  I guess it
> means UTF-8 can never be autodetected by default?  Is the unicode
> support so bad it shouldn't even be preferred over binary?  UTF-8 is
> well formed and restricted; detecting it properly (even compared to
> Latin-n) can be done well enough that failures rarely happen in
> practice.

> Can't we move binary down below UTF-8 in CVS?  IMHO we should move
> UTF-8 earlier still, since determining whether data is UTF-8 or not
> can be done with good probability.  Preferring binary over UTF-8 seems
> just wrong.

Unfortunately, the current Emacs has no facility to detect
UTF-8 byte sequences.  So if we give UTF-8 a higher
priority, every file is detected as UTF-8.  :-(
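
To illustrate the point, here is a minimal sketch in Python (not Emacs code) of what such a facility would need to do, and of why priority order matters without it.  The byte ranges follow the modern UTF-8 definition in RFC 3629, which is an assumption on my part and not necessarily what any Emacs version implements.  The names is_well_formed_utf8, detect, and accept_anything are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List, Optional


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if `data` is well-formed UTF-8 (RFC 3629 byte ranges)."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                      # ASCII, single byte
            i += 1
            continue
        # Choose the number of continuation bytes and the allowed range
        # of the FIRST continuation byte.  The narrowed ranges reject
        # overlong forms, UTF-16 surrogates, and values above U+10FFFF.
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF
        elif b == 0xED:
            need, lo, hi = 2, 0x80, 0x9F
        elif 0xE1 <= b <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF
        elif 0xF1 <= b <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F
        else:                             # 0x80-0xC1 or 0xF5-0xFF
            return False
        if i + need >= n:                 # sequence is truncated
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):      # remaining continuation bytes
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True


Detector = Callable[[bytes], bool]


def detect(data: bytes, priority: List[str],
           detectors: Dict[str, Detector]) -> Optional[str]:
    """Return the first name in `priority` whose detector accepts `data`."""
    for name in priority:
        if detectors[name](data):
            return name
    return None


def accept_anything(data: bytes) -> bool:
    """Stand-in for a detector that cannot reject any input."""
    return True


# "Privet" ("hello") in Cyrillic, encoded as KOI8-R.
koi8 = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("koi8_r")
priority = ["utf-8", "binary"]

# Without a real check, whatever comes first in the list always wins.
naive = {"utf-8": accept_anything, "binary": accept_anything}
naive_result = detect(koi8, priority, naive)

# With a real check, UTF-8 can safely sit above binary.
strict = {"utf-8": is_well_formed_utf8, "binary": accept_anything}
strict_result = detect(koi8, priority, strict)
```

With the stand-in detector, the KOI8-R sample is reported as "utf-8" simply because that entry comes first.  With the validating detector, the same bytes fall through to "binary".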

> There used to be (in Emacs 21.2) a PROBLEMS entry suggesting what you
> say, but it has been removed both in 21.3 and in CVS.  I thought that
> meant UTF-8 was better supported now, but this doesn't seem to be the
> case.

UTF-8 support has certainly improved, but not as much as
you expect.

By the way, all these problems are solved in emacs-unicode.
It is available from the CVS server under the branch tag
"emacs-unicode" (see
http://savannah.gnu.org/cvs/?group=emacs).

---
Ken'ichi HANDA
address@hidden
