Re: Possible UTF-8 CJK Regressions in Terminal Emulators

emacs-devel

From:	Kenichi Handa
Subject:	Re: Possible UTF-8 CJK Regressions in Terminal Emulators
Date:	Wed, 9 Jun 2004 16:37:12 +0900 (JST)
User-agent:	SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <address@hidden>, Dave Love <address@hidden> writes:

> Kenichi Handa <address@hidden> writes:
> >> Absolutely!  Then we can say "utf-8 is (almost) completely
> >> supported"...  I think this is a very important thing.
> >
> > I think "completely" is still too strong even with preceding
> > "(almost)".

> I know what you mean, but I think that's the sort of thing that
> encourages the established user confusion over encoding issues.

> UTF-8 per se is fully supported up to some limit on the code point.
> (I hope that's as large as the Emacs 22 maximum codepoint, but I don't
> remember.)

No, the current UTF-8 support is limited to U+10FFFF
(the maximum Unicode code point).

> Whether or not valid unicodes can be decoded into a
> character Emacs can actually encode/display/input properly is a
> different matter,

Ah, yes.  In that sense, we can say utf-8 encoding/decoding
is completely supported.

> and the feature should affect all relevant CCL
> coding systems, especially UTF-16.

Since the UTF-16 converter did not handle surrogate pairs
well, I've just fixed that too (not yet installed; I'm now
adding comments to the code).  Untranslatable characters are
decoded into their UTF-8 form, represented by a sequence of
eight-bit-graphic/control characters (the same way as in
UTF-8 decoding, so we can use utf-8-post-read-conversion).
The UTF-16 encoder encodes such a sequence back to the
original UTF-16 form.  So UTF-16 support is now at the same
level as UTF-8.
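
For readers who want the arithmetic behind surrogate pairs,
here is a minimal sketch in Python.  It is not the Emacs CCL
code; all function names are hypothetical.  The last two
helpers only illustrate the general idea of keeping a
character as its UTF-8 byte sequence and recovering the
code point from those bytes later.

```python
def decode_surrogate_pair(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits of the value (code point - 0x10000).
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def encode_surrogate_pair(cp):
    """Split a code point above the BMP into a high/low surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point outside U+10000..U+10FFFF: %#x" % cp)
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def utf8_bytes(cp):
    """Return the UTF-8 byte values of a code point (illustration only)."""
    return list(chr(cp).encode("utf-8"))


def from_utf8_bytes(byte_values):
    """Recover the code point from its UTF-8 byte values."""
    return ord(bytes(byte_values).decode("utf-8"))


if __name__ == "__main__":
    cp = decode_surrogate_pair(0xD834, 0xDD1E)
    print(hex(cp))
    print([hex(b) for b in utf8_bytes(cp)])
    print([hex(u) for u in encode_surrogate_pair(from_utf8_bytes(utf8_bytes(cp)))])
```

The round trip at the end mirrors the behaviour described
above: a pair is decoded, held as UTF-8 bytes, and encoded
back to the same two UTF-16 code units.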

---
Ken'ichi HANDA
address@hidden
