emacs-devel

Re: MML charset tag regression


From: Simon Josefsson
Subject: Re: MML charset tag regression
Date: Tue, 29 Apr 2003 01:05:12 +0200
User-agent: Gnus/5.09002 (Oort Gnus v0.20) XEmacs/21.4 (Portable Code, linux)

Kenichi Handa <address@hidden> writes:

>> I guess it is the Emacs X cut'n'paste code that somehow turns the
>> string into double-width Japanese characters.
>
> I don't think so.  There's no such code in Emacs that does
> such a conversion.

Emacs behaves differently from xterm, gnome-terminal, gedit, etc., though.

> I think galeon sends Emacs those Cyrillic characters
> encoded in COMPOUND_TEXT using the JISX0208 charset.
>
> Please try this:
>
> First, select some Cyrillic text in galeon.  Then type this
> in Emacs: C-x RET X raw-text RET C-y.  You'll see something
> like this: "ESC $ ( B ...".

I see ^[$(B'$'`'b'R'Q'i'V'S...
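
A quick way to check the JISX0208 theory outside Emacs is to decode the
bytes that follow the escape sequence.  The sketch below is only an
illustration: it assumes the payload is 7-bit JIS X 0208 byte pairs
(which is what "ESC $ ( B" designates), and it uses Python's euc_jp
codec as a stand-in decoder by setting the high bit of every byte.  The
helper name jisx0208_to_text is made up for this example.

```python
# Payload copied from the raw-text yank above, without the leading
# "ESC $ ( B" designation sequence.
payload = b"'$'`'b'R'Q'i'V'S"


def jisx0208_to_text(seven_bit: bytes) -> str:
    """Decode 7-bit JIS X 0208 byte pairs (hypothetical helper).

    EUC-JP stores each JIS X 0208 byte with the high bit set, so OR-ing
    0x80 into every byte lets the stock euc_jp codec do the lookup.
    """
    return bytes(b | 0x80 for b in seven_bit).decode("euc_jp")


decoded = jisx0208_to_text(payload)
print(decoded)
# Show the code points so the script of each character is visible.
print([hex(ord(ch)) for ch in decoded])
```

If every code point falls in the U+0400 block, the text really is
Cyrillic carried in the JIS X 0208 charset, which would explain why it
gets displayed with a double-width Japanese font.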

> Next, try this:
>
> First, select some Cyrillic text in galeon.  Then evaluate
> this in Emacs:
>    (decode-coding-string (x-get-selection 'PRIMARY 'UTF8_STRING) 'utf-8)
> I think you'll see single-width Cyrillic chars (you have to
> have an iso10646-1 font containing Cyrillic glyphs).

Yes, this works fine.

> Ideally, the requester should be able to request the type
> 'TEXT instead of the specific 'COMPOUND_TEXT or
> 'UTF8_STRING, and the requestee should return the text as
> whichever of these types can encode it: STRING,
> COMPOUND_TEXT, or UTF8_STRING (in this priority order).
>
> But, unfortunately, many X clients (requestees) don't behave
> like that.  If 'TEXT is requested, many return just "?????"
> even if the text can be correctly encoded by COMPOUND_TEXT
> or UTF8_STRING.

Is this a bug in that client?

Or maybe Emacs can detect that the TEXT request failed?  Is "?????"
some magic string Emacs can test for?  If it could detect this, it
could continue and try to ask for COMPOUND_TEXT or UTF8_STRING.
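
To make the idea concrete, here is a rough sketch of such a check.  It
is only a heuristic under stated assumptions: `fetch` is a hypothetical
callable that asks the selection owner for one target and returns the
raw bytes (or None), and a reply consisting solely of "?" is treated as
a failed conversion.  A selection that genuinely contains only question
marks would be misdetected.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical signature: target name in, raw reply bytes (or None) out.
Fetch = Callable[[str], Optional[bytes]]


def looks_like_failed_conversion(reply: Optional[bytes]) -> bool:
    """Heuristic: an empty reply or one made only of '?' counts as failed."""
    if not reply:
        return True
    return set(reply.strip()) == {ord("?")}


def fetch_with_fallback(
    fetch: Fetch,
    targets: Sequence[str] = ("TEXT", "COMPOUND_TEXT", "UTF8_STRING"),
) -> Optional[Tuple[str, bytes]]:
    """Try each target in order and return the first plausible reply."""
    for target in targets:
        reply = fetch(target)
        if not looks_like_failed_conversion(reply):
            return target, reply
    return None


# Stand-in for a client that answers TEXT with question marks.
fake_owner = {"TEXT": b"?????", "COMPOUND_TEXT": b"\x1b$(B'$'`"}
result = fetch_with_fallback(fake_owner.get)
print(result)
```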

This isn't the problem I'm seeing though.

> So, it is necessary for Emacs to request a specific type,
> 'COMPOUND_TEXT ('UTF8_STRING has been recently introduced in
> XFree86, and there are many clients that still don't
> support it).

What does XFree86 recommend applications use?  UTF8_STRING with
fallback to COMPOUND_TEXT?  Or TEXT?  Unless there is some
well-agreed-on, non-controversial recommendation on how
internationalized X11 cut'n'paste should work, all attempts to get a
complete system working seem futile.

> Recently, many gtk clients have started supporting UTF8_STRING
> without making COMPOUND_TEXT support better.  It may cause
> no problem between gtk clients because they will request
> only the type UTF8_STRING.  But it's too shortsighted an
> approach.  :-(

Ouch.  Some people claim that GTK2 supports both UTF8_STRING and
COMPOUND_TEXT
<http://mail.nl.linux.org/linux-utf8/2002-09/msg00115.html>, but
Galeon uses GTK2 and obviously doesn't produce good
COMPOUND_TEXT.

> The new encoding method using "Non-Standard Character Set
> Encodings" of COMPOUND_TEXT makes the Cyrillic case much
> more complicated.  In some cases (perhaps only in KOI8
> locales), X clients have recently started to encode Cyrillic
> characters as "ESC % / 0 ...".  They don't consider the
> situation where the requester is running in a different
> locale.  :-(

Do you mean the client sends data in a locale-specific charset via
COMPOUND_TEXT?  Ouch.

> Perhaps we should make Emacs request UTF8_STRING first
> if the locale is UTF8, and if that request fails,
> request COMPOUND_TEXT.

This sounds like a good idea to me.
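
For what it's worth, the proposed order could look roughly like the
sketch below.  It is not Emacs code: `fetch` is a hypothetical callable
that requests one target from the selection owner and returns raw bytes,
or None when the request fails, and the locale codeset is passed in as a
plain string.  Decoding the returned bytes is left to the caller.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: target name in, raw reply bytes (or None) out.
Fetch = Callable[[str], Optional[bytes]]


def target_order(locale_codeset: str) -> List[str]:
    """UTF8_STRING first in a UTF-8 locale, otherwise COMPOUND_TEXT only."""
    if locale_codeset.lower().replace("-", "") == "utf8":
        return ["UTF8_STRING", "COMPOUND_TEXT"]
    return ["COMPOUND_TEXT"]


def request_selection(
    fetch: Fetch, locale_codeset: str
) -> Optional[Tuple[str, bytes]]:
    """Return (target, raw bytes) for the first request that succeeds."""
    for target in target_order(locale_codeset):
        data = fetch(target)
        if data is not None:
            return target, data
    return None


# Stand-in for an older client that only knows COMPOUND_TEXT.
old_client = {"COMPOUND_TEXT": b"\x1b$(B'$'`"}
print(request_selection(old_client.get, "UTF-8"))
```

With a client that does support UTF8_STRING, the first request succeeds
and COMPOUND_TEXT is never asked for.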




