Re: converting between charsets
From: Alexander Kotelnikov
Subject: Re: converting between charsets
Date: Tue, 09 May 2006 09:41:08 +0400
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)
>>>>> On Mon, 08 May 2006 10:30:48 -0400
>>>>> "SM" == Stefan Monnier <address@hidden> wrote:
SM>
>> Let's first talk about encoding regions. Why does it not work with
>> encode-coding-region?
SM>
SM> It works. Any evidence that it doesn't?
I started this thread with a note about problems with
encode-coding-region:
>>>>> On Sun, 07 May 2006 13:52:08 +0400
>>>>> "AK" == Alexander Kotelnikov <address@hidden> wrote:
AK>
AK> I checked three different ways in which the characters to be
AK> converted can end up in an Emacs buffer:
AK> a. when I open a file that contains them.
AK> b. when I type the characters in and my keyboard layout in X is
AK> different from 'us'; for me it is normally 'ru' then.
AK> c. when I type them in after using toggle-input-method.
AK>
AK>
AK> The trouble is that encode-coding-region converts only in case
AK> (c). In (a) and (b), characters that need conversion are replaced
AK> with question marks. And even in (c), although the conversion is
AK> performed (if, for instance, I save the file afterwards, it turns out
AK> to be in koi8-r), the converted characters are shown in the buffer
AK> as octal escapes such as \321.
AK>
AK> So, it would be nice to get some help on this, thanks.
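The behaviour described above can be reproduced with a minimal Emacs Lisp sketch. It assumes koi8-r as the target and uses a made-up sample string; `encode-coding-region` is the function under discussion, while `unencodable-char-position` is an assumption about what the running Emacs provides, so it is guarded.

```elisp
;; Illustrative only: the sample text and the choice of koi8-r are assumptions.
(with-temp-buffer
  (insert "Привет")
  ;; Report the position of the first character koi8-r cannot represent,
  ;; or nil if every character is encodable.
  (when (fboundp 'unencodable-char-position)
    (message "first unencodable position: %S"
             (unencodable-char-position (point-min) (point-max) 'koi8-r)))
  ;; Replace the characters in the region with their koi8-r bytes, in place.
  (encode-coding-region (point-min) (point-max) 'koi8-r)
  ;; The buffer now holds encoded bytes rather than Cyrillic characters.
  (buffer-string))
```

After the call the region holds encoded bytes instead of characters, which may be why case (c) shows octal escapes such as \321; that reading is an interpretation, not something confirmed in this thread.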
>>>> 1. Pasting into an Emacs frame works strangely:
SM> What text did you paste? Where does it come from?
>> I type some Russian text in xterm and paste it into Emacs; have a look
>> at the attached screenshot.
SM>
SM> Oh, I see. I don't know enough of how this works to help you much further.
SM> If you hit C-u C-x = on the various chars (especially on two similar chars
SM> displayed with different fonts), you'll see that they come from different
SM> charsets (one is probably something like iso-8859-5 and the other may be
SM> unicode). Emacs-22 doesn't unify them by default. You can try to put
SM> (unify-8859-on-decoding-mode 1) in your .emacs. And you can also try to
SM> play with utf-fragment-on-decoding. And ask someone more knowledgeable
SM> about such problems.
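The two settings suggested above could go into .emacs roughly as follows. This is a sketch for the Emacs 21/22 series discussed here: both names are taken from the quoted message, the guards are an added precaution because other Emacs versions may not define them, and the value `t` for utf-fragment-on-decoding is an assumption.

```elisp
;; Sketch for ~/.emacs; names come from the suggestion quoted above.
;; Guarded, since not every Emacs version defines them.
(when (fboundp 'unify-8859-on-decoding-mode)
  ;; Unify ISO 8859 characters with their Unicode equivalents on decoding.
  (unify-8859-on-decoding-mode 1))

(when (boundp 'utf-fragment-on-decoding)
  ;; Set through Customize so that any setter attached to the option runs.
  ;; The value t is an assumption, not taken from the thread.
  (customize-set-variable 'utf-fragment-on-decoding t))
```

Text that is already in a buffer would presumably have to be re-read (for example by revisiting the file) before the effect shows in C-u C-x =.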
On the first character, which looks like a Latin T:
character: <I removed cyrillic character> (01212102, 332866, 0x51442)-A
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: 40 66
syntax: word
category: y:Cyrillic
buffer code: 0x9C 0xF4 0xA8 0xC2
file code: 0xD0 0xA2 (encoded by coding system mule-utf-8)
font: -monotype-courier new-medium-r-normal--13-94-99-99-m-80-iso10646-1
After the same character in the next line:
character: <I removed the cyrillic character shown with the wrong font> (0151664, 54196, 0xd3b4)
charset: japanese-jisx0208 (JISX0208.1983/1990 Japanese Kanji: ISO-IR-87)
code point: 39 52
syntax: word
category: Y:Cyrillic characters of 2-byte character sets j:Japanese
|:While filling, we can break a line at this character.
buffer code: 0x92 0xA7 0xB4
file code: not encodable by coding system mule-utf-8
font: -Misc-Fixed-Medium-R-Normal--14-130-75-75-C-140-JISX0208.1983-0
Something is not ok here...
SM> You could even M-x report-emacs-bug about it, since maybe the default config
SM> in a cyrillic locale should already take care of it.
SM>
>>>> Cyrillic input in emacs -nw in xterm still does not work if I just
>>>> change the X keyboard layout.
SM>
SM> That doesn't give us much to go on, does it? What does it do, other than
SM> "not work"?
SM>
>> It beeps.
SM>
SM> What does C-h l show after hitting a particular key?
M-P M-0 C-h l
--
Alexander Kotelnikov
Saint-Petersburg, Russia