emacs-pretest-bug

Re: Coding problem with Euro sign


From: Ralf Angeli
Subject: Re: Coding problem with Euro sign
Date: Tue, 13 Dec 2005 23:42:02 +0100
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (gnu/linux)

* David Hansen (2005-12-13) writes:

> On Tue, 13 Dec 2005 13:12:02 +0100 Ralf Angeli wrote:
>
>> Attached you can find a file with two 8-bit characters I extracted
>> from a file produced by Visual Studio under Windows.  The characters
>> should be u umlaut and the Euro sign.  Emacs does not seem to be able
>> to find the right coding system for it and displays it with
>> raw-text-dos.  I could not get the file displayed correctly by loading
>> it with iso-latin-1, iso-latin-9, or cp1251.  And I am not sure if
>> this is a problem of Emacs or if Visual Studio simply produced
>> garbage.
>
> The \200 (0x80) is EUR in windows-1252.  But I have no clue what
> the 0xc2 is doing there.

The 0xc2 seems to have gotten into the file while sending it.  It is
not there in the original and can be deleted without affecting the
outcome of choosing a coding system.

> With
>
> -*- coding: windows-1252; -*-
>
> in the first line or when you open the file with
>
> C-x RET c windows-1252 RET C-x C-f test.txt RET 
>
> the EUR gets displayed but there's still this 0xc2 (latin capital
> letter A with circumflex in latin-1/9 or windows-1252).

Really?  In my case this results in

,----
|   character: ь (01212154, 332908, 0x5146c, U+044C)
|     charset: [mule-unicode-0100-24ff]
|              (Unicode characters of the range U+0100..U+24FF.)
|  code point: [40 108]
|      syntax: w        which means: word
|    category: y:Cyrillic  
| buffer code: 0x9C 0xF4 0xA8 0xEC
|   file code: 0xFC (encoded by coding system windows-1251-dos)
|     display: by this font (glyph code)
|      -Misc-Fixed-Bold-R-Normal--18-120-100-100-C-90-ISO10646-1 (0x44C)
`----

for the u umlaut and

,----
|   character: Ђ (01212042, 332834, 0x51422, U+0402)
|     charset: [mule-unicode-0100-24ff]
|              (Unicode characters of the range U+0100..U+24FF.)
|  code point: [40 34]
|      syntax: w        which means: word
|    category: y:Cyrillic  
| buffer code: 0x9C 0xF4 0xA8 0xA2
|   file code: 0x80 (encoded by coding system windows-1251-dos)
|     display: by this font (glyph code)
|      -Misc-Fixed-Bold-R-Normal--18-120-100-100-C-90-ISO10646-1 (0x402)
`----

for the Euro sign.
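
The same two bytes can be checked outside Emacs.  What follows is only
a minimal Python sketch, assuming the file holds just the bytes 0xFC
and 0x80 shown in the "file code" lines above; the variable names are
purely illustrative.

```python
# The two bytes from the "file code" lines: 0xFC and 0x80.
data = bytes([0xFC, 0x80])

# windows-1251 (Cyrillic) decodes them to U+044C and U+0402,
# the same characters Emacs reported for windows-1251-dos.
as_1251 = data.decode("cp1251")

# windows-1252 (Western European) decodes them to u umlaut (U+00FC)
# and the Euro sign (U+20AC).
as_1252 = data.decode("cp1252")

for name, text in (("cp1251", as_1251), ("cp1252", as_1252)):
    print(name, [f"U+{ord(ch):04X}" for ch in text])
```

If the sketch is right, the Cyrillic characters are what one would
expect whenever the bytes are decoded as windows-1251 rather than
windows-1252.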

For the record, this is what I get if I load the file without
specifying a coding system beforehand (i.e. when it ends up being
displayed with raw-text-dos):

,----
|   character: ü (0374, 252, 0xfc)
|     charset: [eight-bit-graphic] (8-bit graphic char (0xA0..0xFF))
|  code point: [252]
|      syntax:          which means: whitespace
| buffer code: 0xFC
|   file code: 0xFC (encoded by coding system raw-text-dos)
|     display: by display table entry [?ü] (see below)
| 
| The display table entry is displayed by these fonts (glyph codes):
| ü: -Bitstream-Terminal-Medium-R-Normal--18-140-100-100-C-110-ISO8859-1 (0xFC)
`----

,----
|   character: € (0200, 128, 0x80, U+0080)
|     charset: [eight-bit-control] (8-bit control code (0x80..0x9F))
|  code point: [128]
|      syntax:          which means: whitespace
| buffer code: 0x80
|   file code: 0x80 (encoded by coding system raw-text-dos)
|     display: by this font (glyph code)
|      -Bitstream-Terminal-Medium-R-Normal--18-140-100-100-C-110-ISO8859-1 (0x80)
`----
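
A possible reason why iso-latin-1 and iso-latin-9 cannot show the Euro
sign here: neither assigns a printable character to 0x80.  The
following is a small Python sketch, assuming Python's codec tables
follow the ISO 8859 definitions; the names are illustrative only.

```python
# Same two bytes as in the file: 0xFC and 0x80.
data = bytes([0xFC, 0x80])

# Both ISO 8859-1 and ISO 8859-15 decode 0xFC to u umlaut but leave
# 0x80 as the C1 control character U+0080.
for codec in ("iso8859-1", "iso8859-15"):
    text = data.decode(codec)
    print(codec, [f"U+{ord(ch):04X}" for ch in text])

# In ISO 8859-15 (latin-9) the Euro sign is encoded as 0xA4, not 0x80.
euro_latin9 = "\u20ac".encode("iso8859-15")
print(euro_latin9.hex())
```

This would be consistent with 0x80 being classified as
eight-bit-control in the output above.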

-- 
Ralf




