Re: An iso-8859-6 cannot be saved
From: Peter Dyballa
Subject: Re: An iso-8859-6 cannot be saved
Date: Fri, 22 Sep 2006 01:02:37 +0200
On 21.09.2006 at 04:27, Kenichi Handa wrote:
In article <address@hidden>, Peter
Dyballa <address@hidden> writes:
My test file starts with: ;;; -*- coding: iso-8859-6; -*-
The mode-line starts with -6:
GNU Emacs 22.0.50 was started with -Q
When I try to save it, I get this in the minibuffer:
Selected encoding mule-utf-8-unix disagrees with iso-8859-6-unix
specified by file contents. Really save (else edit coding cookies
and try again)? (yes or no)
Then it's saved in UTF-8 and the mode-line changes to -u:. In another
editor (Smultron) I can load the file in ISO 8859-6 encoding and see
that its original encoding was changed to something like UTF-8 (two
octets where there was only one before).
iso-8859-6 is an Arabic charset. Didn't the buffer contain
a character that can't be encoded by iso-8859-6?
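The question above can also be checked outside Emacs. The following is only a minimal sketch, assuming Python's `iso8859_6` codec as an independent reference for the charset; the helper name `unencodable_chars` and the sample string are hypothetical and not taken from the test file.

```python
def unencodable_chars(text, encoding="iso8859_6"):
    """Return (index, char, 'U+XXXX') for every character the codec cannot encode."""
    problems = []
    for index, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, char, f"U+{ord(char):04X}"))
    return problems


# Hypothetical sample: ZAIN, SEEN, a space, and EURO SIGN (not part of ISO 8859-6).
sample = "\u0632\u0633 \u20ac"
print(unencodable_chars(sample))
```

An empty list would mean every character in the text fits into ISO 8859-6, so there would be no reason to fall back to another coding system.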
There seem to be even more bugs ...
Those characters that are displayed as boxes are described
incorrectly (oct 244, 254, 255, 273, 277-322). For example:
character: ْ (333618, #o1213462, #x51732, U+0652)
charset: mule-unicode-0100-24ff (Unicode characters of the range
U+0100..U+24FF.)
code point: #x2E #x32
syntax: w which means: word
category: b:Arabic
buffer code: #x9C #xF4 #xAE #xB2
file code: #xD2 (encoded by coding system iso-8859-6-unix)
display: by this font (glyph code)
-B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-
ISO10646-1 (#x652)
; oct dec hex UCS2 UTF-8
;=====================================
ْ = 322 = 210 = D2 = U+0632 = D8 B2 : ARABIC LETTER ZAIN
Note that GNU Emacs 22.0.50 reports U+0652 (ARABIC SUKUN) for file code
#xD2, which is incorrect. The correct code point for that octet is U+0632.
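The table row can be cross-checked against an independent implementation. This is a sketch that assumes Python's `iso8859_6` codec follows the same ISO 8859-6 table; the variable names are illustrative only.

```python
import unicodedata

# The octet from the table row: oct 322 = dec 210 = hex D2.
raw = bytes([0o322])

char = raw.decode("iso8859_6")
codepoint = f"U+{ord(char):04X}"
name = unicodedata.name(char)
utf8_hex = " ".join(f"{b:02X}" for b in char.encode("utf-8"))
print(codepoint, name, utf8_hex)

# For comparison: the octet this codec assigns to U+0652,
# the code point that Emacs reported for file code #xD2.
sukun_byte = "\u0652".encode("iso8859_6")
print(f"U+0652 -> {sukun_byte[0]:02X}")
```

Under that codec U+0652 sits at octet F2, that is #x20 above D2, which may be a hint about where the description goes wrong.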
The octets in the ranges oct 323-332 and 340-362 are displayed as
\<the oct value>. For these, C-u C-x = shows, for example:
character: ” (211, #o323, #xd3)
charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
code point: #xD3
syntax: which means: whitespace
buffer code: #xD3
file code: not encodable by coding system iso-8859-6-unix
display: by this font (glyph code)
-B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-
ISO10646-1 (#xFFFD)
Am I assuming the wrong ISO 8859-6 encoding?
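One way to test that assumption is to list what another implementation assigns to the two octet ranges mentioned above. A minimal sketch, again assuming Python's `iso8859_6` codec as the reference; `decode_octets` is a hypothetical helper.

```python
def decode_octets(first, last, encoding="iso8859_6"):
    """Map each octet in [first, last] to 'U+XXXX', or None if the codec leaves it undefined."""
    result = {}
    for value in range(first, last + 1):
        try:
            char = bytes([value]).decode(encoding)
            result[value] = f"U+{ord(char):04X}"
        except UnicodeDecodeError:
            result[value] = None
    return result


# The two ranges that Emacs displays as octal escapes.
for first, last in ((0o323, 0o332), (0o340, 0o362)):
    for octet, codepoint in decode_octets(first, last).items():
        print(f"oct {octet:o} = hex {octet:02X} -> {codepoint}")
```

If every octet in both ranges comes back with a code point, the file contents are valid ISO 8859-6 as far as that codec is concerned, and "not encodable" would not be explained by the charset itself.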
--
Greetings
Pete
Without vi there is only GNU Emacs