--- Begin Message ---
Subject: garbled unicode characters in M-x term
Date: Fri, 19 Sep 2008 21:34:51 +0200
User-agent: Mozilla-Thunderbird 2.0.0.16 (X11/20080724)
Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.
Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.
Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:
Problem: Under certain circumstances, multibyte characters in M-x term
become garbled and are displayed as single-byte escape sequences.
Example: Debian's aptitude (character U+2592)
From a post I made to gnu.emacs.help:
OK, I think I found the problem. term uses `binary' as its input coding.
After it has examined the input, it inserts the relevant/visible parts
of it into the buffer. Only at this point does it decode the bytes with
the appropriate coding (variable: locale-coding-system).
At some point it splits the input string to make it fit the
width of the `terminal'. The problem is that it measures bytes, not
characters. So the 3-byte character in question in aptitude, which is mostly
in the last column, gets split across two strings, one holding 1 byte and
the other 2 bytes. These two strings, when decoded and inserted
independently, result in what was described as the problem.
The solution would be to decode the string before checking its
length.
-ap
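The failure mode described above can be reproduced outside Emacs. The following is only a minimal Python sketch of the idea, not the term.el code (which is Emacs Lisp and is not shown in this report). The function names `split_by_bytes` and `split_by_chars` and the width of 4 are hypothetical, chosen for illustration. Python marks undecodable bytes with U+FFFD here, which stands in for the escape sequences Emacs displays.

```python
# Hypothetical illustration of the byte-vs-character split; not taken from term.el.
CHAR = "\u2592"  # the character drawn by aptitude; 3 bytes in UTF-8


def split_by_bytes(data: bytes, width: int):
    """Problematic order: cut the raw bytes at `width`, then decode each piece."""
    head, tail = data[:width], data[width:]
    return (head.decode("utf-8", errors="replace"),
            tail.decode("utf-8", errors="replace"))


def split_by_chars(data: bytes, width: int):
    """Suggested order: decode first, then cut at `width` characters."""
    text = data.decode("utf-8")
    return text[:width], text[width:]


# Three ASCII characters, the 3-byte character in the last column, then one more.
line = ("abc" + CHAR + "d").encode("utf-8")

buggy = split_by_bytes(line, 4)  # the cut lands inside the 3-byte sequence
fixed = split_by_chars(line, 4)  # the cut lands between whole characters
```

With a width of 4, the byte-based split leaves 1 byte of the character at the end of the first piece and 2 bytes at the start of the second, so neither piece decodes to U+2592. The character-based split keeps it intact.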
If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/share/emacs/22.2/etc/DEBUG for instructions.
In GNU Emacs 22.2.1 (i486-pc-linux-gnu, GTK+ Version 2.12.11)
of 2008-07-25 on raven, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure '--build=i486-linux-gnu' '--host=i486-linux-gnu'
'--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib'
'--localstatedir=/var/lib' '--infodir=/usr/share/info'
'--mandir=/usr/share/man' '--with-pop=yes'
'--enable-locallisppath=/etc/emacs22:/etc/emacs:/usr/local/share/emacs/22.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/22.2/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/22.2/leim'
'--with-x=yes' '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars'
'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g
-O2' 'LDFLAGS=-g' 'CPPFLAGS=''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Fundamental
Minor modes in effect:
shell-dirtrack-mode: t
auto-fill-function: do-auto-fill
show-paren-mode: t
savehist-mode: t
icomplete-mode: t
global-hi-lock-mode: t
hi-lock-mode: t
display-time-mode: t
tooltip-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
unify-8859-on-encoding-mode: t
utf-translate-cjk-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
Recent input:
C-x C-s M-x d i f f SPC u DEL C-g C-x o M-? m C-M-v
C-x k RET C-x C-g M-x d i f f RET RET t e r m . RET
C-x o C-v C-v C-v C-v C-v M-< M-x w o m a n RET d i
f f RET C-v C-v C-v M-v C-r i g n o r e C-r C-g C-x
b t e r C-s C-s C-g C-x o M-x C-g C-u M-x d i f f RET
RET t e r C-s RET w <return> C-x o C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-x o C-x
o M-< C-x k RET C-x o C-u M-x d i f f RET RET t e r
C-s RET DEL w <return> C-x C-g C-u C-g M-x d i f f
RET RET t e r m . RET C-x o C-v C-v C-v C-v C-v M-v
M-v M-v M-v M-v C-x o C-x C-w ~ / . e m / t e r m .
e l <return> C-x b f o RET C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n M-x r e p o SPC r t RET g r a <backspace>
<backspace> a r b e l e d DEL DEL DEL DEL l e d C-g
Recent messages:
Repeating command 1 other-window
Quit
Repeating command 1 other-window [2 times]
Saving file /home/andy/.emacs.d/term.el...
Wrote /home/andy/.emacs.d/term.el
Making completion list...
Loading emacsbug...done
Quit
--- End Message ---
--- Begin Message ---
Subject: Re: garbled unicode characters in M-x term
Date: Wed, 24 Sep 2008 20:07:46 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)
>>>> Thanks for the analysis. Could you try to write a patch to fix
>>>> this?
>>>>
>>> I did. It's a followup in the thread on emacs.bug .
>>
>> Hmm, I don't see your message. Could you please mail it directly to
>> me?
>
> Sure, here it comes:
The patch looks good. I've installed it into the Emacs CVS trunk, with
a few minor cosmetic changes. Thanks very much for debugging and fixing
this.
--- End Message ---