bug-gnu-emacs

bug#7410: Impossible multibyte->unibyte conversion


From: Stefan Monnier
Subject: bug#7410: Impossible multibyte->unibyte conversion
Date: Mon, 15 Nov 2010 16:46:53 -0500

Package: Emacs
Version: 24.0.50

I get incorrect treatment of accents in gnus-article-wash-html in
the trunk.  More specifically, accents from latin-1 HTML email get
turned into \NNN byte chars.

With extra checks, I can see that the accented chars are properly decoded
in the *mm*<4> buffer, and then in mm-shr we do

       (mm-with-part handle
         (when (and charset
                    (setq charset (mm-charset-to-coding-system charset))
                    (not (eq charset 'ascii)))
           (insert (prog1
                       (mm-decode-coding-string (buffer-string) charset)
                     (erase-buffer)
                     (mm-enable-multibyte))))
         (libxml-parse-html-region (point-min) (point-max)))

where mm-with-part inserts the `handle' part into a unibyte temp buffer,
thus turning those latin-1 accents back into bytes (well, in my own branch
of Emacs this signals an error instead, which is how I caught it).
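
To make the chars->bytes step concrete, here is a minimal sketch of what
I believe happens when already-decoded text is inserted into a unibyte
buffer in stock Emacs.  The helper name `my-demo-insert-into-unibyte' is
hypothetical and only for illustration; it is not part of Gnus or the mm-*
libraries, and it only assumes the temp buffer is unibyte as described above.

```elisp
;; Sketch only: `my-demo-insert-into-unibyte' is a hypothetical helper,
;; not a function from Gnus or mm-decode.
(defun my-demo-insert-into-unibyte (string)
  "Insert STRING into a unibyte temp buffer.
Return a cons (MULTIBYTE-P . CODES), where MULTIBYTE-P is the buffer's
multibyte flag and CODES is the list of element codes it then holds."
  (with-temp-buffer
    ;; Assumed to mirror the state of the temp buffer used by mm-with-part.
    (set-buffer-multibyte nil)
    ;; STRING may be multibyte; the buffer can only hold bytes.
    (insert string)
    (cons enable-multibyte-characters
          (append (buffer-string) nil))))

;; (string #xe9) builds a string holding the single character e-acute.
(my-demo-insert-into-unibyte (string #xe9))
```

If the result reports a nil multibyte flag and a single element for the
accented char, the buffer is holding a raw byte rather than a character,
which is what then shows up as \NNN in the article.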

It looks like mm-handle-buffer does not consistently return bytes (as
it usually does) but also occasionally returns chars.
Such inconsistencies will hurt until we get rid of them.


        Stefan

         


In GNU Emacs 24.0.50.1 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2010-11-04 on ceviche
Windowing system distributor `The X.Org Foundation', version 11.0.10707000
configured using `configure  'CFLAGS=-Wall -Wno-pointer-sign 
-DUSE_LISP_UNION_TYPE -DSYNC_INPUT -DENABLE_CHECKING -DXASSERTS -DFONTSET_DEBUG 
-g -O1 -I/usr/include/GNUstep' '--enable-maintainer-mode' 
'--with-x-toolkit=lucid''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: fr_CH.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Article

Minor modes in effect:
  diff-auto-refine-mode: t
  electric-pair-mode: t
  electric-indent-mode: t
  url-handler-mode: t
  global-reveal-mode: t
  reveal-mode: t
  auto-insert-mode: t
  savehist-mode: t
  minibuffer-electric-default-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<select-window> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <switch-frame> 
<select-window> <switch-frame> <switch-frame> <switch-frame> 
<select-window> <switch-frame> e ( p o p t - o - b 
u f f e r <backspace> <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> <backspace> 
<backspace> - t o - b u f f e r SPC " SPC * m m * < 
4 > > C-e <left> <left> <backspace> <return> M-< <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<switch-frame> <help-echo> <switch-frame> <select-window> 
<switch-frame> <help-echo> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <switch-frame> <select-window> <switch-frame> 
<switch-frame> <select-window> <switch-frame> <help-echo> 
<switch-frame> <select-window> <down-mouse-1> <mouse-1> 
<C-tab> C-s C-w C-w C-a <switch-frame> <help-echo> 
<down-mouse-2> <mouse-2> <switch-frame> <select-window> 
<switch-frame> <select-window> C-e C-c @ C-a <right> 
<down> <left> <right> <down> <left> <right> <down> 
<left> <right> <down> <left> <right> <up> <left> <right> 
<up> <left> <right> <down> <left> <right> <down> <down> 
<left> <right> <down> <left> <left> <left> <left> <right> 
<right> <right> <right> <left> <right> <up> <left> 
<right> <up> <left> <right> <down> <left> <right> <down> 
<left> <right> <down> <left> <right> <down> <left> 
<right> <down> <left> <right> <up> <left> <right> <up> 
<left> <right> <up> <left> <right> <up> <left> <right> 
<up> <left> <right> <switch-frame> <select-window> 
<switch-frame> <switch-frame> <help-echo> <switch-frame> 
<switch-frame> <switch-frame> <switch-frame> <help-echo> 
<switch-frame> <switch-frame> <select-window> <switch-frame> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <switch-frame> 
<select-window> <switch-frame> <switch-frame> <select-window> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<switch-frame> <switch-frame> <help-echo> <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <switch-frame> <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <switch-frame> <help-echo> 
<switch-frame> <switch-frame> <select-window> <switch-frame> 
<select-window> <switch-frame> <select-window> <switch-frame> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<help-echo> <switch-frame> <select-window> <switch-frame> 
<switch-frame> <select-window> <switch-frame> <select-window> 
<help-echo> <switch-frame> <select-window> <select-window> 
M-x r e p o <tab> r <tab> <return>

Recent messages:
Mark saved where search started
mm-shr
Mark saved where search started [3 times]
Mark set
mm-shr
Entering debugger...
#<buffer  *mm*<4>>
Mark set
Mark saved where search started
Making completion list...

Load-path shadows:
/usr/share/emacs23/site-lisp/bbdb/bbdb-migrate hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-migrate
/usr/share/emacs23/site-lisp/bbdb/bbdb hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb
/usr/share/emacs23/site-lisp/bbdb/bbdb-rmail hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-rmail
/usr/share/emacs23/site-lisp/bbdb/bbdb-gnus hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-gnus
/usr/share/emacs23/site-lisp/bbdb/bbdb-w3 hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-w3
/usr/share/emacs23/site-lisp/bbdb/bbdb-com hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-com
/usr/share/emacs23/site-lisp/bbdb/bbdb-merge hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-merge
/usr/share/emacs23/site-lisp/bbdb/bbdb-ftp hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-ftp
/usr/share/emacs23/site-lisp/bbdb/bbdb-sc hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-sc
/usr/share/emacs23/site-lisp/bbdb/bbdb-vm hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-vm
/usr/share/emacs23/site-lisp/bbdb/bbdb-gui hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-gui
/usr/share/emacs23/site-lisp/bbdb/bbdb-print hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-print
/usr/share/emacs23/site-lisp/bbdb/bbdb-hooks hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-hooks
/usr/share/emacs23/site-lisp/bbdb/bbdb-mhe hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-mhe
/usr/share/emacs23/site-lisp/bbdb/bbdb-whois hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-whois
/usr/share/emacs23/site-lisp/bbdb/bbdb-snarf hides 
/usr/share/emacs/site-lisp/bbdb/lisp/bbdb-snarf

Features:
(emacsbug gnus-topic cl-specs shr url-http url-auth url-gw footnote
xscheme warnings trace testcover scheme unsafep re-builder shadow
inf-lisp ielm comint ring elp edebug cust-print vc-bzr filecache
find-func dabbrev multi-isearch diff-mode jka-compr rect pp descr-text
gnus-fun skeleton canlock sha1 hex-util novice woman tutorial help-macro
man assoc info-look info help-at-pt ehelp apropos cus-edit cus-start
cus-load gnus-html browse-url xml url-cache mm-url url url-proxy
url-privacy url-expand url-methods url-history url-cookie url-util
supercite regi flow-fill executable copyright debug gnus-draft gnus-dup
mule-util sort smiley ansi-color gnus-cite mail-extr gnus-async
gnus-bcklg qp byte-opt bytecomp byte-compile gnus-ml disp-table nnfolder
utf-7 nnimap parse-time tls utf7 nndraft nnmh nnagent nnml gnus-agent
gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art mm-uu
mml2015 epg-config mm-view smime password-cache dig mailcap nntp
gnus-cache gnus-sum nnoo gnus-group time-date gnus-undo nnmail
mail-source format-spec server gnus-start gnus-spec gnus-int gnus-range
message sendmail rfc822 mml mml-sec mm-decode mm-bodies mm-encode
mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mailabbrev gmm-utils
mailheader gnus-win gnus gnus-ems nnheader mail-utils wid-edit noutline
outline easy-mmode flyspell ispell eldoc checkdoc regexp-opt thingatpt
help-mode easymenu view prog-mode electric url-handlers url-parse
auth-source netrc gnus-util url-vars mm-util mail-prsvr reveal
autoinsert uniquify advice help-fns advice-preload savehist
minibuf-eldef cl cl-loaddefs proof-site proof-autoloads pg-vars
bbdb-autoloads agda2 tooltip ediff-hook vc-hooks lisp-float-type mwheel
x-win x-dnd tool-bar dnd fontset image fringe lisp-mode register page
newcomment menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax font-core frame cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs
button faces cus-face files text-properties overlay md5 base64 format
env code-pages mule custom widget hashtable-print-readable backquote
make-network-process dbusbind dynamic-setting system-font-setting
font-render-setting x-toolkit x multi-tty emacs)




