Re: [VM] displaying text/html with utf-8 via w3m
From: Ralf Fassel
Subject: Re: [VM] displaying text/html with utf-8 via w3m
Date: Thu, 29 Jan 2015 10:57:51 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

* Uday Reddy <address@hidden>
| Ralf Fassel writes:
|
| > The umlaut-a has been replaced by a space. If I run the text manually
| > through w3m, the output still contains the utf-8 character, just the
| > HTML-markup is gone. But after inserting in the Presentation buffer the
| > umlaut is changed to a space.
|
| The first thing to check would be whether it is a problem with Emacs. What
| happens if you put the w3m output in a file and visit it in Emacs?
|
| If it is not a problem with Emacs, then it must be a bug in VM. Please file
| a bug report along with a sample message.

Ok, now I found the time to dig into this. It is a mismatch between the
encoding Emacs uses when writing to the w3m process (Emacs'
default-process-coding-system) and the encoding w3m is told to expect (-I).

The message itself has

  Content-Type: text/plain; charset="UTF-8"
  Content-Transfer-Encoding: Quoted-Printable

Now VM prepares the message for w3m by calling
  'vm-mime-display-internal-text/html'
which does
  (vm-mime-transfer-decode-region layout start end)
  (vm-mime-charset-decode-region charset start end)
So now the region contains utf-8.

Then w3m is called via
  'vm-mime-display-internal-w3m-text/html'
which uses 'shell-command-on-region'.

However, shell-command-on-region is documented as

  By default, the input (from the current buffer) is encoded using
  coding-system specified by `process-coding-system-alist', falling
  back to `default-process-coding-system' if no match for COMMAND is
  found in `process-coding-system-alist'.

In my setup process-coding-system-alist is nil, but
'default-process-coding-system' is (iso-latin-9-unix . iso-latin-9-unix).
So what is really sent to w3m is latin-9, while w3m is told to process the
input as UTF-8. The latin-9 non-ASCII bytes are not valid UTF-8, so they
end up replaced by " ".
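
To make the mismatch concrete, here is a small sketch that encodes the umlaut-a both ways and lists the resulting bytes. It only uses `encode-coding-string' and `string-to-list'; `iso-8859-15' is used here as the standard name for latin-9.

```elisp
;; (string #xe4) is the umlaut-a, U+00E4, as a one-character string.
(let* ((umlaut-a   (string #xe4))
       (as-latin-9 (encode-coding-string umlaut-a 'iso-8859-15))
       (as-utf-8   (encode-coding-string umlaut-a 'utf-8)))
  (list (string-to-list as-latin-9)   ; what Emacs sends here: (228)
        (string-to-list as-utf-8)))   ; what w3m -I UTF-8 expects: (195 164)
```

The lone byte 228 (0xE4) is the lead byte of a longer UTF-8 sequence that never gets completed, so a reader expecting UTF-8 cannot decode it as a character.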

If I temporarily bind
  (default-process-coding-system '(utf-8 . utf-8))
then the Presentation buffer contains the correctly decoded mail message.

IMHO 'vm-mime-display-internal-w3m-text/html' should temporarily adjust
the default-process-coding-system to match what w3m is told to expect.
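
A minimal sketch of what such a temporary adjustment could look like, not VM's actual code. The function name `my-vm-filter-region-as-utf-8' and the w3m command line in the usage note are assumptions for illustration. Instead of rebinding default-process-coding-system, it binds `coding-system-for-write' and `coding-system-for-read', which take precedence over both process-coding-system-alist and default-process-coding-system for the duration of the call.

```elisp
(defun my-vm-filter-region-as-utf-8 (start end command)
  "Replace the region START..END with the output of shell COMMAND.
The region is sent to COMMAND as UTF-8 and its output is decoded as
UTF-8, regardless of `default-process-coding-system'.
Hypothetical helper for illustration, not part of VM."
  (let ((coding-system-for-write 'utf-8)  ; encoding of data sent to COMMAND
        (coding-system-for-read 'utf-8))  ; decoding of COMMAND's output
    ;; OUTPUT-BUFFER nil and REPLACE t: the output replaces the region.
    (shell-command-on-region start end command nil t)))
```

It would be called on the already decoded region, for example as (my-vm-filter-region-as-utf-8 start end "w3m -dump -T text/html -I UTF-8 -O UTF-8"), assuming that is roughly how w3m is invoked; the charset given to -I and -O has to match the two bindings.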
Sample mail message available on request...
HTH
R'