Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages
From: David Levine
Subject: Re: [Nmh-workers] mojibake in UTF-8 encoded quoted-printable messages
Date: Thu, 24 Oct 2013 08:52:19 -0400
Joel wrote:
> I've noticed recently that I'm getting some mojibake in messages from
> a few sources. Both examples I have handy have a quoted-printable UTF-8
> encoded text/html part, and one also has a quoted-printable UTF-8
> encoded text/plain part.
>
> The one which is HTML only happens also to be in German, and what's
> getting munged are the umlauted vowels: e.g., I'm seeing "für" as
> "für" when I run show on the message. The other message has some curly
> apostrophes in it, so I see "Iââ¬â¢m" instead of "I'm".
>
> I manually decoded the quoted-printable HTML for the message in German
> and the quoted-printable text in the other message, and both appear to
> be correct UTF-8. The locale for my terminal is a UTF-8 locale, and it
> typically displays Unicode code points correctly. This makes it appear
> that the problem is with nmh. (This is with nmh-1.5-3 on Fedora 18.)
>
> Does anyone have an idea where the cause lies? I'd be happy to provide
> the problematic messages, if that would help.
Not offhand. There was a fix to the base64 decoder in June
2012, but it's already in nmh 1.5, was for big-endian
machines, and shouldn't affect quoted-printable.
The munged character in your first example looks like it's
supposed to be c3 bc c3, but instead is 83 c2 bc, if I did
that right. It takes more than one step to get from here to
there, such as losing bits and the wrong endianness?
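
One way to double-check byte sequences like these is to reproduce them. The sketch below is only an assumption about the mechanism, not something confirmed in this thread: it takes the correct UTF-8 bytes for "ü", misreads each byte as a Latin-1 character, and encodes the result as UTF-8 again. The helper name double_encode is hypothetical and is not part of nmh.

```python
# Hypothetical helper, not nmh code: simulate one round of "double encoding".
def double_encode(text: str) -> bytes:
    utf8 = text.encode("utf-8")        # the correct UTF-8 bytes
    misread = utf8.decode("latin-1")   # each byte wrongly taken as one Latin-1 character
    return misread.encode("utf-8")     # the misread text encoded as UTF-8 again


correct = "\u00fc".encode("utf-8")     # "\u00fc" is u-umlaut
munged = double_encode("\u00fc")

print(correct.hex(" "))  # c3 bc
print(munged.hex(" "))   # c3 83 c2 bc

# Undoing the assumed mistake recovers the original character.
repaired = munged.decode("utf-8").encode("latin-1").decode("utf-8")
```

If that assumption holds, the munged sequence is c3 83 c2 bc, which contains the 83 c2 bc run mentioned above, and a single wrong charset conversion would be enough to explain it.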
Maybe send a troublesome excerpt from the quoted-printable
example?
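
Before sending, an excerpt can be checked outside nmh with a few lines of Python. This is a minimal sketch using the standard quopri module; the sample bytes are made up for illustration and are not taken from Joel's message, and check_qp_utf8 is a hypothetical helper name.

```python
import quopri


def check_qp_utf8(excerpt: bytes) -> str:
    """Decode a quoted-printable excerpt and verify the result is valid UTF-8."""
    raw = quopri.decodestring(excerpt)
    # decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return raw.decode("utf-8")


sample = b"f=C3=BCr"                 # made-up excerpt for illustration
raw_bytes = quopri.decodestring(sample)
print(raw_bytes.hex(" "))            # 66 c3 bc 72
print(check_qp_utf8(sample))
```

If the decoded bytes are valid UTF-8 here but show is still munging them, that points away from the message itself.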
My first suggestion would be to try the nmh HEAD. It builds
easily and quickly on Fedora.
And I'd try with a profile that has just a Path.
David