emacs-devel

Re: Rmail-mbox branch


From: Stephen J. Turnbull
Subject: Re: Rmail-mbox branch
Date: Mon, 08 Sep 2008 13:57:40 +0900

Richard M. Stallman writes:
 >     >     It's the same difference as between text/plain and any of
 >     >     the other media types mentioned: you cannot preserve
 >     >     all the information in a text/html part while saving it
 >     >     in mbox format.
 >     > 
 >     > Why not?
 > 
 >     I have no clue what you could have in mind that you would ask
 >     that question.
 > 
 > I simply do not see why that claim would be true.
 > I thought you might provide an argument to support it.

Let me try again, this time bluntly.  I cannot determine from your
disyllabic question, nor from the tenfold increased reiteration above,
what claim you think I'm making.  Specifically, I don't know whether

a. you are unaware of the context of your question from my point of
   view and think I'm making a much broader claim than I intend, or

b. you are nearly completely ignorant of the relevant aspects of
   how email works and don't understand the claim, which is obviously
   true to those who work with MIME email, or perhaps

c. both, or even

d. something else.

Now, addressing confusion (a), paraphrased to compress thread context,
I understand the question as "can we achieve interoperability with
other MUAs, yet avoid (MIME-)decoding a message every time we present
it?"  The answer is "only if you're willing to destroy information"
for all MIME types except 'text/plain; charset=us-ascii'.

Addressing lack of knowledge (b), the multimedia functionality common
to Internet MUAs expects to be handed messages in a complex form
defined by RFCs 2821, 2822, 2045-2049, 2231, and several others of
subsidiary interest.  The first two RFCs establish the requirement
that mail shall be transmitted as plain ASCII text.[1]  The rest
provide various conventions for how to serialize quite arbitrary
objects, from non-ASCII text to video clips to binary blobs, as plain
ASCII text.  They are the *only* conventions commonly adopted by MUA
authors as defining non-ASCII-text email.[2]
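
To illustrate the serialization point, here is a minimal sketch using
Python's standard email package.  The addresses, the file name, and the
256-byte payload are made up for the example; nothing here is Rmail
code.  It shows an arbitrary binary blob going onto the wire as pure
ASCII and coming back unchanged after MIME decoding.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Arbitrary binary payload, including every non-ASCII byte value.
blob = bytes(range(256))

msg = EmailMessage()
msg["From"] = "sender@example.org"      # placeholder address
msg["To"] = "reader@example.org"        # placeholder address
msg["Subject"] = "binary attachment demo"
msg.set_content("See attachment.")
msg.add_attachment(
    blob,
    maintype="application",
    subtype="octet-stream",
    filename="blob.bin",                # hypothetical file name
)

# The serialized message is what would be stored or transmitted.
wire = msg.as_bytes()
wire_is_ascii = all(byte < 128 for byte in wire)

# A receiving MUA has to MIME-decode to get the original bytes back.
parsed = message_from_bytes(wire, policy=policy.default)
attachment = next(parsed.iter_attachments())
recovered = attachment.get_content()

print(wire_is_ascii, recovered == blob)
```

The attachment travels with a base64 Content-Transfer-Encoding, which is
what keeps the serialized form within ASCII.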

Therefore, it simply is not possible to save Emacs's presentation of
HTML in *any form other than an HTML MIME body* that can be *expected*
to be operated on usefully by other MUAs.  The only exception is a form
that strips out the HTML-specific formatting information, links,
multimedia, etc., and turns it into plain ASCII text, which all MUAs
must be able to handle.  Thus, to preserve all information, the message
needs to be left as an HTML MIME body in the mbox file, and therefore it
will need to be decoded every time it is presented.  It should be clear
that the same rationale applies to any emailed object which is not MIME
'text/plain; charset=us-ascii'.[3]
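
A small sketch of that trade-off, again in Python with only the standard
library.  The sample message, the TextOnly class, and the present()
function are invented for illustration and are not how Rmail does it.
The stored bytes stay MIME-encoded and are decoded at presentation time;
flattening the result to plain text is shown only to make the
information loss visible.

```python
from email import message_from_bytes, policy
from html.parser import HTMLParser

# A stored message: HTML body, quoted-printable encoded, ASCII on disk.
RAW = (
    b"From: sender@example.org\r\n"
    b"Subject: demo\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b'<p>See <a href=3D"https://example.org/">the caf=C3=A9 page</a>.</p>\r\n'
)


class TextOnly(HTMLParser):
    """Collect character data only, discarding tags and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def present(raw):
    """Decode the stored message each time it is shown."""
    msg = message_from_bytes(raw, policy=policy.default)
    return msg.get_content()


html = present(RAW)          # full HTML, link target intact

flattener = TextOnly()
flattener.feed(html)
flat = "".join(flattener.chunks).strip()   # link target is gone

print(html)
print(flat)
```

Saving `flat` in place of the original body would be readable by any
MUA, but the link target could not be recovered from it.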

I hope that makes it clear that the constraints on Rmail/mbox are such
that you can get the first two but not the third of these desiderata:

1) interoperability with other MUAs (including a future Rmail with
   real MIME capability);

2) preservation of all information in each message in the folder; and

3) saving the decoded form of each message in the folder in lieu of
   the MIME-encoded message.

I advocate abandoning the last, as it's not a perceptible efficiency
gain in the context of Rmail.  Stefan and Paul have both acknowledged
that the runtime cost of decoding is negligible.  It's also safe and
easy to provide a per-message cache in the form of RFC 2822 extension
fields in the message header, with the only constraint being to
respect the ASCII-only restriction of RFC 2822 for header fields.
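
As a rough sketch of such a cache (Python standard library; the field
name X-Example-Cached-Summary, the sample message, and the helper
functions are all hypothetical, and the body charset is assumed to be
UTF-8): the decoded text is stored in an extension header field as an
RFC 2047 encoded-word, so the header stays ASCII and the MIME-encoded
body is left untouched.

```python
from email import message_from_bytes
from email.header import Header, decode_header, make_header

CACHE_FIELD = "X-Example-Cached-Summary"   # hypothetical field name

RAW = (
    b"From: sender@example.org\n"
    b"Subject: demo\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Gr=C3=BC=C3=9Fe aus K=C3=B6ln\n"
)


def decoded_body(msg):
    """MIME-decode a single-part body (charset assumed to be UTF-8)."""
    return msg.get_payload(decode=True).decode("utf-8")


def add_cache(raw):
    """Return the message with the decoded text cached in a header."""
    msg = message_from_bytes(raw)
    summary = decoded_body(msg).strip()
    # RFC 2047 encoded-word: non-ASCII text expressed in ASCII only.
    msg[CACHE_FIELD] = Header(summary, "utf-8").encode()
    return msg.as_bytes()


def read_cache(raw):
    """Read the cached text back without touching the body."""
    msg = message_from_bytes(raw)
    return str(make_header(decode_header(msg[CACHE_FIELD])))


cached = add_cache(RAW)
cached_is_ascii = all(byte < 128 for byte in cached)
summary = read_cache(cached)
body_unchanged = (
    decoded_body(message_from_bytes(cached))
    == decoded_body(message_from_bytes(RAW))
)

print(cached_is_ascii, summary, body_unchanged)
```

Another MUA that does not know the field can ignore it, and the cache
can be dropped or rebuilt at any time because the body remains the
authoritative copy.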


Footnotes: 
[1]  Indeed there are non-ASCII-text "content transfer encodings", but
they are generally considered less reliable than the ASCII-based ones,
and in my experience they are less reliable.  In general, a private
agreement between mail agents is required for use of the non-ASCII-based
ones, as conformance is not universal.

[2]  With the likely exception of uuencoded file attachments, which I
believe even Microsoft Outlook still recognizes.

[3]  For practical purposes we can assume that message bodies which are
'text/plain; charset=XYZ' for XYZ=iso-8859-1, XYZ=iso-8859-15, and
XYZ=utf-8 will generally "just work", too.  But it's easy to imagine
situations where a MIME-conforming MUA might render such incorrectly
without a MIME Content-Type header to guide it, such as a Latin-2-
using locale.
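
A tiny Python sketch of the situation in footnote [3].  The render()
function and its locale_default parameter are made up for the example.
The same stored bytes come out as different text depending on whether
the declared charset or a locale default is used to decode them.

```python
def render(body, declared_charset=None, locale_default="iso-8859-2"):
    """Decode body bytes, falling back to a locale default when no
    charset was declared (hypothetical MUA behavior)."""
    return body.decode(declared_charset or locale_default)


# A Latin-1 body as it would sit in the folder.
body = "señor".encode("iso-8859-1")

with_header = render(body, declared_charset="iso-8859-1")
without_header = render(body)   # Latin-2 locale guesses wrong

print(with_header)
print(without_header)
```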




