emacs-devel

Re: Need some help with Rmail/mbox


From: Paul Michael Reilly
Subject: Re: Need some help with Rmail/mbox
Date: Fri, 19 Sep 2008 01:35:12 -0400
User-agent: Thunderbird 2.0.0.16 (X11/20080723)

Stephen J. Turnbull wrote:

Thanks for stepping up to this.  Your help is very much appreciated!

All you should need to do for a first cut is to copy the message to a
new buffer, which doesn't need to be narrowed, but might need to have
some Babyl sentinels added.

I first copy the relevant headers to the view buffer by collecting
them from the PMAIL buffer into a string and inserting the string into
the view buffer.  I used the rmail.el code pretty much as is, but
instead of copying and hiding I do a selective copy and insert (ignoring
the case of showing all headers, which is trivial).

Then I basically copy the message body into a string and insert it
into the view buffer.  But when I started to work on the decoding,
decoding the string before inserting it seemed like a good idea.
(Pardon my Elisp rustiness ... is it better to use buffer-to-buffer
copying than to insert a string?)  I copied the logic for this first
part of decoding from rmail-convert-to-babyl-format.

That's basically it.  You should do the processing on buffers, not
strings, though, and

Are you essentially answering my question above and saying that
copying buffer to buffer is faster/better than operating on strings?
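
For reference, a buffer-to-buffer copy that never builds an
intermediate string might look like the sketch below.  The function
name and the view buffer name are hypothetical, not taken from
rmail.el; `insert-buffer-substring' is the standard primitive doing
the work.

```elisp
;; Hypothetical helper, not part of rmail.el: copy the text between
;; START and END of SOURCE-BUFFER straight into a view buffer.
(defun my-rmail-copy-to-view-buffer (source-buffer start end)
  "Copy SOURCE-BUFFER text between START and END into a view buffer.
Return the view buffer.  The buffer name used here is an assumption."
  (let ((view (get-buffer-create "*my-rmail-view*")))
    (with-current-buffer view
      (let ((inhibit-read-only t))
        ;; Start from an empty view buffer for each message.
        (erase-buffer)
        ;; Copy directly between buffers; no string is created.
        (insert-buffer-substring source-buffer start end)
        (goto-char (point-min))))
    view))
```

START and END have to lie within the accessible portion of the source
buffer, so a narrowed mbox buffer would need to be widened (or the
bounds chosen inside the narrowing) before the call.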


 >        (decode-coding-string body (detect-coding-string body t))

you want to parse the coding from the *header*, not guess on the body.

I do parse out quoted-printable and base64 and apply these to the body
before doing the coding system based decoding.
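
A minimal sketch of that transfer-decoding step, done on a region
rather than a string.  The function name is hypothetical;
`quoted-printable-decode-region' comes from qp.el and
`base64-decode-region' is built in.  Any other
Content-Transfer-Encoding value is assumed to need no work.

```elisp
(require 'qp)  ; provides `quoted-printable-decode-region'

;; Hypothetical helper, not part of rmail.el.
(defun my-rmail-decode-transfer-encoding (beg end encoding)
  "Undo Content-Transfer-Encoding ENCODING on the region BEG..END.
ENCODING is the raw header value as a string, or nil.
Return non-nil if the region was decoded, nil if it was left alone."
  (let ((case-fold-search t))
    (cond
     ((null encoding) nil)
     ((string-match-p "\\`[ \t]*quoted-printable[ \t]*\\'" encoding)
      (quoted-printable-decode-region beg end)
      t)
     ((string-match-p "\\`[ \t]*base64[ \t]*\\'" encoding)
      (base64-decode-region beg end)
      t)
     ;; 7bit, 8bit, binary and unknown values: leave the text untouched.
     (t nil))))
```

Decoding changes the size of the region, so a caller that needs the
end of the body afterwards would want to hold it in a marker.  The
charset decoding then runs on the result of this step.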

If you want you can add guessing and/or user-specified MIME charsets
as a user option, but (a) almost all genuine mail today will contain
an appropriate Content-Type charset parameter, and (b) lack of such
(unless all text is US-ASCII) is an extremely strong indicator of
spam.  Few users will need to be able to read messages that have bogus
charset parameters: this feature is not immediately necessary.

OK, makes sense.

The general algorithm should be something like

Identify message in mbox buffer

yup

Copy message to presentation buffer

yup

Identify header and body, add Babyl sentinels if desired

Babyl sentinels?  I'm not sure what you mean by this.

Parse headers (specifically content type)

If you had said content type and content encoding I would have said
"yup" and that is what led to my request for help.  Except for the
case of quoted-printable and base64 I'm not sure how to parse those
two headers (Content-Type and Content-Transfer-Encoding) into a coding
system so that I can then do the decoding.  I'm assuming the coding
system guesswork becomes relevant for combinations of the two headers
that Rmail does not grok.  And I now see that there is a strong
relationship between charset and coding system.

Dispatch on content type and subtype:
    Case type is text and subtype is plain
        Identify charset parameter:
            (or charset-from-content-type "us-ascii")
        Map charset to Emacs coding-system
        (decode-coding-region (body-begin) (body-end) coding-system)

OK, this is helpful.  I assume that for all other type/subtype cases
we punt for now and use guessing or just raw text?  But certainly
there are some that we want to process/decode in some fashion,
e.g. text/html or text/xml.  Is there another Emacs package/library
that you are aware of that provides a good model for where we want to
take Rmail so that it handles more type/subtype cases seamlessly in
the view buffer? Even perhaps audio and video (not pure MIME,
i.e. multipart ... yet).
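
To make the text/plain case concrete, here is a rough sketch of the
charset-to-coding-system step.  All three function names are
hypothetical.  It assumes that a MIME charset name, downcased, is
usually also the name of an Emacs coding system, which holds for
common ones such as utf-8 and iso-8859-1 but not for every charset;
unknown names return nil so the caller can decide whether to guess or
show raw text.  The regexp is deliberately simple and does not handle
RFC 2231 parameter syntax.

```elisp
;; Hypothetical helpers, not part of rmail.el.
(defun my-rmail-charset-from-content-type (content-type)
  "Return the charset parameter of CONTENT-TYPE as a string, or nil.
CONTENT-TYPE is the raw header value, or nil if the header is absent."
  (let ((case-fold-search t))
    (and content-type
         (string-match "charset=\"?\\([^\";[:space:]]+\\)" content-type)
         (match-string 1 content-type))))

(defun my-rmail-charset-to-coding-system (charset)
  "Map MIME CHARSET (a string or nil) to an Emacs coding system.
A nil CHARSET defaults to us-ascii.  Return nil when Emacs has no
coding system of that name."
  (let ((sym (intern (downcase (or charset "us-ascii")))))
    (and (coding-system-p sym) sym)))

(defun my-rmail-decode-text-plain (beg end content-type)
  "Decode the region BEG..END according to CONTENT-TYPE's charset.
Return the coding system used, or nil if the region was left as is."
  (let ((coding (my-rmail-charset-to-coding-system
                 (my-rmail-charset-from-content-type content-type))))
    (when coding
      (decode-coding-region beg end coding))
    coding))
```

A fuller mapping would need a table of aliases for charsets whose MIME
name differs from the Emacs coding system name.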

        Wash header for presentation, eg:
            Hide non-displayed header
            Decode RFC 2047-encoded headers

OK, this is helpful, but I would add that non-displayed headers do not
need to be in the view buffer at all.  The view buffer contains either
all the headers or just the displayed headers, depending on what the
user currently wants.
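
The RFC 2047 step can lean on rfc2047.el, which ships with Emacs.  A
minimal sketch, where the wrapper name is hypothetical and
`rfc2047-decode-string' is the library function:

```elisp
(require 'rfc2047)  ; provides `rfc2047-decode-string'

;; Hypothetical wrapper, not part of rmail.el.
(defun my-rmail-decode-header-value (value)
  "Return header VALUE with RFC 2047 encoded-words decoded.
Text without encoded-words is returned unchanged."
  (rfc2047-decode-string value))
```

When the headers are already in the view buffer,
`rfc2047-decode-region' from the same library works on a region
instead of a string.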

        Wash body for presentation, eg:
            Highlight and activate url-like substrings
            Highlight quoted material

I don't believe Rmail does either of these operations now.  Is that
your understanding?  If I'm right and this washing is not done, then
adding it is very high on my priority list.  If I'm wrong, then
please point me to where it gets done or how to enable it.
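
As a rough idea of what such washing could look like, here is a
sketch using overlays and buttons.  Both function names, the overlay
property, the button property and the regexps are assumptions made for
illustration; the URL pattern in particular is far looser than a real
one would need to be.

```elisp
;; Hypothetical helpers, not part of rmail.el.
(defun my-rmail-highlight-quoted-lines ()
  "Put a face on lines starting with \">\" in the current buffer.
Return the number of lines highlighted."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "^[ \t]*>.*$" nil t)
        (let ((ov (make-overlay (match-beginning 0) (match-end 0))))
          (overlay-put ov 'face 'font-lock-comment-face)
          ;; Tag the overlay so it can be found and removed later.
          (overlay-put ov 'my-rmail-quote t))
        (setq count (1+ count))))
    count))

(defun my-rmail-activate-urls ()
  "Turn http(s) URLs in the current buffer into clickable buttons.
Return the number of URLs found."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "https?://[^ \t\n<>\"]+" nil t)
        (make-button (match-beginning 0) (match-end 0)
                     'my-rmail-url (match-string-no-properties 0)
                     'follow-link t
                     'action (lambda (button)
                               (browse-url
                                (button-get button 'my-rmail-url))))
        (setq count (1+ count))))
    count))
```

Emacs also ships `goto-address-mode', which activates URLs and mail
addresses in a buffer and may be enough for the URL half on its own.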

Display buffer in window

yup

Thanks again,

-pmr





