
Re: [Nmh-workers] Thoughts: header/address parsing


From: Ken Hornstein
Subject: Re: [Nmh-workers] Thoughts: header/address parsing
Date: Sun, 03 Aug 2014 01:34:26 -0400

I forgot to mention ... thinking about it more, the parser to handle the
headers could probably be written purely in flex.  That alone would simplify
error handling.

>> - but I think just taking the RFC 5322 ABNF and translating
>>   it into a bison grammar shouldn't be too bad.
>
>That alone is worth the effort, I think.

One thing occurs to me.  Right now nmh can handle two forms of email
addresses:

        Real Name <address@hidden>

and:

        address@hidden (Real Name)

The latter is super-obsolete, and has special handling in nmh; it ends
up in the format engine marked as a "note", and there's some special
code that deals with that when you're using things like %(friendly).
Of course we should parse it, but it would make things a lot easier if
we just treated things in parentheses as comments (like they're supposed
to be) and just pretended that the address doesn't have a real name field.
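
To illustrate the "parentheses are just comments" approach, here is a minimal
sketch in Python (not nmh code; the function name strip_comments and the
example.com addresses are made up for illustration). It assumes RFC 5322-style
comments: they may nest, a backslash escapes the next character, and
parentheses inside a quoted string are not comments.

```python
def strip_comments(text):
    """Return (text_without_comments, comments) for an address-like string.

    Illustrative sketch only: it collapses runs of whitespace in the result,
    which a real parser must not do inside quoted strings.
    """
    out = []        # characters outside any comment
    comments = []   # text of each top-level comment, in order
    current = []    # characters of the comment being collected
    depth = 0       # nesting depth of parentheses
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and (in_quotes or depth > 0):
            # quoted-pair: the escaped character is taken literally
            if depth > 0:
                current.append(text[i + 1])
            else:
                out.append(text[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            if ch == "(":
                depth += 1
                current.append(ch)
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    comments.append("".join(current))
                    current = []
                else:
                    current.append(ch)
            else:
                current.append(ch)
        elif ch == '"':
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
        i += 1
    return " ".join("".join(out).split()), comments


print(strip_comments("user@example.com (Real Name)"))
print(strip_comments("Real Name <user@example.com>"))
```

With this treatment the obsolete form reduces to a bare address plus a
discarded (or separately reported) comment, so no "note" special case is
needed downstream.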

>> - Parsing a MIME header/param list.  Right now the parser for this
>>   is awful; and I say that as someone who had to add support for
>>   parsing out the RFC 2231 parameter extensions.  I'm not so crazy
>>   about blowing all of that work up, but you know what?  I think it
>>   would just be easier in the long run to deal with it if it was
>>   based on bison.
>
>That one (MIME parser) will take a _lot_ of work.

Looking at it ... the grammar is not huge.  I think it will be easier
than the RFC 5322 address parser.  Yes, there is special handling in
terms of dealing with RFC 2231, but that is relatively straightforward;
it should be easy to recognize those cases in the grammar and handle them.
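
As a rough illustration of how recognizable the RFC 2231 cases are, here is a
small Python sketch (not nmh code; split_param_name and decode_extended are
hypothetical helper names). It assumes the RFC 2231 conventions: a parameter
name may carry a "*N" section number for continuations and a trailing "*"
marking an extended value of the form charset'language'percent-encoded-text.

```python
import re
from urllib.parse import unquote

# name, optional "*<section>", optional trailing "*" (extended value)
PARAM_NAME_RE = re.compile(r"([^*]+)(?:\*(\d+))?(\*)?")


def split_param_name(name):
    """Split an RFC 2231 parameter name into (base, section, extended)."""
    m = PARAM_NAME_RE.fullmatch(name)
    if m is None:
        raise ValueError("not a valid parameter name: %r" % name)
    base, section, star = m.groups()
    return base, (int(section) if section is not None else None), star is not None


def decode_extended(value):
    """Decode charset'language'percent-encoded-text into (language, text)."""
    charset, language, encoded = value.split("'", 2)
    # Assumption for this sketch: fall back to us-ascii if no charset is given.
    return language, unquote(encoded, encoding=charset or "us-ascii")


print(split_param_name("filename"))      # plain parameter
print(split_param_name("filename*0*"))   # first section, extended value
print(decode_extended("us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A"))
```

The point is only that the three name shapes (plain, sectioned, extended) can
be told apart by the name alone, which is what a grammar rule would key on.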

Of course, noting what Robert Elz said earlier ... the RFC 2045 grammar
has to be interpreted in the context of RFC 822, and spaces are allowed
between any two tokens.  Sigh.
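
One way to cope with whitespace being legal between any two tokens is to have
the lexer swallow it, so the grammar never sees it. A minimal Python sketch of
that idea follows (not nmh code and not flex; tokenize is a made-up name). It
assumes the RFC 2045 tspecials set and does not handle parenthesized comments
or control characters.

```python
import re

# Alternatives are tried in order: whitespace, quoted-string, token, tspecial.
LEX_RE = re.compile(r'''
    \s+                                     # whitespace: matched, then dropped
  | (?P<quoted>"(?:[^"\\]|\\.)*")           # quoted-string with quoted-pairs
  | (?P<token>[^\s()<>@,;:\\"/\[\]?=]+)     # run of non-tspecial characters
  | (?P<special>[()<>@,;:\\"/\[\]?=])       # a single tspecial
''', re.VERBOSE)


def tokenize(header_value):
    """Split a MIME header value into tokens, discarding inter-token spaces."""
    tokens = []
    pos = 0
    while pos < len(header_value):
        m = LEX_RE.match(header_value, pos)
        if m is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        if m.lastgroup is not None:   # None means the whitespace branch matched
            tokens.append(m.group())
        pos = m.end()
    return tokens


print(tokenize('text/plain ; charset = "utf-8"'))
print(tokenize('text/plain;charset="utf-8"'))
```

Both calls yield the same token list, so the parser proper can be written
against the RFC 2045 grammar without mentioning whitespace at all.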

--Ken


