Re: [Nmh-workers] Thoughts: header/address parsing
From: norm
Subject: Re: [Nmh-workers] Thoughts: header/address parsing
Date: Sun, 03 Aug 2014 12:01:01 -0700
Ken Hornstein <address@hidden> writes:
>Again, more technical details here.
>
>Address parsing in nmh is kind of a mess. We still support RFC 733 syntax
>"address at host", UUCP stuff, source routing ... a bunch of stuff. This
>should be fixed.
>
>m_getfld() is the handler for generically parsing the headers of an email
>message. Everyone agrees that it pretty much sucks and is overused.
>Thankfully the worst part of it (peeking inside of stdio internals) has
>been fixed; thanks, David!
>
>I've been thinking about biting the bullet and simply writing a header
>parser in flex/bison (I'm assuming flex/bison because those have
>features that make this a lot easier to implement; you don't need
>either to build from a distribution, because Automake keeps around
>the generated C files for the distribution tar file). But practical
>concerns rear their ugly heads again; for one, error recovery is kind of
>complicated. But it occurs to me that maybe I'm trying to bite off more
>than I can chew, and maybe I should try breaking this down a bit. It
>occurs to me that there are really five distinct grammars that we should
>think about:
>
>- Parsing a sequence of message headers. This is really what m_getfld()
>does now. This grammar could be pretty simple. We could use this to
>stuff headers inside of the "new" message API, discussed previously.
>The headers wouldn't be interpreted yet.
>
>- Parsing an address header. This is by far the most complicated part
>of the parser, but I think just taking the RFC 5322 ABNF and translating
>it into a bison grammar shouldn't be too bad.
>
>- Parsing a date header. We have a lex parser that does this now; it occurs
>to me that it should really be a bison grammar, but whatever. Solvable
>problem.
>
>- Parsing a MIME header/param list. Right now the parser for this is awful;
>and I say that as someone who had to add support for parsing out the
>RFC 2231 parameter extensions. I'm not so crazy about blowing all of
>that work up, but you know what? I think it would just be easier
>in the long run to deal with it if it was based on bison.
>
>- Parsing a mhbuild directive. These are kind of like a MIME header, but not
>exactly. The grammar for this is actually pretty weird and picky. Right
>now it's overloaded on the MIME header parser, but it occurs to me that
>there's no reason that should be the case.
>
>The other headers ... well, I guess I don't see a reason why we need to parse
>them. If the message-id header doesn't match the RFC 5322 syntax, should
>we care? I say no.
>
>Modern flex/bison implementations can handle multiple parsers in one
>program, so that's not an issue. This would also let us get rid of the
>horrible fixed buffer sizes we have now.
>
>Thoughts? Completely open to ideas here. I remember people saying that
>they had a list of messages that nmh dealt poorly with; it would be nice
>to try those out against a hypothetically-new nmh parser.
>
>--Ken
I'm wondering whether, in doing this, you might consider a new nmh command that
would parse message headers. I suppose there are dozens of scripts out there
that do some of this. I'm guessing that they are mostly ad hoc and buggy.
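
To make the idea concrete, here is a minimal sketch, in Python, of the sort of
thing those scripts tend to attempt: split the header block into name/value
pairs and unfold continuation lines, leaving the values uninterpreted. The
function name parse_headers and the tab-separated output are assumptions made
up for illustration; nothing here is an existing nmh command or API. Unfolding
is also simplified (each fold becomes a single space), and addresses, dates,
and MIME parameters are not parsed at all.

```python
import io


def parse_headers(stream):
    """Return (name, value) pairs from the header block of a message.

    Hypothetical helper, not part of nmh. Values are unfolded but
    otherwise left uninterpreted.
    """
    headers = []
    for raw in stream:
        line = raw.rstrip("\r\n")
        if line == "":
            break  # a blank line ends the header block
        if line[0] in " \t":
            # Continuation line: append it to the previous header's value.
            if not headers:
                raise ValueError("continuation line before any header")
            name, value = headers[-1]
            headers[-1] = (name, (value + " " + line.strip()).strip())
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        headers.append((name.strip(), value.strip()))
    return headers


if __name__ == "__main__":
    sample = (
        "From: norm\n"
        "Subject: Re: [Nmh-workers] Thoughts:\n"
        " header/address parsing\n"
        "\n"
        "Body text is never read.\n"
    )
    # One header per line, name and unfolded value separated by a tab.
    for name, value in parse_headers(io.StringIO(sample)):
        print("%s\t%s" % (name, value))
```

Even this much has to decide what to do with a line that has no colon or a
continuation line with nothing before it, which is where ad hoc scripts tend
to go wrong.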
Norman Shapiro