Re: [Ifile-discuss] Re: html tag stripping
From: David Bushong
Subject: Re: [Ifile-discuss] Re: html tag stripping
Date: Sun, 6 Jul 2003 19:51:01 -0700
User-agent: Mutt/1.4i
Looks good to me! Until such a beast arrives, I'm trying out a quick Perl
preprocessor I wrote using some ideas suggested by various people. I'm not
actually sure (yet) whether I'm getting better results; I'll probably go back
and run it on my old spam to see how much it improves or degrades accuracy, but
just in case you'd like to give it a try, I've attached it. Usage is
basically:
cat message | ifilepp | ifile -this -that -other
Don't bother reporting bugs; I'm still hacking.
--David Bushong
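
For readers without the attachment, here is a rough idea of what a filter of this kind can look like. This is not the attached ifilepp script (that one is Perl and is not reproduced here); it is an illustrative Python stand-in that only strips HTML tags, comments, and script/style bodies. The names TagStripper, strip_html, and filter_stream are made up for this sketch.

```python
import sys
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Keeps text content; drops tags, comments, and script/style bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Comments never reach this method, so text split by an empty
        # comment is joined back together in the output.
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_html(text):
    """Return text with markup removed (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def filter_stream(src=sys.stdin, dst=sys.stdout):
    """Read everything from src, write the stripped text to dst."""
    dst.write(strip_html(src.read()))
```

Saved as a script that ends with a call to filter_stream(), it would sit in the same position in the pipeline as ifilepp above. Note that it treats the whole input as HTML, headers included, which a real preprocessor would probably want to avoid.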
On Sun, Jul 06, 2003 at 10:21:26PM -0400, Jonadab the Unsightly One wrote:
>
> ...
>
> Perhaps a plugin architecture is in order -- ifile could parse the
> message into sections, each section having a given content-type and
> encoding, and then for each section see if there is a preprocessor
> plugin installed for that encoding (if so use it) and content-type (if
> so, use it) before proceeding.
>
> By "plugin" here I don't mean necessarily a dynamic library; a call to
> an external program could work if the interface were well-defined.
> Frankly, the interface could be as simple as ifile passing the raw
> data on standard input to the preprocessor and using its standard
> output as the decoded/preprocessed content. That might be considered
> inefficient, but it would work, and it would establish a low-bar entry
> level for people writing preprocessor plugins, and the performance hit
> would only be taken when the preprocessors were being used,
> presumably. What preprocessor command (if any) to use for various
> encodings and types of content could just be specified in the ifile
> configuration.
>
> Am I making any sense?
Attachment: ifilepp (Text document)
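
The stdin/stdout plugin interface proposed in the quoted message can be sketched roughly as follows. Everything here is an assumption for illustration: ifile has no PREPROCESSORS setting that this sketch knows of, the function names are invented, and the text/html command is only a crude placeholder so the example runs on its own. Transfer encodings (base64, quoted-printable) are decoded by Python's email package rather than by a separate per-encoding plugin, which is a simplification of the proposal.

```python
import subprocess
import sys
from email import message_from_string

# HYPOTHETICAL configuration: content-type -> external filter command.
# The placeholder below is a naive regex tag stripper; in practice this
# would name whatever stdin-to-stdout preprocessor the user installed.
PREPROCESSORS = {
    "text/html": [
        sys.executable, "-c",
        "import re, sys; "
        "sys.stdout.write(re.sub(r'<[^>]*>', '', sys.stdin.read()))",
    ],
}


def run_filter(command, data):
    """Pass data to the command on stdin and return its stdout."""
    result = subprocess.run(
        command, input=data, capture_output=True, text=True, check=True
    )
    return result.stdout


def preprocess_message(raw, preprocessors=PREPROCESSORS):
    """Split a message into sections and filter each by content-type."""
    msg = message_from_string(raw)
    sections = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        payload = part.get_payload(decode=True)  # undo base64 / QP
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        command = preprocessors.get(part.get_content_type())
        # Sections with no configured plugin pass through untouched,
        # so the subprocess cost is paid only where a plugin is used.
        sections.append(run_filter(command, text) if command else text)
    return "\n".join(sections)
```

One process is spawned per matching section, which is the inefficiency the proposal concedes; the upside is that a plugin can be written in any language that can read stdin and write stdout.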