[Ifile-discuss] html tag stripping
From: David Bushong
Subject: [Ifile-discuss] html tag stripping
Date: Wed, 25 Jun 2003 11:48:05 -0700
User-agent: Mutt/1.4i
Hello, all.
I've noticed an increasing number of spam messages making it through ifile. Some of
them I can't think of a good answer for (the ones with nothing but a
MIME-attached image and link); these are more the domain of something like
SpamAssassin. However, there's another type that I think ifile could do
a very good job on. Here's a sample from the body of a recent mail:
Yo<kc34sma21py2>uve rea<khuyowp1wuizl>d about them in the
P<ks4nj3w258mkq1>apers....
(If you're reading this list in HTML, try turning it off.) Basically, these
bogus tags split words into fragments that ifile has never seen, which
completely ruins its effectiveness. However, a simple addition to the
word tokenizer to skip anything between matched <>'s would completely avoid
this problem (as well as stop making "font", "color", etc. my most popular
words, spam or otherwise).
Of course this should be optional, but I think it would be a great addition,
as adding a preprocessor to handle this seems like a horrible duplication of
effort as ifile's already tokenizing...
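
To make the idea concrete, here is a rough sketch in Python. It is not ifile's actual tokenizer; the names `tokenize`, `strip_tags`, `TAG_RE` and `WORD_RE` are made up for illustration, and the word pattern is only a guess at what a tokenizer might accept. It drops anything between a matched `<` and `>` before splitting into words, and the behaviour is switchable so it stays optional.

```python
import re

# Hypothetical pattern: "<", then any run of non-bracket characters, then ">".
# A lone "<" with no closing ">" is left untouched.
TAG_RE = re.compile(r"<[^<>]*>")

# Hypothetical word pattern: runs of letters, digits and apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def tokenize(text, strip_tags=True):
    """Split text into lowercase word tokens.

    With strip_tags=True, everything between matched <>'s is removed
    first, so the fragments on either side of a bogus tag are rejoined.
    """
    if strip_tags:
        text = TAG_RE.sub("", text)
    return [word.lower() for word in WORD_RE.findall(text)]


sample = ("Yo<kc34sma21py2>uve rea<khuyowp1wuizl>d about them in the\n"
          "P<ks4nj3w258mkq1>apers....")

print(tokenize(sample))                    # tags skipped
print(tokenize(sample, strip_tags=False))  # tags tokenized like any other text
```

One caveat with this sketch: removing a tag outright also glues together real words that are separated only by markup (for example `foo<br>bar`), so a real implementation might want to treat some tags as word breaks instead.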
Thoughts?
--David Bushong