RE: Is this a good idea?

spamass-milt-list

From:	Nate Schindler
Subject:	RE: Is this a good idea?
Date:	Tue, 28 Sep 2004 10:34:00 -0700

Here's what we do:

There is a RedHat 7.3 machine at our perimeter running MessageWall 1.0.8, 
Sendmail 8.13, SpamAssassin 3.0, SpamAss-Milter 0.2.0, and MySQL 4.0.20.

MessageWall (an SMTP proxy) isn't actively developed anymore.  It was intended 
as a spam filter originally, but it was cumbersome to configure, and 
SpamAssassin is far more comprehensive.  However, MessageWall still makes an 
awesome SMTP validation tool, and it uses OAV/ClamAV antivirus signatures.  
Since the SMTP protocol isn't changing all the time, it's not such an issue 
that MessageWall isn't under active development.  It does what it does well 
enough, and the developer is still happy to answer questions.  MessageWall 
alone stops a lot of spam and viruses simply because the sending machine 
doesn't speak proper SMTP - virus programmers and spammers don't always bother 
to follow the RFCs.  MessageWall is also a pretty good security layer, because 
it's not a very well-known application.  Lots of exploits exist for sendwhale.  
MessageWall's good about knocking those down.

MessageWall was configured as our only spam filter at one point, but once we 
got SpamAssassin, MessageWall was reconfigured to only tag messages for the 
most part.  Any incoming mail that claims to come FROM our domain, anything 
containing an executable or a virus that has a signature in the OAV/ClamAV 
definitions, or anything that doesn't follow the RFCs - is rejected.  Otherwise 
the message is scored based on the old rules we had, and passed along to 
Sendmail.
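
The accept/reject policy described above can be summarized in a small sketch. This is not MessageWall's configuration format. The `Message` fields, the `triage` function, and the `example.com` domain are hypothetical names invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # All fields are illustrative; MessageWall does not expose this structure.
    sender_domain: str
    has_executable: bool = False
    virus_signature_match: bool = False
    rfc_compliant: bool = True
    legacy_score: int = 0  # score assigned by the old MessageWall rules


def triage(msg: Message, our_domain: str = "example.com") -> str:
    """Return "reject" or "tag", following the policy in the text."""
    if msg.sender_domain.lower() == our_domain.lower():
        return "reject"  # inbound mail claiming to come from our own domain
    if msg.has_executable or msg.virus_signature_match:
        return "reject"  # executable attachment or known virus signature
    if not msg.rfc_compliant:
        return "reject"  # sender does not speak proper SMTP
    return "tag"  # keep the legacy score and pass the message to Sendmail
```

Anything that survives `triage` keeps its legacy score so that a later stage can take it into account.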

Sendmail is using a fairly generic configuration.  There are no local mail 
users on that machine, except for a spamtrap account that feeds mail directly 
back into 'sa-learn --spam'.  Sendmail's main job is to call SpamAss-Milter, 
and relay messages between the internet and our Exchange server.  BTW, the 
spamtrap e-mail address is hidden in our website.  I'm sure no human eyes have 
ever seen it, but it still gets mail now and then. ;)

SpamAss-Milter is configured to block mail when SA tells it to.  It's 
configured to pass the mail alias (what's before the @ sign) as the user to 
spamc so that custom thresholds can be looked up by spamd.
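
As a rough illustration of passing the alias along, the sketch below derives the part before the @ sign and builds a `spamc -u <user>` command line. The `spamc_command` helper is hypothetical, and lowercasing the alias is an assumption made here so that lookups are case-insensitive; the milter's own handling may differ.

```python
def spamc_command(recipient: str) -> list[str]:
    """Build a spamc invocation keyed on the part before the @ sign."""
    # Strip whitespace and the angle brackets used in SMTP envelopes.
    address = recipient.strip().strip("<>")
    # Lowercasing is an assumption of this sketch, not documented behavior.
    alias = address.split("@", 1)[0].lower()
    return ["spamc", "-u", alias]
```

For example, `spamc_command("<CEO@example.com>")` yields a command that asks spamd to apply the preferences stored for `ceo`.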

SpamAssassin is set up to take the alias from SpamAss-Milter, and check it 
against a MySQL userpref table to see if there are any custom rules for this 
particular user.  Our CEO, for example, wants no mail sent to him blocked... so 
in the userpref table, he's got a "required_hits" entry of 100.  This is 
convenient, because SpamAssassin and SpamAss-Milter still properly tag the 
message with X-Spam-Level.  The CEO has Outlook configured to move messages 
with more than 5 stars to a Spam folder in his mailbox.  So, for people like 
him, it still separates spam from ham.  For everybody else, it just rejects the 
spam.
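
The per-user threshold lookup can be sketched roughly as follows. This uses an in-memory SQLite database as a stand-in for MySQL so the example is self-contained. The table layout (username, preference, value) and the `@GLOBAL` site-wide user follow SpamAssassin's stock SQL preferences schema as far as I know. The `required_hits` and `verdict` functions are hypothetical helpers, and the real lookup is done by spamd, not by code like this.

```python
import sqlite3

GLOBAL_USER = "@GLOBAL"  # conventional username for site-wide preferences


def required_hits(db: sqlite3.Connection, user: str, fallback: float = 5.0) -> float:
    """Look up a per-user threshold, then the global one, then a fallback."""
    for name in (user, GLOBAL_USER):
        row = db.execute(
            "SELECT value FROM userpref WHERE username = ? AND preference = ?",
            (name, "required_hits"),
        ).fetchone()
        if row is not None:
            return float(row[0])
    return fallback


def verdict(score: float, threshold: float) -> str:
    """Reject at or above the threshold; otherwise deliver (still tagged)."""
    return "reject" if score >= threshold else "deliver"


# Demo data: only the CEO has a custom entry, as in the example in the text.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)")
db.execute("INSERT INTO userpref VALUES ('ceo', 'required_hits', '100')")
```

With this data, a message scoring 12.3 is delivered to `ceo` (threshold 100) but rejected for a user with no entry (fallback 5.0).
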
I also have custom rules defined for SpamAssassin to read the MessageWall 
score, and adjust its own score according to MessageWall's suggestion.

As far as what you were saying about copying spam and ham to separate mailboxes 
for learning purposes, the bayes_auto_learn option of SpamAssassin facilitates 
this.  The only problem with it is that you don't have the original messages 
which trained the database.... however the concept of how it works is the same 
as you described - exceptionally spammy messages are automatically learned as 
spam, and exceptionally hammy messages are learned as ham.  The scores used to 
make the decision of whether or not SpamAssassin should learn a message are 
configurable.
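
The auto-learn decision amounts to a pair of score cut-offs. The sketch below is a simplification: the `auto_learn_as` function is hypothetical, and SpamAssassin applies further conditions that are omitted here. The default values are the ones SpamAssassin documents for `bayes_auto_learn_threshold_spam` (12.0) and `bayes_auto_learn_threshold_nonspam` (0.1).

```python
from typing import Optional


def auto_learn_as(
    score: float,
    spam_threshold: float = 12.0,
    nonspam_threshold: float = 0.1,
) -> Optional[str]:
    """Return "spam", "ham", or None when the score is not extreme enough."""
    if score >= spam_threshold:
        return "spam"  # exceptionally spammy: learn as spam
    if score < nonspam_threshold:
        return "ham"  # exceptionally hammy: learn as ham
    return None  # in between: do not train on this message
```

Messages in the middle band are the ones where manual training (for example with sa-learn) adds the most information.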

If the message gets this far without being rejected, it's forwarded on to the 
Exchange server where Symantec Antivirus has its way with it.

After this, final delivery takes place.

In Exchange, I have a couple of public folders set up - Spam and Ham.  Users know 
that if they receive a false negative, they can copy it to the Spam folder, and 
I use it to train the filter periodically.  The few users who have higher 
thresholds (like the CEO) can copy false positives from the Spam folder in 
their mailboxes to the Ham public folder.
Since I got SpamAssassin's Bayes component up and running, people very rarely 
copy things to the public Spam/Ham folders.
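
Periodic training from those folders might look like the sketch below. It assumes the Spam and Ham folders have already been exported to mbox files, which is a step the setup above does not describe. The `sa_learn_command` helper and the file paths are hypothetical; `--spam`, `--ham`, and `--mbox` are standard sa-learn options.

```python
def sa_learn_command(kind: str, mbox_path: str) -> list[str]:
    """Build an sa-learn command line for one exported mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", "--mbox", mbox_path]


# Hypothetical export locations; adjust to wherever the folders are dumped.
commands = [
    sa_learn_command("spam", "/var/spool/train/spam.mbox"),
    sa_learn_command("ham", "/var/spool/train/ham.mbox"),
]
```

Each command list can be handed to `subprocess.run(cmd, check=True)` from a scheduled job.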

All of this seems to work quite well for us, and it didn't even take that long 
to configure.  The only area where the system is seriously lacking is 
reporting.  I'm using an MRTG script to generate spam vs. ham graphs, and 
AWStats to generate traffic statistics... but there really aren't any *good* 
SpamAssassin-specific reporting tools that I've come across yet.  But when 
you've got a 99%+ success rate, who cares about reports?  Logging is enough to 
troubleshoot issues.  Reports are just eye-candy.

That's my story, and I'm sticking to it.  If any portion of this chain of stuff 
seems interesting to you, I can show you how it's configured.

Nate

-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of Thomas Cameron
Sent: Tuesday, September 28, 2004 9:43 AM
To: address@hidden
Subject: Is this a good idea?


Howdy -

My scenario:

SA 3.0.0 and spamass-milter CVS on a Fedora Core box.  It's the only host 
listed as an MX for the domain, but the customer actually uses a large ISP 
for e-mail.  We just have the FC2 box relay all mail for the domain to the 
ISP.  That means that when the customer receives the e-mail it has headers 
for the ISP.  I'm not sure if that's important for what I want to try:

I am thinking of a system using spamass-milter and SA 3.0.0 whereby 
exceptionally spammy mail (e.g. a score over 8 or 10) gets silently copied to a 
spam mailbox on the relay server and exceptionally hammy mail (e.g. a score of 
less than 1) gets silently copied to a ham mailbox.  Then sa-learn could be 
run against the mailboxes to feed the Bayes database.
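
A minimal sketch of that idea, assuming the relay sees the `X-Spam-Status` header SpamAssassin adds (with `score=` as in SA 3.0, or `hits=` as in older releases). The function names, the cut-offs of 10 and 1, and the mailbox paths are illustrative assumptions, and the message text is assumed to be ASCII.

```python
import email
import mailbox
import re
from typing import Optional

# Matches "score=15.2" (SA 3.x) or "hits=15.2" (older releases).
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def training_class(raw_message: str, spam_min: float = 10.0,
                   ham_max: float = 1.0) -> Optional[str]:
    """Return "spam" or "ham" for extreme scores, otherwise None."""
    msg = email.message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    if match is None:
        return None  # message was not scanned, or the header is malformed
    score = float(match.group(1))
    if score > spam_min:
        return "spam"
    if score < ham_max:
        return "ham"
    return None


def copy_for_training(raw_message: str, spam_mbox: str, ham_mbox: str) -> Optional[str]:
    """Append a copy of the message to the matching mbox file, if any."""
    kind = training_class(raw_message)
    if kind is not None:
        box = mailbox.mbox(spam_mbox if kind == "spam" else ham_mbox)
        try:
            box.add(raw_message)
        finally:
            box.close()  # flushes the new message to disk
    return kind
```

sa-learn could then be run against the two mbox files on whatever schedule suits the site.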

Is that a good idea?  I kind of think that the downside is that a big 
benefit of Bayes is that false negatives can be fed to it so that they are 
more likely to be caught next time.  In the scenario above I would think 
that it's really just reinforcing what is already known.

Past that, I am really dying to figure out a good way to use Bayes on a 
relay server.  I keep hearing that you need to be able to forward messages 
as an attachment but nothing past that.  Any comments, suggestions, etc. 
gratefully accepted.
-- 
Thomas Cameron, RHCE, CNE, MCSE, MCT



_______________________________________________
Spamass-milt-list mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/spamass-milt-list
