RE: Is this a good idea?
From: Nate Schindler
Subject: RE: Is this a good idea?
Date: Tue, 28 Sep 2004 10:34:00 -0700
Here's what we do:
There is a RedHat 7.3 machine at our perimeter running MessageWall 1.0.8,
Sendmail 8.13, SpamAssassin 3.0, SpamAss-Milter 0.2.0, and MySQL 4.0.20.
MessageWall (an SMTP proxy) isn't actively developed anymore. It was intended
as a spam filter originally, but it was cumbersome to configure, and
SpamAssassin is far more comprehensive. However, MessageWall still makes an
awesome SMTP validation tool, and it uses OAV/ClamAV antivirus signatures.
Since the SMTP protocol isn't changing all the time, it's not such an issue
that MessageWall isn't under active development. It does what it does well
enough, and the developer is still happy to answer questions. MessageWall
alone stops a lot of spam and viruses simply because the sending machine
doesn't speak proper SMTP - virus programmers and spammers don't always bother
to follow the RFCs. MessageWall is also a pretty good security layer, because
it's not a very well-known application. Lots of exploits exist for sendwhale.
MessageWall's good about knocking those down.
MessageWall was configured as our only spam filter at one point, but once we
got SpamAssassin, MessageWall was reconfigured to mostly tag messages rather
than block them. It still rejects three things outright: any incoming mail
that claims to come FROM our domain, anything containing an executable or a
virus that has a signature in the OAV/ClamAV definitions, and anything that
doesn't follow the RFCs. Otherwise the message is scored based on the old
rules we had, and passed along to Sendmail.
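To make the order of those checks concrete, here is a rough sketch in Python.
MessageWall itself is configured through its own files, not Python, so this
only illustrates the policy. The Inbound class, its field names, and the
example.com domain are all made up for the illustration.

```python
from dataclasses import dataclass

OUR_DOMAIN = "example.com"  # assumption: placeholder for the real domain


@dataclass
class Inbound:
    """Hypothetical summary of what the proxy knows about one message."""
    mail_from: str          # envelope sender address
    rfc_compliant: bool     # did the sending machine speak proper SMTP?
    has_executable: bool    # an executable attachment was found
    virus_match: bool       # matched an OAV/ClamAV signature
    legacy_score: int = 0   # score from the old tagging rules


def perimeter_verdict(msg: Inbound) -> str:
    """Return 'reject', or 'tag:<score>' for mail passed on to Sendmail."""
    sender_domain = msg.mail_from.rpartition("@")[2].lower()
    if sender_domain == OUR_DOMAIN:
        return "reject"   # outside mail claiming to be from us
    if msg.has_executable or msg.virus_match:
        return "reject"   # executable or known virus
    if not msg.rfc_compliant:
        return "reject"   # sender doesn't follow the RFCs
    return f"tag:{msg.legacy_score}"
```

Everything that isn't rejected carries its old-rules score along with it, so
a later stage can still make use of it.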
Sendmail is using a fairly generic configuration. There are no local mail
users on that machine, except for a spamtrap account that feeds mail directly
back into 'sa-learn --spam'. Sendmail's main job is to call SpamAss-Milter,
and relay messages between the internet and our Exchange server. BTW, the
spamtrap e-mail address is hidden in our website. I'm sure no human eyes have
ever seen it, but it still gets mail now and then. ;)
SpamAss-Milter is configured to block mail when SA tells it to. It's
configured to pass the mail alias (what's before the @ sign) as the user to
spamc so that custom thresholds can be looked up by spamd.
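As a small illustration of the alias idea (not the milter's actual code), the
sketch below takes a recipient address, keeps what's before the @ sign, and
builds a spamc command line using spamc's -u option. The function names are
made up, and lowercasing the alias is my own assumption.

```python
def alias_of(recipient: str) -> str:
    """Return the part before the @ sign, lowercased, with any <> stripped."""
    addr = recipient.strip().strip("<>")
    return addr.partition("@")[0].lower()


def spamc_argv(recipient: str) -> list[str]:
    """Build a spamc command line that scans as the recipient's alias."""
    return ["spamc", "-u", alias_of(recipient)]
```

For a recipient of <CEO@example.com>, this produces spamc -u ceo, which is the
name spamd would then use when looking up per-user settings.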
SpamAssassin is set up to take the alias from SpamAss-Milter, and check it
against a MySQL userpref table to see if there are any custom rules for this
particular user. Our CEO, for example, wants no mail sent to him blocked... so
in the userpref table, he's got a "required_hits" entry of 100. This is
convenient, because SpamAssassin and SpamAss-Milter still properly tag the
message with X-Spam-Level. The CEO has Outlook configured to move messages
with more than 5 stars to a Spam folder in his mailbox. So, for people like
him, it still separates spam from ham. For everybody else, it just rejects the
spam.
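Here is a minimal sketch of how the per-user threshold and the star tagging
fit together. It uses an in-memory SQLite table as a stand-in for the MySQL
userpref table, with only the three columns the lookup needs. The default
threshold of 5.0 and the function names are assumptions for the example, and
the real lookup is done by spamd, not by code like this.

```python
import sqlite3

DEFAULT_THRESHOLD = 5.0  # assumption: site-wide threshold for everybody else


def make_prefs() -> sqlite3.Connection:
    """Create a stand-in userpref table holding the CEO's override."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)"
    )
    db.execute("INSERT INTO userpref VALUES ('ceo', 'required_hits', '100')")
    return db


def threshold_for(db: sqlite3.Connection, user: str) -> float:
    """Look up a user's required_hits, falling back to the default."""
    row = db.execute(
        "SELECT value FROM userpref "
        "WHERE username = ? AND preference = 'required_hits'",
        (user,),
    ).fetchone()
    return float(row[0]) if row else DEFAULT_THRESHOLD


def spam_level(score: float) -> str:
    """One star per whole point of score, as in X-Spam-Level (simplified)."""
    return "*" * max(0, int(score))


def handle(db: sqlite3.Connection, user: str, score: float) -> tuple[str, str]:
    """Return (action, stars): reject at or above the user's threshold."""
    action = "reject" if score >= threshold_for(db, user) else "deliver"
    return action, spam_level(score)
```

With this table, a message scoring 12.3 is rejected for an ordinary user but
delivered to ceo with twelve stars, which is what lets the Outlook rule sort
it into his Spam folder.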
I also have custom rules defined for SpamAssassin to read the MessageWall
score, and adjust its own score according to MessageWall's suggestion.
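The real thing is done with SpamAssassin rules, but the idea can be sketched
in Python as follows. The header name, its format, and the score bands below
are placeholders I picked for the illustration, not our actual values, so
check what your MessageWall build writes before borrowing any of it.

```python
import re

# Assumption: header name and "<number> (<host>)" format are placeholders.
HEADER = "X-MessageWall-Score"

# Hypothetical bands: (minimum MessageWall score, points to add), highest first.
BANDS = [(10, 3.0), (5, 1.5), (1, 0.5)]


def messagewall_bonus(headers: dict[str, str]) -> float:
    """Return extra points to add based on the upstream MessageWall score."""
    match = re.match(r"\s*(\d+)", headers.get(HEADER, ""))
    if not match:
        return 0.0  # header missing or unparsable: no adjustment
    mw_score = int(match.group(1))
    for floor, bonus in BANDS:
        if mw_score >= floor:
            return bonus
    return 0.0
```

Using a few coarse bands instead of adding the raw number keeps one filter's
scale from swamping the other's.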
As far as what you were saying about copying spam and ham to separate mailboxes
for learning purposes, the bayes_auto_learn option of SpamAssassin facilitates
this. The only problem with it is that you don't have the original messages
which trained the database.... however the concept of how it works is the same
as you described - exceptionally spammy messages are automatically learned as
spam, and exceptionally hammy messages are learned as ham. The scores used to
make the decision of whether or not SpamAssassin should learn a message are
configurable.
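A simplified sketch of that decision, using the default values of the
bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam
options. The real auto-learner applies extra conditions that are left out
here, and the exact handling of a score that lands right on a threshold is my
assumption.

```python
NONSPAM_THRESHOLD = 0.1   # default for bayes_auto_learn_threshold_nonspam
SPAM_THRESHOLD = 12.0     # default for bayes_auto_learn_threshold_spam


def autolearn_decision(score: float) -> str:
    """Return 'spam', 'ham', or 'no' (simplified auto-learn decision)."""
    if score >= SPAM_THRESHOLD:
        return "spam"   # exceptionally spammy: learn as spam
    if score < NONSPAM_THRESHOLD:
        return "ham"    # exceptionally hammy: learn as ham
    return "no"         # in between: too uncertain to learn from
```

Anything in the wide middle band is left alone, which is why hand-fed false
negatives and false positives are still worth having.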
If the message gets this far without being rejected, it's forwarded on to the
Exchange server where Symantec Antivirus has its way with it.
After this, final delivery takes place.
In Exchange, I have a couple public folders set up - Spam, and Ham. Users know
that if they receive a false negative, they can copy it to the Spam folder, and
I use it to train the filter periodically. The few users who have higher
thresholds (like the CEO) can copy false positives from the Spam folder in
their mailboxes to the Ham public folder.
Since SpamAssassin's Bayes component has been up and running, people have very
rarely copied anything to the public Spam/Ham folders.
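The periodic training step could look roughly like this, assuming the two
public folders have already been exported to mbox files (how you get them out
of Exchange is a separate problem, and the paths in the note below are
hypothetical). The helper only builds the sa-learn command line, using
sa-learn's --spam, --ham, and --mbox options.

```python
def sa_learn_argv(kind: str, mbox_path: str) -> list[str]:
    """Build an sa-learn command line for one exported mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", "--mbox", mbox_path]
```

Each list can be handed to subprocess.run(argv, check=True), for example once
for /tmp/public-spam.mbox and once for /tmp/public-ham.mbox.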
All of this seems to work quite well for us, and it didn't even take that long
to configure. The only area where the system is seriously lacking is
reporting. I'm using an MRTG script to generate spam vs. ham graphs, and
AWStats to generate traffic statistics... but there really aren't any *good*
SpamAssassin-specific reporting tools that I've come across yet. But when
you've got a 99%+ success rate, who cares about reports? Logging is enough to
troubleshoot issues. Reports are just eye-candy.
That's my story, and I'm sticking to it. If any portion of this chain of stuff
seems interesting to you, I can show you how it's configured.
Nate
-----Original Message-----
From: address@hidden
[mailto:address@hidden] On Behalf Of Thomas Cameron
Sent: Tuesday, September 28, 2004 9:43 AM
To: address@hidden
Subject: Is this a good idea?
Howdy -
My scenario:
SA 3.0.0 and spamass-milter CVS on a Fedora Core box. It's the only host
listed as an MX for the domain, but the customer actually uses a large ISP
for e-mail. We just have the FC2 box relay all mail for the domain to the
ISP. That means that when the customer receives the e-mail it has headers
for the ISP. I'm not sure if that's important for what I want to try:
I am thinking of a system using spamass-milter and SA 3.0.0 whereby
exceptionally spammy mail (i.e. score over 8 or 10) gets silently copied to a
spam mailbox on the relay server and exceptionally hammy mail (i.e. score of
less than 1) gets silently copied to a ham mailbox. Then sa-learn could be
run against the mailboxes to feed the Bayes database.
Is that a good idea? I kind of think that the downside is that a big
benefit of Bayes is that false negatives can be fed to it so that they are
more likely to be caught next time. In the scenario above I would think
that it's really just reinforcing what is already known.
Past that, I am really dying to figure out a good way to use Bayes on a
relay server. I keep hearing that you need to be able to forward messages
as an attachment but nothing past that. Any comments, suggestions, etc.
gratefully accepted.
--
Thomas Cameron, RHCE, CNE, MCSE, MCT
_______________________________________________
Spamass-milt-list mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/spamass-milt-list