
Re: [Savannah-users] Mailing list discarding random messages


From: Bob Proulx
Subject: Re: [Savannah-users] Mailing list discarding random messages
Date: Sat, 25 May 2013 12:01:21 -0600
User-agent: Mutt/1.5.21 (2010-09-15)

Marin Rameša wrote:
> Bob Proulx wrote:
> > Second is a question.  Are the posters subscribed to the mailing 
> > list?
> 
> Yes.

Then unless the moderate bit is set for them, their messages, once
they arrive, would be (or at least should be) sent through Mailman
without delay.

> > Or their address added to the whitelist?  Posters who are subscribed
> > or have the address added to the whitelist will have no screening of
> > any kind (other than that on the eggs incoming mx receiving machine).
> 
> So, it's probable that mx machine filters are too restrictive.

Maybe.  I don't know.  We will probably need to involve sysadmin in
order to resolve this problem.

Are the messages very large?  Very large messages will be rejected at
SMTP time.

> > If you tell me the message-id of a message that was discarded I
> > can find it in the logs on lists.gnu.org and determine something
> > about the reason for the deletion, depending upon what is logged
> > there.
> 
> The one that failed in delivery and it's not in the archive is:
> <address@hidden>

I find no record of it in the Mailman logs on lists.gnu.org.  I don't
think it was handled by Mailman.

> The successful (just one of the many) is:
> <address@hidden>

And of course that message is in the post log.

  /var/log/mailman/post:May 24 00:10:53 2013 (22522) post to www-hr-lista from address@hidden, size=3227, message-id=<address@hidden>, success
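
For anyone who wants to run the same kind of check, here is a minimal sketch of looking up a message-id in a post log. It assumes every entry follows the layout of the line shown above, optionally prefixed with the file name the way grep prints it. The names parse_post_line and find_message are made up for this illustration and are not part of Mailman.

```python
import re

# Layout assumed from the single log line quoted above:
# [file:]Mon DD HH:MM:SS YYYY (pid) post to LIST from SENDER, size=N, message-id=<ID>, STATUS
POST_LINE = re.compile(
    r"^(?:(?P<file>[^:\s]+):)?"
    r"(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) "
    r"\((?P<pid>\d+)\) post to (?P<list>\S+) from (?P<sender>[^,]+), "
    r"size=(?P<size>\d+), message-id=(?P<msgid><[^>]*>), (?P<status>.+)$"
)


def parse_post_line(line):
    """Return the fields of one post log line as a dict, or None if it does not match."""
    match = POST_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["size"] = int(record["size"])
    return record


def find_message(lines, message_id):
    """Return every parsed entry whose message-id equals message_id."""
    parsed = (parse_post_line(line) for line in lines)
    return [rec for rec in parsed if rec and rec["msgid"] == message_id]


sample = (
    "/var/log/mailman/post:May 24 00:10:53 2013 (22522) post to www-hr-lista "
    "from address@hidden, size=3227, message-id=<address@hidden>, success"
)
record = parse_post_line(sample)
print(record["list"], record["size"], record["status"])
```

An empty result from find_message means only that Mailman logged no post with that message-id, which fits a message that was stopped before it reached Mailman.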

> > Is "lista" a pattern that should be added to that collection?  Or is
> > that a single one-off name unique to that list?
> 
> So, "lista" is Croatian for "list" and www-hr is the name of the 
> translation project - www-hr-lista is, I think, a logical name given to 
> the list created for the translation purposes.

Thanks for educating me.

> I would like to preserve the name, since it's written in the PO files 
> my project produced.

I didn't mean to imply a suggestion to change the name.  I was simply
asking for information purposes.  I wasn't sure if there would be
twenty more names with the same pattern or not.

> It's a name unique to this list.

That's okay.  I would have added it to the list of patterns as a
unique name.  But as seen in the other message a different pattern
match was used that included this naming too.  So all good!

> > I see www-hr-lista in the tracking log.  So at least until
> > recently it was connected to listhelper and that could have been
> > misclassifying non-english messages.  So I assume it was connected
> > to listhelper and that you have just removed it.  That is a good
> > action.
> 
> Yes, I did. I will see what happens next. I did not receive a complaint 
> since that change.

Unfortunately there isn't an easy way at the moment to handle each
language independently.  Most of the mailing lists are English
language lists and so English is the default language configured for
SpamAssassin and its embedded Bayes classifier.  Currently all of the
mailing lists share a single Bayes token database.  This is generally
good because spammers tend to spew messages across all of the mailing
lists, so the shared database learns more than per-list databases
would learn individually.

But if 800 mailing lists are English then most non-English messages
there will be learned as spam.  If 50 lists are non-English then the
Bayes classifier will be badly skewed for them and has no hope of
performing reasonably.

If a single language has only one or two mailing lists then there is
such a large amount of spam that the Bayes classifier doesn't get a
good non-spam sample base and eventually learns (incorrectly) that all
email is spam.  So even if individual lists were separated, the large
difference in the number of spam and non-spam messages is problematic.
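
The two effects described above can be sketched with a toy token classifier. This is a simplified naive Bayes written only for illustration, not SpamAssassin's actual algorithm or database format. The class name TokenDB, the sample messages, and the message counts are all invented for the example.

```python
import math
from collections import Counter


class TokenDB:
    """Toy Bayes token database: counts messages per class that contain each token."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label, copies=1):
        """Learn `copies` identical messages with the given label ("spam" or "ham")."""
        self.messages[label] += copies
        for token in set(text.lower().split()):
            self.tokens[label][token] += copies

    def spam_probability(self, text):
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        # Prior log-odds from the class balance, with add-one smoothing.
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for token in set(text.lower().split()):
            in_spam = self.tokens["spam"][token]
            in_ham = self.tokens["ham"][token]
            if in_spam == 0 and in_ham == 0:
                continue  # tokens never seen in training carry no evidence
            p_spam = (in_spam + 1) / (n_spam + 2)
            p_ham = (in_ham + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


croatian_spam = "jeftine tablete za vas ponuda je na akciji"

# Effect 1: one shared database trained mostly by English lists, where the
# only Croatian text ever seen was spam.
shared = TokenDB()
shared.learn("please review the attached patch", "ham", copies=40)
shared.learn("cheap pills buy now", "spam", copies=10)
shared.learn(croatian_spam, "spam", copies=10)
shared_english = shared.spam_probability("please review the patch")
shared_croatian = shared.spam_probability("prijevod je spreman za pregled na listi")

# Effect 2: a separate database for one small list, with far more spam than
# legitimate mail to learn from.
separate = TokenDB()
separate.learn(croatian_spam, "spam", copies=200)
separate.learn("prijevod je spreman za pregled na listi", "ham", copies=2)
separate_new = separate.spam_probability(
    "molim provjerite je li dokument za objavu na webu"
)

print(shared_english, shared_croatian, separate_new)
```

In both cases the legitimate Croatian post scores as spam for the same underlying reason: the only evidence the classifier has comes from common words such as "je", "za" and "na", which it has seen almost exclusively in spam.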

And so for non-English language mailing lists we don't have any good
anti-spam "listhelper".  The best I can suggest is to watch over them
manually.  But I know that can be very tedious and take a lot of
effort.  Perhaps we can improve this situation if someone is
interested in working through the problems.

Bob


