


From: Jacob Bachmeyer
Subject: [Savannah-help-public] [sr #106304] Bug spam from logged in spammers?
Date: Fri, 05 Feb 2010 04:16:24 +0000
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.22) Gecko/20090807 MultiZilla/1.8.3.4e SeaMonkey/1.1.17 Mnenhy/0.7.6.0

Follow-up Comment #39, sr #106304 (project administration):

I haven't run across the new spam yet.  Have we attracted more creative
spammers or are they still doing many lines of "<URL> GarbageGarbageGarbage"?

More generally, how difficult would it be to implement a
"hold-for-moderation" mechanism?  Something to allow more aggressive content
filtering without risking the loss of legitimate comments?  Ideally, it would
be configurable per-project or even per-item-that-takes-user-submissions.  A
quiet project that isn't getting spammed could bypass moderation entirely and
let through any post not blocked by site filters (like the TextCHA), while a
heavily attacked, highly-visible project could define its own filter rules,
possibly even on a per-communication-tool basis, and outright block posts
that look too much like spam.
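
To make the idea concrete, here is a rough sketch of how such a per-project
policy might route an incoming post.  The names, settings, and thresholds are
purely illustrative; nothing like this exists in Savane today:

    # Illustrative only -- per-project moderation policy routing a new post.
    from enum import Enum

    class Action(Enum):
        PUBLISH = "publish"   # appears immediately
        HOLD = "hold"         # queued for a moderator to review
        REJECT = "reject"     # looks too much like spam; dropped

    def route_post(spam_score, project):
        """Route a post using two hypothetical per-project settings:
        moderate_threshold -- score at or above which posts are held,
        reject_threshold   -- score at or above which posts are dropped.
        A quiet project sets moderate_threshold high enough to bypass
        moderation; a heavily attacked one can set it to zero."""
        if spam_score >= project.reject_threshold:
            return Action.REJECT
        if spam_score >= project.moderate_threshold:
            return Action.HOLD
        return Action.PUBLISH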

It could actually be as simple as having a means for the system to assign a
non-zero spam score to a new post when certain conditions are met, and
allowing users to vote posts as "not-spam".  To prevent spammers from gaming
the system, keep a log of votes (to allow tracing spammer accounts used to
abuse the voting system) and give each user two votes per time interval
(choose day, week, hour, whatever, ideally based on statistics about how
often real users vote posts as "spam" now).  The first vote in an interval
counts in full, but each subsequent vote counts half of the previous one;
after an interval, both votes are restored.  Project admins would need the
ability to zero out a post's spam score (assuming that a project admin can be
trusted not to want their own project spammed), and perhaps tracker admins
should be given some similar power.
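
A rough sketch of that diminishing vote weight, assuming a one-day interval
and in-memory data structures purely for illustration (the real thing would
live in the database, and the interval would come from the vote statistics
mentioned above):

    # Illustrative only -- diminishing weight for repeated votes.
    import time
    from collections import defaultdict

    INTERVAL = 24 * 3600              # one day; pick from real vote statistics
    vote_log = []                     # permanent log, for tracing abusive accounts
    recent_votes = defaultdict(list)  # user -> timestamps of votes in the interval

    def cast_vote(user, now=None):
        """Record a vote and return its weight: the first vote in an
        interval counts in full, each subsequent vote counts half of
        the previous one (1, 0.5, 0.25, ...).  Old votes fall out of
        the window after INTERVAL, restoring full weight."""
        now = time.time() if now is None else now
        recent = [t for t in recent_votes[user] if now - t < INTERVAL]
        weight = 0.5 ** len(recent)
        recent.append(now)
        recent_votes[user] = recent
        vote_log.append((user, now, weight))
        return weight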

Are the spams varying much, or are there correlations between them?  If the
spammers are just posting the same thing over and over, perhaps a circular
buffer (in some sense) of the most recent posts site-wide could be kept, and
incoming posts checked for similarity against those recent posts, with the
probability that a new post is spam (and thus its initial spam score?)
proportional to how closely it resembles them.  This would also have the
advantage that the pseudo-blacklisting expires as posts are replaced in the
buffer; essentially, it would be a time-span-limited originality filter.
Since legitimate posts to trackers in two different projects (or even two
different bugs in the same project) are likely to be fairly different, and
rather unlikely to contain the same URLs, this could alternately take the
form of a rolling URL blacklist, perhaps with the trustworthy domains most
likely to be joe-jobbed (such as debian.org, gnu.org, kernel.org, fsf.org,
etc.) whitelisted to keep spammers from getting (too many) legitimate URLs
into the blacklist.  This should rate-limit spam to a manageable fraction of
posts overall.
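
Roughly, the originality filter plus rolling URL blacklist could look
something like the following; the buffer sizes, the similarity measure, and
the whitelist handling are just assumptions for the sake of illustration:

    # Illustrative only -- time-span-limited originality filter.
    import re
    from collections import deque
    from difflib import SequenceMatcher

    RECENT_POSTS = deque(maxlen=200)   # circular buffer of recent posts, site-wide
    RECENT_URLS = deque(maxlen=500)    # rolling URL blacklist
    WHITELIST = ("debian.org", "gnu.org", "kernel.org", "fsf.org")

    def initial_spam_score(text):
        """Score a new post by how closely it resembles recent posts,
        plus a penalty for re-posting a recently seen URL.  Scores
        expire naturally as old posts and URLs rotate out of the buffers."""
        score = max((SequenceMatcher(None, text, old).ratio()
                     for old in RECENT_POSTS), default=0.0)
        for url in re.findall(r'https?://(\S+)', text):
            host = url.split("/")[0]
            if any(host == d or host.endswith("." + d) for d in WHITELIST):
                continue                  # trusted domains never enter the blacklist
            if url in RECENT_URLS:
                score = max(score, 0.9)   # repeated URL: very likely spam
            RECENT_URLS.append(url)
        RECENT_POSTS.append(text)
        return score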

I like the idea of content-based filters more than CAPTCHAs simply because
spammers have found ways to get other people to solve CAPTCHAs for them (I
believe this is how Gmail's CAPTCHA was "broken").  But if spammers can't
post their garbage over and over, no matter what it is or how many CAPTCHAs
they solve, we can hopefully make spamming Savannah painful enough that the
spammers go elsewhere.

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/support/?106304>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
