Dougal Campbell's geek ramblings

WordPress, web development, and world domination.

Poisoning the well

Overall, the volume of spam attempts on my server has been down lately. Oh, I still get a steady stream: I delete over 100 comment spams (caught by my filters) each day. But I’ve seen fewer of the massive, server-squashing spam runs that hammer my web service with too many simultaneous connections, blocking out legitimate users.

On the other hand, I’m seeing a lot more attempts by spammers to poison the well. What I mean by that is that they are submitting bogus comments, full of non-spammy (but more-or-less random) content, and links to legitimate web sites. For example:

Name: Adam Baumann

Hi. Just letting you know that I enjoyed your site. when Soldier Double Game Lose: , to Bet Opponents you should be very Faithful Big Gnome becomes Industrious Plane in final , Superb Opponents becomes Superb Soldier in final Faithful is feature of White Circle

The comment is obviously gibberish, right? And the links are all to perfectly normal — in fact, popular — sites. You might wonder why a spammer would bother posting it. The idea is to poison the well of any sites which use Bayesian techniques to classify content as spam or not. By tricking sites into classifying “good” content as “spam”, they (theoretically) can reduce the effectiveness of the spam filters.

With enough poisoning, your spam filter may start getting false-positives, which are legitimate messages that have incorrectly been tagged as spam. And if you get enough false-positives, you’ll lose faith in your spam filter and disable it. At least, that’s what the spammers are trying to accomplish.
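
To make the mechanism concrete, here is a minimal sketch of a word-counting Bayesian filter and what happens to it when gibberish built from ordinary words keeps getting marked as spam. Everything in it (the TinyBayesFilter class, the training sentences, the number of poison comments) is made up for illustration and is not taken from any real plugin. Real filters are far more careful about tokenizing and weighting.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes spam filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam messages containing it
        self.ham_words = Counter()   # word -> number of good messages containing it
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, text, is_spam):
        # Count each distinct word once per message.
        words = set(text.lower().split())
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.ham_words.update(words)
            self.ham_msgs += 1

    def spam_score(self, text):
        """Estimate P(spam | words) using add-one smoothing."""
        total = self.spam_msgs + self.ham_msgs
        log_spam = math.log((self.spam_msgs + 1) / (total + 2))
        log_ham = math.log((self.ham_msgs + 1) / (total + 2))
        for word in set(text.lower().split()):
            log_spam += math.log((self.spam_words[word] + 1) / (self.spam_msgs + 2))
            log_ham += math.log((self.ham_words[word] + 1) / (self.ham_msgs + 2))
        return 1 / (1 + math.exp(log_ham - log_spam))


f = TinyBayesFilter()
for _ in range(5):
    f.train("buy cheap pills online now", is_spam=True)
    f.train("thanks for the great wordpress plugin tutorial", is_spam=False)

legit = "great tutorial thanks"
before = f.spam_score(legit)

# Poisoning: harmless-looking words arrive in junk comments,
# and every one of those comments gets trained as spam.
for _ in range(20):
    f.train("great faithful soldier tutorial thanks industrious", is_spam=True)

after = f.spam_score(legit)
print(f"legit comment score before poisoning: {before:.3f}, after: {after:.3f}")
```

In this toy setup the same friendly comment goes from a very low spam score to one above 0.5, which is exactly the kind of false positive described above.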

Will their plan work? I guess that depends on your particular spam filters. I’m betting that systems like Akismet, which collect data from a wide variety of sources, will probably be able to defend against Bayes poisoning. How? Well, there’s this thing called an IP address. Even though the spammers submit their garbage via an army of anonymous proxy servers and zombie machines, they still only have access to a finite number of hosts, a limited number of IP addresses. It won’t take long for those IPs to be statistically classified as sources of spam. An IP like will be flagged as a spam indicator far sooner than the words “Industrious” and “Soldier”.
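
As a rough sketch of that idea (and only a guess at the general approach, not a description of how Akismet actually works), a shared service could keep a per-IP tally of spam reports from many blogs and flag an address once enough reports agree. The IpReputation class, its thresholds, and the addresses below are all hypothetical; the addresses come from ranges reserved for documentation.

```python
from collections import defaultdict


class IpReputation:
    """Toy per-IP spam tally (hypothetical; thresholds are arbitrary)."""

    def __init__(self, min_reports=5, spam_ratio=0.9):
        self.min_reports = min_reports  # reports needed before judging an IP
        self.spam_ratio = spam_ratio    # fraction of reports that must be spam
        self.counts = defaultdict(lambda: [0, 0])  # ip -> [spam reports, total reports]

    def report(self, ip, is_spam):
        entry = self.counts[ip]
        entry[0] += int(is_spam)
        entry[1] += 1

    def is_spam_source(self, ip):
        # Unknown IPs fall back to (0, 0) and are never flagged.
        spam, total = self.counts.get(ip, (0, 0))
        return total >= self.min_reports and spam / total >= self.spam_ratio


rep = IpReputation()

# Six different blogs report the same proxy for posting poison comments.
for _ in range(6):
    rep.report("192.0.2.10", is_spam=True)

# One ordinary visitor leaves one ordinary comment.
rep.report("198.51.100.7", is_spam=False)

flagged = rep.is_spam_source("192.0.2.10")
clean = rep.is_spam_source("198.51.100.7")
print(flagged, clean)
```

The point of the sketch is that the tally depends only on where the comments come from, so random words in the comment body do nothing to hide a busy proxy.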

So once again I say, thank you, spammers. We’re learning more about you every day.

About Dougal Campbell

Dougal is a web developer, and a "Developer Emeritus" for the WordPress platform. When he's not coding PHP, Perl, CSS, JavaScript, or whatnot, he spends time with his wife, three children, a dog, and a cat in their Atlanta area home.
This entry was posted in Blogs, Spam.

16 Responses to Poisoning the well

  1. CT says:

    I’ve been getting a minor flood of the poisoning-the-well variety lately too (in fact, just deleted about a dozen of them out of my moderation queue). Not a big deal so far.

    What I find interesting is that they’re the first wave of comment spam, as opposed to trackback spam, I’ve gotten in a dog’s age. I was under the impression that the proliferation of comment moderation/filtering had done in comment-delivered spam in favor of track/pingbacks. In fact, I shut off my comment auto-close plugin because of this, and sure enough, I didn’t get any spam bits via comments (at least, none that weren’t automatically blocked by my existing blacklist). This might represent a large-scale return to the comment form.

  2. logtar says:

    What works the best for me is keeping my spam word list fresh… I do get an occasional false positive. Spammers are just the bottom feeders; they will not be exterminated any time soon.

  3. Pingback:

  4. Pingback: Solarvoid » We should be allowed to hunt spammers

  5. OMEITOR says:

    I’m getting the same kind of random spam posts, mainly linking to sites like digg, or engadget.

    But then again, I use Akismet, and it’s catching all the spam, and so far no false positives!

    Good luck..

  6. Tom says:

    Ah I’ve been wondering what the hell those sort of comments were.

  7. Rick Beckman says:

    So that’s why I was getting spam pointing to websites I sometimes use! Akismet still caught ’em, however. 🙂

  8. I was also plagued by these style comments and showed how wide the problem is in a post a few weeks ago. My initial solution was to block the offending User-Agents at Apache server level – a less than ideal solution.

    A friend of mine wrote an interesting solution to the problem using secured time-based-tokens and I have incorporated it into this freshly baked SpamKit Plugin for WordPress.

    Maybe it can help?

  9. Pingback: Will’s Blog » Blog Archive » Akismet Rocks

  10. Mohsin says:

    Hey Dear!
    I am new to WordPress, i just started a week ago
    but not sure how to use this blog, when i see my
    blog’s left bar, with links called somthing blogrolls
    and one of them i clicked and now i am here, but here i
    was seeing very bad aspect of online marketing that is Spammers
    but hope you make them lesson!, i am here to as you how to use it
    means my blog i am also not familiar with
    terms used there,,blogroll,ping,rss,,,i hope you will help me in this
    regard,,,,,wish you good luck!, from MOhsin Rasool MoreLee webmaster

  11. Pingback: Blog > Around the web

  12. Pingback: What makes you happy ? » Gatekeeper in.

  13. I remember when I first got caught up in a “poisoning the well” attempt. I was running Spam Karma v1 and some clever spammer began flooding me with spam comments that linked to “.com”. Sure enough, Spam Karma eventually added “.com” to its blacklist, and it wasn’t long before every incoming comment was eaten by Spam Karma. I had to flush my entire Spam Karma blacklist just to get rid of that one false entry. So, if you run an “intelligent” spam filter, such as Spam Karma or Akismet, keep a very close eye on your logs.

  14. Ross Easton says:

    Getting a load of the poison too! Interesting.

  15. Pingback: ShadowLife » Blog Archive » Comment Spam

  16. I just saw your post about this. I wrote one the other day, dubbing the practice “Whitelist Spamming.”

    The real problem comes into play when you think about the big picture. It’s not just about whether spam gets to YOUR blog, it’s about whether your domain gets blacklisted by other blogs and e-mail servers.
