Putting an END to WordPress Trackback, Comment and Registration Spam
Tags: Perl
I run quite a few WordPress blog sites for myself (you’re reading it), my company and for users who wish to have their own blog on the web.
I keep all of these up-to-date with all of the latest versions of WordPress, the latest plugins and any security fixes or updates. Here are a few examples of blog sites I’ve created with WordPress, using some automated tools I’ve written (in Perl of course):
Diabetes Information Resources
Articles, news, reviews and information for the diabetic or caregivers
Acne Treatment Resources and Living With Acne
Acne treatment, support and skin research for teens and adults
Cancer Treatment Information and Resources
A place for cancer patients and caregivers to go for support
(the latter one needs a better theme, I’ll work on that later)
I have already implemented reCAPTCHA for WordPress, Akismet and Bad Behavior. All three of them work very well together without any issues that I’ve seen.
Akismet takes a collaborative approach to combating spam-like comments in your blog. Any comments which contain a high likelihood of being spam are flagged by Akismet and set aside in the quarantine. You can them go back into there and approve/purge those comments as you see fit. According to this blog’s statistics, Akismet has protected my blog from 17,124 spam comments already.
reCAPTCHA helps prevent automated abuse of your site (such as comment spam or bogus registrations) by using a CAPTCHA to ensure that only humans perform certain actions.
reCAPTCHA is very interesting because it actually benefits the community as a whole. When you enter the words presented, you’re actually helping to digitize printed books, by translating words that were OCRd using software to scan actual printed pages, into digital text, to make meaningful sense out of the scanned items.
OCR isn’t a perfect technology and sometimes it makes mistakes. A blurry ‘e’ might be mistaken for a lowercase ‘s’ for example. Human eyes can discern the difference, and this is what reCAPTCHA does. If you want to learn more, you can read more detail about reCAPTCHA on their website.
But this isn’t enough. Spammers are getting smarter and the volume of spammers is increasing at exponential rates.
The nature of Open Source actually hurts us here, because the same tools we use to prevent and block spam, can be downloaded by the spammers, analyzed and their scripts can be modified to circumvent any of the blocking we attempt. These spammers can download the source for Akismet or reCAPTCHA or WordPress and find holes in it to exploit. And that is exactly what they’re doing.
But that only stops people who are using comment forms and are trying to post comments. What about trackback and registration spam?
First, what are these? How are spammers using these to abuse your system or blog?
Trackback Spam (TrackBack plugins at WordPress)
Trackback spam is a technique where individuals or companies abuse the TrackBack feature of a blog to insert spam links on some blogs. Allowing trackbacks allows spammers to actually add content to your pages (in the form of comments).
If you allow trackbacks on your blog, these links will appear on your blog, and direct spiders and other traffic FROM your popular blog site TO their spam or phishing site. Trackbacks do have a positive use, so you can enable them… if you take precautions to protect them accordingly.
One way to do this with WordPress is to rename wp-trackback.php
to something else that these spammer’s automated scripts will not be able to “guess”.
You’ll also have to change the reference to wp-trackback.php
in the following two files:
wp-includes/template-loader.php wp-includes/comment-template.php
Most of the automated trackback spam tools will just hit several thousand websites at a time by attempting to send a POST request to wp-trackback.php
directly. If you rename it, they won’t find that file on your server, and will get a 404 error. If someone uses the proper comment form on your website, they’ll get the right version of your renamed file.
The other option is to just disable trackbacks altogether. You can find this under Settings → Discussion. Simply uncheck “Allow link notifications from other blogs (pingbacks and trackbacks.)” This can also be accomplished within each post by unchecking the “Allow pings” checkbox when you compose or edit your posts.
Another option is to use a plugin to try to thwart or validate trackbacks. I use one called Simple Trackback Validation. It was a simple drop-in plugin, and appears to work very well.
When a trackback is received on your blog, Simple Trackback Validation will:
- Check to see if the IP of the trackback sender is the same as the IP address of the source the trackback URL is referring to.
This reveals almost every spam trackback (more than 99%) since spammers do use automated bots which are not running on the machine.
- Retrieve the web page at the URL included in the trackback. If the webpage doesn’t a link to your blog, the trackback is considered to be spam. Since most trackback spammers do not set up custom web pages linking to the blogs they attack, this simple test will quickly reveal illegitimate trackbacks.
Also, bloggers can be stopped abusing trackback by sending trackbacks with their blog software or webservices without having a link to the post.
The combination of these three techniques will stop almost every fake, false or fraudulent trackback your blog may receive.
Registration Spam (Registration plugins at WordPress)
The last one is the most challenging, and very-recently, the most abused; Registration Spam.
Registration spam is relatively new, but it allows someone to “bomb” your blog with thousands of fake usernames and registration requests, which your system will then dutifully attempt to send out a confirmation email to the address specified.. which in most cases will be fake, causing your machine to receive a bounce message in return.
Spammers are using GMail and Yahoo addresses to do this right now, so you might see hundreds or thousands of new users attempting to sign up for your blog every week, all of them fake.
I searched around for awhile to try to figure out what tools or plugins I could use to try to stop this. I found something called WP-Ban, but it doesn’t actually seem to work at all.
WP-Ban claims to ban users by IP, IP Range, host name and referer url from visiting your WordPress’s blog. It will display a custom ban message when the banned IP, IP range, host name or referer url tries to visit you blog. You can also exclude certain IPs from being banned.
In my experience with it, it does not work at all.
I looked for something that would make adding a “plain text” name to the signup field a mandatory item. This means that instead of jdoe@gmail.com being signed up, they would have to also enter “John Doe” in the Name field of the signup form. I found nothing that did this for me.
But I did stumble on something called ‘CapCC’ in my travels that HAS helped. CapCC is a small captcha plugin that works with either comments, registration or both. Since I already used reCAPTCHA and it was having a positive effect, I decided to use CapCC for just user registrations. Now the incoming users have to enter a small 5-character string before their registration can be processed.
As a result of this, I now allow anonymous people to post comments (moderated, of course). I don’t have to worry about fake users trying to join, abuses of my MTA or other garbage.
Hopefully others will find this useful.