Archive for October, 2007

Counter-attacking the Botnet Counter-Attack On My Servers


As expected, once I started proactively blocking the botnet from my production servers, they decided to launch a counter-attack against me…

But first, let me rewind a bit.

About 2 years ago, I started aggressively looking at my incoming traffic to determine who was hitting me, how frequently, and when. I needed to increase performance and reduce the number of requests from misconfigured spiders and RSS readers.

This analysis revealed hundreds of thousands of requests per week for identical content, arriving every single day from these spiders and misconfigured RSS aggregators (which happens to be most of them).
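For anyone who wants to run the same kind of analysis, here is a minimal sketch. It assumes the access log is in Apache's "combined" format (the post does not say which format was in use), and the sample lines and client addresses are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def count_repeat_fetches(lines):
    """Count how often each (client, path) pair appears in the log."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            hits[(match.group("host"), match.group("path"))] += 1
    return hits

# Made-up sample lines; a real run would iterate over the log file instead.
sample_log = [
    '203.0.113.9 - - [01/Oct/2007:00:00:01 +0000] "GET /feed.xml HTTP/1.1" 200 5120 "-" "ExampleReader/1.0"',
    '203.0.113.9 - - [01/Oct/2007:00:00:31 +0000] "GET /feed.xml HTTP/1.1" 200 5120 "-" "ExampleReader/1.0"',
    '198.51.100.7 - - [01/Oct/2007:00:01:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "ExampleBrowser/2.0"',
]

for (client, path), count in count_repeat_fetches(sample_log).most_common():
    print(count, client, path)
```

Sorting by count puts the worst repeat offenders first, which is usually all you need to decide who to look at more closely.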

Every day, all day.

24 hours a day, 7 days a week. Every week.


Ironically, the bulk of these requests were coming from the .cr domain… Costa Rica. Those requests alone were more than 50% of my total outgoing bandwidth. They were requesting valid resources, valid files, valid data, over and over and over.

So I blocked the entire country using iptables on port 80.

$IPTABLES  -A INPUT -p tcp -m iprange --src-range X.x.0.0-X.x.255.255 -m tcp --dport 80 -j DROP

But there were a lot more coming from China, Russia, Korea, Israel, and other places.

Then I noticed an almost-immediate change in the activity. Now I was being hit with thousands of requests for nonexistent content, all trying to hit my MediaWiki pages, Drupal pages, WordPress pages (all written in PHP, if you notice).

What they were trying to do was generate bogus HTTP_REFERERs for my logs, which would point back to a malicious script that would hijack the machine of anyone who clicked on the link from the web-facing statistics. They were also trying to hijack the wiki pages to include these links, masquerading as “valid” links.

I’ve examined quite a few of the malicious scripts (all written in PHP also), and over time, the scripts have changed. They were originally 100% readable, but are now obfuscated and encoded to prevent casual dissection. Apparently they looked at THEIR logs, and noticed people were looking at the code, and not just visiting the malicious URL that included that link.

The other thing these forged HTTP_REFERER requests do is cause any log analysis package written in PHP to parse the code, thus hijacking the server itself. Lovely.

So I started blocking the IPs originating those requests too.
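One way to pick out those IPs is to look for clients that keep asking for PHP pages that do not exist. This is only a sketch under my own assumptions: the `suspicious_clients` helper, the threshold of 3 and the sample addresses are all hypothetical, and the input is assumed to be already-parsed (client, path, status) tuples.

```python
from collections import Counter

def suspicious_clients(requests, threshold=3):
    """Return clients with at least `threshold` 404 responses for .php paths.

    `requests` is an iterable of (client_ip, path, status) tuples.
    The default threshold is an arbitrary illustrative value.
    """
    misses = Counter(
        ip
        for ip, path, status in requests
        # Ignore the query string when checking the file extension.
        if status == 404 and path.split("?", 1)[0].endswith(".php")
    )
    return sorted(ip for ip, count in misses.items() if count >= threshold)

# Made-up sample data for illustration.
sample_requests = [
    ("198.51.100.7", "/wiki/index.php?title=Foo", 404),
    ("198.51.100.7", "/drupal/index.php", 404),
    ("198.51.100.7", "/wp-login.php", 404),
    ("203.0.113.5", "/wp-login.php", 404),
    ("192.0.2.1", "/index.html", 200),
]

print(suspicious_clients(sample_requests))
```

A single 404 is not treated as hostile here, since real visitors follow stale links too; only repeated misses count.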

When I did that, I noticed another interesting trend. If I blocked 5 unique, malicious IPs in the first hour, 10 uniques would hit me in the next hour. If I blocked those 10 unique IPs, 50 would hit me in the hour after that. The more I blocked, the faster they started coming in. If I left those 50 unique IPs alone for 24 hours, it would remain constant… 50 unique, malicious IPs/hour, never changing.

If I blocked those 50, then 200 more would come in the next hour. The faster I blocked, the more “the botnet” would send my way.

And then they counter-attacked

Over the last 2 days, after I finally blocked over 1,000 unique IPs, they decided to counter-attack, and hit my webservers with HTTP requests constructed to intentionally drop the TCP connection, leaving Apache’s sockets in the CLOSE_WAIT state. I think they were trying to tie up Apache’s listeners so other “valid” users wouldn’t be able to get in.
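To see whether sockets are piling up in CLOSE_WAIT, you can tally connection states on the web port. The sketch below assumes a Linux host, where `/proc/net/tcp` lists IPv4 sockets with a hexadecimal state code (08 is CLOSE_WAIT). The `count_states` helper and the sample rows are illustrative, not taken from my servers.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_states(proc_net_tcp_text, local_port=80):
    """Tally socket states for one local port from /proc/net/tcp text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a socket entry.
        if len(fields) < 4 or ":" not in fields[1]:
            continue
        port = int(fields[1].rsplit(":", 1)[1], 16)  # port is stored in hex
        if port == local_port:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts

# Made-up rows in the /proc/net/tcp layout (port 0x0050 is 80, 0x0016 is 22).
SAMPLE = "\n".join([
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode",
    "   0: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001",
    "   1: 0100007F:0050 0200007F:C350 08 00000000:00000000 00:00000000 00000000    48        0 1002",
    "   2: 0100007F:0050 0300007F:C351 08 00000000:00000000 00:00000000 00000000    48        0 1003",
    "   3: 0100007F:0016 0400007F:C352 01 00000000:00000000 00:00000000 00000000     0        0 1004",
])

print(count_states(SAMPLE))
# On a live Linux box: count_states(open("/proc/net/tcp").read())
```

A CLOSE_WAIT count that keeps climbing while traffic stays flat is the sign to watch for. IPv6 sockets live in `/proc/net/tcp6` and are not covered by this sketch.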

But I’ve already worked around that with some sysctl and kernel tweaks.

And on top of that, I’ve now automated the blocking. Instead of blocking them only on port 80, I block them on all ports, all protocols, automatically.
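The automation itself can be very small. The sketch below is not my production script, just an illustration: the hypothetical `drop_rules` helper validates each address, skips ones that are already blocked, and builds an `iptables` command with no `-p` or `--dport`, so the rule covers every protocol and port. It only builds the commands; actually running them needs root.

```python
import ipaddress

def drop_rules(addresses, already_blocked=()):
    """Build one iptables DROP command per new, valid IPv4 address."""
    seen = {str(ipaddress.IPv4Address(a)) for a in already_blocked}
    commands = []
    for raw in addresses:
        try:
            ip = str(ipaddress.IPv4Address(raw.strip()))
        except ValueError:
            continue  # never pass unvalidated log data to a firewall command
        if ip in seen:
            continue  # avoid piling up duplicate rules
        seen.add(ip)
        # No "-p" and no "--dport": the rule matches all protocols and ports.
        commands.append(["iptables", "-A", "INPUT", "-s", ip, "-j", "DROP"])
    return commands

# Made-up addresses for illustration.
for command in drop_rules(
    ["198.51.100.7", "198.51.100.7", "bogus", " 203.0.113.5 "],
    already_blocked=["203.0.113.5"],
):
    print(" ".join(command))
```

Each command is returned as an argument list rather than one string, so it can be handed to something like `subprocess.run` without going through a shell.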

# iptables-save | wc -l

That’s 1,392 separate, UNIQUE IPs being blocked now on all ports. That number may continue to grow, but it won’t shrink. The more machines they hijack to try to reach my servers, the more I’ll continue to block.
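One caveat if you use the same one-liner: `iptables-save | wc -l` counts every line of output, including table headers, chain policies and the COMMIT marker, so the line count runs a little higher than the number of blocked addresses. A sketch of a stricter count, assuming the usual `-A CHAIN -s ADDRESS ... -j DROP` rule layout (the `blocked_sources` helper and the sample output are made up for illustration):

```python
def blocked_sources(iptables_save_text):
    """Return the set of source addresses that have a DROP rule."""
    sources = set()
    for line in iptables_save_text.splitlines():
        tokens = line.split()
        # Only appended rules that end in "-j DROP" are of interest.
        if len(tokens) < 2 or tokens[0] != "-A" or tokens[-2:] != ["-j", "DROP"]:
            continue
        if "-s" in tokens:
            index = tokens.index("-s")
            if index + 1 < len(tokens):
                sources.add(tokens[index + 1])
    return sources

# Made-up iptables-save output for illustration.
SAMPLE_RULES = "\n".join([
    "# Generated by iptables-save",
    "*filter",
    ":INPUT ACCEPT [0:0]",
    ":FORWARD ACCEPT [0:0]",
    ":OUTPUT ACCEPT [0:0]",
    "-A INPUT -s 198.51.100.7/32 -j DROP",
    "-A INPUT -s 203.0.113.5/32 -j DROP",
    "-A INPUT -s 203.0.113.5/32 -p tcp -m tcp --dport 80 -j DROP",
    "COMMIT",
    "# Completed",
])

print(len(SAMPLE_RULES.splitlines()), "lines,",
      len(blocked_sources(SAMPLE_RULES)), "unique blocked sources")
```

The set also collapses an address that appears in more than one rule, such as an old port 80 rule plus a newer all-ports rule.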

These amateurs really need to find another hobby. This one is just getting old.

What Is Going On With AdSense Earnings?


“The trouble is that AdSense is a chaotic and complicated system, and we are all trying to make sense of it with a lot of missing information.”

Keep that quote in mind if you’re an AdSense publisher.

AdSense is a complicated system. It is also a black box. What we see through our AdSense reports is exactly what we’re allowed to see. There’s a lot of complexity behind the system that Google does not want you to see. This is precisely why it is a violation of the AdSense TOS to share your specific earnings amounts, percentages and such… because collecting that information over time would allow someone to discover their weighting, their algorithms and their methods.


The Ramblings of SEO and SEO 2.0

There’s a lot of fraud happening on the Internet. Everyone already knows this.

But there’s a rising frequency of people advertising themselves as “experts” in the field of SEO, or “Search Engine Optimization”, who are there for no other purpose than to confuse, mislead and trick people into handing over their cash for the hope of having their websites show up higher in the Google rankings.

I’ve had cold-calls at the office number here, from people advertising that they can bring my website up in the rankings. Mind you, the websites I run are already PR6 and PR7 (PR is an SEO abbreviation for PageRank). Bringing these websites up from their current ranking would be VERY difficult for anyone to do. CNN’s website, for example, is a PR9 website, and it would take a lot of work to get my sites up another notch or two to PR9.

But there are some true experts out here among the masses. And there are a lot of tools out there that are pretty useful. One of the tools I’ve been using for a couple of years to help my own website development is the “Auto Keyword Generator” written by Chris Green at WebCreationz.

The tool is hosted in England, and from my location, it got to be a bit slower than I would have preferred. No fault of Chris or anything else, just distance and latency.

So I decided to rewrite it, and add some additional features. I’ve called it the “SEO Auto Keyword Generator”, and after using it to help a few new websites I’ve launched over the last week (with over 1,000 separate, high-quality articles of content, each with their own keyword matrix), I decided it was time to let others use it as well.

Everyone can do SEO, it’s not magic or voodoo.

The next step, which will be the hardest for most of these fraudsters to accept, is that SEO 2.0, the next wave of SEO, has absolutely nothing to do with keywords, MFA sites, stuffing content, autoblogging, or any of the other garbage whitehat and blackhat tricks they use to try to trick people into clicking on their ads or buying the sham products they’re selling.

“Definition of MFA: Made-for-advertising page. A broad class of generated webpages whose real purpose is to draw visitor traffic in the hopes people will click on the banner and link ads from Yahoo!, Google, or whomever else they partner with. MFA sites rarely have unique, original content, and exist solely to duplicate other content with the intent to make money on other people’s work.”

SEO 2.0 is about the content, about the quality, and about keeping users interested in reading it, and coming back for more.

If you know how to write original, quality content, and how to market and promote it properly, your site will do fine… and that’s what it is all about. My PR6 and PR7 sites have been around for 7 years, with ZERO promotion or “traditional” SEO done to them, and they do fine. One of them has over 11,000 unique, organic backlinks.

It’s all about the content. The quality. Stick with that, and you’ll do fine.

What is Google Smart Pricing? Are you being bitten by it?

As I was looking for the REAL answers behind why my Google AdSense earnings had dropped over 80% in the last month, while my clicks, impressions and CTR tripled… I found this information:

Google smart pricing is a system that automatically adjusts the cost of a contextual click for AdWords advertisers based on a set of values. The system is designed to help AdWords advertisers improve their return on investment (ROI). Google does not disclose much information about the values used by its smart pricing system, so the mechanism remains largely undisclosed.

The best way to ensure your AdSense site is not affected by smart pricing is to create a great environment for advertisers and your visitors. Creating unique content for your targeted readers is as important as providing a good user experience on your site. The more targeted the traffic to your AdSense site, the better the quality of the clicks advertisers receive.

Google’s smart pricing system uses the information it gathers to adjust the price of an ad. It reduces the price an advertiser pays for a click when clicks generated from the contextual network are less likely to turn into business for the advertiser.

Google smart pricing is designed to help advertisers improve the effectiveness of their advertising campaigns on Google’s content network. As a result, Google makes less money, since the cost to advertisers is reduced in order to provide a strong ROI. Indirectly, smart pricing affects the earnings of AdSense publishers because they are paid less.

Some AdSense publishers maintain good quality AdSense sites as well as “less than quality” sites under a single AdSense account. They face a higher risk of having their AdSense account smart priced, because any “less than quality” site that does not perform well could easily be smart priced. The extra revenue from adding these “less than quality” websites could be less than the revenue lost to smart pricing across the entire AdSense account.

Google provides a tool that can track the conversion of a contextual click. Every time an AdSense ad is clicked, Google places a conversion tracking cookie on the user’s computer. The cookie is stored on the user’s computer for 30 days and is used to track conversions for advertisers.

The smart pricing system should not only be viewed as a system to protect the interests of AdWords advertisers. It is also a system that indirectly benefits quality AdSense sites in the long term. The smart pricing system could force some AdSense publishers to drop those “low quality” sites. Perhaps with more quality sites, this could boost advertisers’ confidence to advertise on Google’s content network. Hence, it will bring more competition for keywords. Eventually, AdSense publishers will benefit from the high paying keywords.

Smart pricing does not just affect a publisher’s earnings on one AdSense site; it affects the entire AdSense account. Regardless of how well your other AdSense sites perform, if one of your AdSense sites is smart priced, all of your AdSense sites will be affected.

AdSense publishers are earning less because of the low eCPM. You can find AdSense publishers revealing their eCPM in forums when their accounts have been smart priced. They normally have a low eCPM, ranging from $1.50 to $5. However, a low eCPM does not necessarily imply that your account has been smart priced. AdSense sites in lower-earning niches generally have a lower CPM.
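For reference, eCPM is simply earnings per 1,000 impressions. The sketch below uses made-up numbers to show what happens to eCPM when earnings fall by 80% while impressions triple; the `ecpm` helper and both sets of figures are illustrative and not taken from any real account.

```python
def ecpm(earnings, impressions):
    """Effective earnings per 1,000 impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return earnings * 1000 / impressions

# Hypothetical "before" and "after" months (illustrative numbers only).
before = ecpm(earnings=100.0, impressions=10_000)
after = ecpm(earnings=20.0, impressions=30_000)  # 80% less money, 3x the views

print(f"before: ${before:.2f}  after: ${after:.2f}  ratio: {after / before:.1%}")
```

Because impressions sit in the denominator, a drop in earnings combined with a rise in traffic pushes eCPM down much harder than either change would on its own.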
