Clicks of doom

21 May 2006

Spanish antivirus firm PandaLabs dropped a bombshell on Google and Yahoo just before the weekend (covered here initially), announcing that it had uncovered more than 30,000 zombie computers running software that generated fake clicks on pay-per-click adverts. The number looks scary. The story appeared just days after the SANS Institute wrote about a Google-specific botnet.

The PandaLabs figure looks like a whole lot of compromised PCs. But the number by itself does not mean all that much in the world of pay-per-click. A botnet measured in tens of thousands of machines could mean that the sploggers running the botnet are making out like bandits - well, they are bandits - or that is how big a botnet you need to make any money out of click fraud. There is a pretty wide gap between the two. The SANS Institute figures suggest that this is a big network designed to liberate cash fast: Swa Frantzen reported a small botnet of just over 100 machines, each producing just 15 or so clicks while monitored.

Scott Karp of Publishing 2.0 did some arithmetic on what 30k PCs running a click bot could achieve and the theoretical payout. The top end of the scale ran to more than $6bn a year paid to the fraudsters for those clicks. And that was for just 100 clicks per day per machine. Bots can easily crank through many times that number. However, because of the way pay-per-click works, I reckon a number closer to 10 per day is probably more realistic. That is because, to get the money, you need to keep your head down as long as you can. Machine-gunning clicks at Google will only draw attention to your activity. But it's hard to tell. The only people who know for sure are those operating the botnets.
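To see how sensitive that kind of estimate is to its inputs, here is a minimal sketch of the arithmetic. The function name and the per-click prices are my own illustrative assumptions, not Karp's inputs; only the 30k machines and the 100 and 10 clicks per day come from the figures above.

```python
def annual_payout(bots, clicks_per_day, price_per_click, days=365):
    """Theoretical gross value of the clicks a botnet generates in a year.

    Assumes every click is counted as valid and paid at the same price,
    which is exactly the part nobody outside the search engines can verify.
    """
    return bots * clicks_per_day * days * price_per_click


# Hypothetical prices per click, chosen only to show the spread.
high_end = annual_payout(bots=30_000, clicks_per_day=100, price_per_click=5.0)
low_end = annual_payout(bots=30_000, clicks_per_day=10, price_per_click=0.5)

print(f"100 clicks/day at $5.00: ${high_end:,.0f} a year")
print(f" 10 clicks/day at $0.50: ${low_end:,.0f} a year")
```

At an assumed $5 per click, 100 clicks a day comes to roughly $5.5bn a year; at 10 clicks a day and an assumed 50 cents, roughly $55m. The two-orders-of-magnitude spread is the point: the headline number depends almost entirely on guesses.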

The problem for pay-per-click advertising is that, for a model that is meant to represent the pinnacle of transparency in advertising, there is actually precious little hard information on the scale of click fraud. The search engines don't like to talk about it, except to say that they "have it under control" and to offer the odd refund here and there. And the sploggers are hardly going to come out and say: "this is how we do it and this is how much we make".

The open question is: how many clicks from these computers have been recorded as valid clicks and how many were rejected? It's not as easy as it looks, even for the search-engine companies. Clickbot.A seems to make up IP addresses to try to disguise the fact that a lot of clicks are coming from a small base of users. Now, this might be alarming to those who provide the money for pay-per-click advertising. But, as South African specialist Incubeta found out last year, even using different IP addresses to access the same ad will not necessarily be recorded by a search-engine company as a valid click.

Google registered just one click out of ten made for the same ad even where ten different computers were involved. There appear to be mechanisms in place to control how many clicks end up being recorded as chargeable. As Incubeta reported, there are lots of bits of information that come with an HTTP request that could identify duplicates.

If the IP changes on a series of requests closely spaced in time, but the browser version, language, referring page and other information are all the same, then you can pretty much write them all off as dupes. If I were at Google and I wanted to screen out more fraudulent clicks, I would probably also pay attention to where ads were served at any one time. If the referring page does not match up with the ad clicked - bingo - I can be pretty sure I have a fraudulent click. That would make the job of the fraudster tougher - it means having to take note of where ads turn up rather than just banging on known $5-per-click targets with random details.
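Those two screening ideas can be sketched roughly as follows. This is my guess at the shape of such a filter, not a description of what Google or anyone else actually runs: the `Click` fields, the one-hour window and the `served_on` lookup are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    timestamp: float   # seconds since some reference point
    ip: str
    user_agent: str
    language: str
    referrer: str      # page the click claims to come from
    ad_id: str


def chargeable_clicks(clicks, window_seconds=3600.0):
    """Keep the first click per fingerprint; drop closely spaced repeats.

    The IP address is deliberately left out of the fingerprint, so a bot
    that only varies its IP still looks like one visitor.
    """
    last_seen = {}
    kept = []
    for click in sorted(clicks, key=lambda c: c.timestamp):
        fingerprint = (click.ad_id, click.user_agent,
                       click.language, click.referrer)
        previous = last_seen.get(fingerprint)
        if previous is None or click.timestamp - previous > window_seconds:
            kept.append(click)
        # Update on every click so a steady stream stays suppressed.
        last_seen[fingerprint] = click.timestamp
    return kept


def referrer_matches_placement(click, served_on):
    """True if the ad was actually served on the page the click cites.

    `served_on` is a hypothetical mapping of ad_id -> set of page URLs.
    """
    return click.referrer in served_on.get(click.ad_id, set())


# Ten clicks, ten different IPs, everything else identical.
burst = [
    Click(float(i), f"10.0.0.{i}", "ExampleBrowser/1.0", "en",
          "http://example.com/page", "ad-1")
    for i in range(10)
]
print(len(chargeable_clicks(burst)), "of", len(burst), "clicks kept")
```

In this toy case only one of the ten clicks survives, which is at least consistent with what Incubeta saw. A real filter would have to be far more forgiving, since plenty of genuine users share a browser version and language.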

The search engines have another early warning network: their customers. Advertisers don't care much about clicks; they care about how many users make it to the destination page and what they do after they get there. Smart advertisers monitor this process - called 'conversion' - and get on the blower to Google or Yahoo or whoever to ask why reported clicks, and their charges, have gone through the roof when conversions have gone nowhere. Unfortunately, not enough advertisers check these figures. They may have campaigns that don't last long enough to come up with reliable information. But that does not mean they should be ripped off.
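For advertisers who do want to keep an eye on this, the check is not complicated. Here is a minimal sketch, with thresholds I have picked out of the air: it flags a period in which clicks have at least doubled against a baseline while the conversion rate has at least halved. The function names and the default thresholds are assumptions, and short campaigns will not have enough data for the comparison to mean much.

```python
def conversion_rate(clicks, conversions):
    """Share of clicks that ended in a conversion (0.0 if no clicks)."""
    return conversions / clicks if clicks else 0.0


def looks_suspicious(baseline_clicks, baseline_conversions,
                     current_clicks, current_conversions,
                     click_growth=2.0, rate_drop=0.5):
    """Flag a surge in billed clicks that conversions did not follow.

    click_growth: how many times the baseline clicks counts as a surge.
    rate_drop: fraction of the baseline conversion rate at or below
               which the current rate counts as having collapsed.
    """
    if baseline_clicks == 0 or current_clicks == 0:
        return False  # nothing to compare against
    surged = current_clicks >= click_growth * baseline_clicks
    base_rate = conversion_rate(baseline_clicks, baseline_conversions)
    current_rate = conversion_rate(current_clicks, current_conversions)
    collapsed = current_rate <= rate_drop * base_rate
    return surged and collapsed


# Clicks up fivefold, conversions flat: worth a phone call.
print(looks_suspicious(1000, 50, 5000, 52))
```

A flag like this proves nothing on its own, since a badly targeted campaign produces the same pattern, but it is the sort of evidence that gives you something concrete to put to the search engine.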

If I were an advertiser, I would expect a lot more information on how the search engines deal with these issues. There are too many unknowns in this environment. If they continue to sweep the problem under the rug, they simply build up trouble for tomorrow. You can put a financial value on 30k bots that send out spam. That we cannot do the same for bots clicking on ads does nothing for confidence in the medium of pay-per-click advertising. These botnets might not actually take money from AdSense or similar programmes. They may simply be rented out to naive webmasters who believe that faking clicks on their own sites will make them rich. It would take those webmasters a while, until long after the cheque has been cashed, to find out how the search engines handled their fake traffic. But there is no way of telling - and that does the search engines no good at all. Tens of thousands of bots programmed to click on ads says something is going on and, with little other information, it looks damaging.

Personally, I believe that there is no way to measure advertising except through sales and sampled surveys, that the measurability of pay-per-click is mostly illusion. But the search engines would do well to prove me and others wrong, and that means coming up with some harder evidence of the damage caused by botnets and what they are doing about it.