Layman-"Hackers" Driving Up Searches for Beef Jerky

Mon Apr 15 22:21:12 EST 2010


A foreword is in order here: To all those people, including acquaintances, who have thoroughly enjoyed the living hell out of ClubBing prizes, I apologize if this causes you stress and discomfort. Fear not though, as I doubt it'll bring about any substantial change, and I'm certain you'll continue receiving your "Bing" branded crud in the mail on a weekly basis. Unfortunately for me, I never got around to trying to build a "Bing" prize room, but I can assure you that I've lived vicariously through watching your bots run in the background every minute of every day. But I digress...

Microsoft's Bing search engine has been steadily gaining ground since its release last May, according to various media and blog reports. Some articles cite a gain in market share by some 70% of the inherited market share of Microsoft Live, Bing's predecessor. While it doesn't appear to be taking share away from Google, it does seem to be chipping away at all other ancillary search engines on its way up the rankings ladder. Microsoft has invested some 100 million dollars in marketing for Bing, and it seems to be more-or-less working.

There are, however, some other factors that might be boosting Bing's search popularity. One of them is a bribery scheme called "Bing Cashback". It's essentially "Google Product Search" (looks almost identical to it too), only you get a percentage of cash back by purchasing items through it. Nothing terribly revolutionary there, but it may be helping their rankings.

A much more fascinating topic from a security perspective is their Club Bing project. Club Bing is basically a game website run by Microsoft, where players earn "Bing tickets" for playing games. These tickets can be redeemed for prizes ranging from a song download to KitchenAid mixers. Club Bing has a variety of games to play, but interestingly, three are far and away the most popular; currently those are Chicktionary, Spelling Bee, and Word Slugger. All three are anagram puzzle games, and are actually some of the least entertaining games on the site. For each anagram, you have to find 40 words, and there is no time limit. Every anagram solved is worth 20 tickets (also referred to below as 'tokens').

If they are the least fun, why are they so popular? While we don't know for sure, we believe the answer boils down to robots. Yes...robots. There are publicly available scripts/bots/hacks, privately owned bots, bots that are essentially screen-scrapers, and bots that decrypt the pseudo-encrypted packets that Club Bing uses and 'act' as a client. As many readers are keenly aware, 'botting' and/or 'scripting' communities have thrived since the dawn of computer gaming. I recall scripting the living piss out of Gemstone III back in the day, then doing the same on DragonRealms. Text-based MMORPGs were rife with this sort of automated gaming, and the popularity hasn't wavered a bit since then, though the complexity has certainly increased exponentially. Most games try to discourage that sort of activity or ban it outright. Obviously, like any type of 'hacking', game 'hacking' is a constant cat and mouse game, and game designers can rarely keep the "bad guys" from doing bad things... but they do try. I believe that most end up pretty successful at deterring the casual gamers, and at making it as difficult as possible for the code-minded gamers to get away with cheating.

In the case of Club Bing it seems somewhat different. Abuse was initially rampant, and Microsoft implemented a few checks, namely:

  1. they capped the number of tickets obtainable through any one game; somewhere around 34,000 per account.
  2. they implemented an occasional captcha (every 4 games).
  3. they implemented an IRS limit, meaning that if you earn more than $600 worth of prizes, you have to enter in tax information.
  4. they capped prize distribution to only one of any item per household (can't order two Xboxes to one address, for instance, but can order an Xbox, a bar of soap, and a rack of lamb no problem).
  5. a given account is capped at 1,000 tickets per day.
  6. they've implemented timers of some sort, rate-limit caps.

These may seem like reasonable botting controls, but they deserve a little more analysis. I'll address the above in coordinating line items below:

  1. accounts are free and easy to create. One individual can, and often does, create more than a single account. Since there are three games that are commonly botted, the cap is effectively 100,000 tickets.
  2. the captcha was easily conquered by available robots. Microsoft then deployed a graphical captcha, and now exploiters are required to manually enter in a captcha every 20 minutes or so.
  3. again, multiple accounts. Some people on forums are boasting that they have 30 accounts -- and I believe them.
  4. this is significant, though they continue to offer new prizes, making it less of a deterrent. People also get around this by shipping stuff to grandma on another account.
  5. again, a reasonable step, but I'll need more than a line to describe the problem with this particular item.
  6. the bot writers have figured out these rate limiters and have adjusted accordingly. Even so, the rate limits seem extremely generous. I haven't been able to trip one.

First, I should put these tickets into perspective. If you were to manually do one of these anagrams, without cheating via an anagram solver, it could take you a long time to get through a single game to get your 20 point reward. In fact, without cheating, you may not even be able to complete a single game unless you have memorized dictionary words containing between 3 and 7 characters. I failed miserably, gave up after about 30 minutes and pulled out an anagram solver that helped me finish the puzzle pretty quickly (~10 minutes or so). The point here is that 1,000 tokens is not a trivial accomplishment in one day, particularly for those of us with jobs, families, and madz n00b ganking to do after the kids go to bed. Which brings us to the effective 100,000 limit on anagram games. I would suspect that would take someone literally a full year, maybe more, to accomplish manually if they were really that driven to get the free beef jerky making dehydrator + an Xbox 360.

With the bot, however, the time to solve an anagram and earn 20 tickets is roughly 4 minutes (about 3.6 minutes, to be more precise). So 1,000 tickets is obtainable in roughly 180 minutes. No human could come close to this efficiency. It would take about 100 days of play to max out a single account via automation, and who knows how long to max an account out manually. Many botters are running multiple accounts. The 1-per-household limitation on prizes is being abused as well by having stuff shipped to friends and relatives. Bing goods are also actually being used in black market deals: we've heard of people exchanging SSD drives for bars of minty soap... yes, bars of minty soap. Makes you wonder where the SSDs came from, doesn't it? Hence "black market". There are people in forums discussing how to take the video games and other goods obtained from Club Bing, remove the Bing stickers, and return them to certain retailers for cash. People are saturating the eBay market as well with this stuff, driving down the price of just about anything that shows up on Club Bing (like Xbox controllers).
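The time and cap arithmetic above can be checked with a quick sketch. The constants are the figures quoted in this article (the per-game cap is only "somewhere around 34,000"); the function names are made up for illustration.

```python
# Rough check of the ticket arithmetic, using figures quoted in the article.
TICKETS_PER_GAME = 20    # reward for one solved anagram
MINUTES_PER_GAME = 3.6   # approximate bot solve time per anagram
DAILY_CAP = 1_000        # tickets per account per day
PER_GAME_CAP = 34_000    # approximate cap per game, per account
BOTTED_GAMES = 3         # Chicktionary, Spelling Bee, Word Slugger


def minutes_to_daily_cap():
    """Minutes of botting needed to hit the daily ticket cap."""
    games_needed = DAILY_CAP / TICKETS_PER_GAME
    return games_needed * MINUTES_PER_GAME


def days_to_max_account():
    """Days of capped play needed to exhaust all three botted games."""
    return (PER_GAME_CAP * BOTTED_GAMES) / DAILY_CAP


print(round(minutes_to_daily_cap()), "minutes to reach the daily cap")
print(days_to_max_account(), "days to max out one account")
```

With the approximate 34,000 cap this comes out to 102 days, in line with the "about 100 days" figure.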

So, obviously Microsoft has an incentive to stop this abuse, right? I mean, they are doling out thousands of dollars worth of merchandise every day to botters. Look at the three aforementioned games and see how many people are playing them compared to the other games on Club Bing. The number of players in games with publicly available automation is ALWAYS several times more than the number of players in games without publicly available automation. See the chart:

Clearly there's incentive for Microsoft to stop it... and it seems simple enough to put an end to it: throttle the tickets per hour to something more closely resembling an actual human being, or tokens per week, or whatever. Yes -- the bot writers will circumvent anything -- if there's incentive. So, why not in that case just drop the top botted games? They are the most popular, sure, but artificially so.

But there's also a conflicting interest at hand. Every single word typed into these games generates a Bing search below the game. The results are generally not useful in the game, and that's being generous. Even if the results were relevant and helpful to solve the puzzles, on my screen I can only see a few results unless I actually scroll down to see more (actually, when running the bot you can't see the results at all without scrolling down). See screenshot of me playing the game "normally":

The robot makes occasional mistakes, so the number of searches in one 40-word, 20-ticket game is about 50. For 1,000 tickets (the daily cap), that's 50 games and about 2,500 automated Bing searches, generated in roughly three hours, or roughly 12.5 searches per minute at about 4 minutes per game. A fair chunk, but certainly not game-changing in the scheme of things, right? Well... maybe not. Consider the following formula:

A = 3 = number of scriptable games
X = average number of players in games without publicly available automation
Y = average number of players in games with publicly available automation.
G = (Y - X) * A = estimated number of automated players (a guess, granted)
12.5 = searches per minute by each automated player
G * 18000.00 = (est. # of automated players) * ((mins./hour * hours/day) * (searches per min. by each automated player)) = estimated automated searches per day
Z = (G * 18000.00) * 30 = average number of automated searches per month

Make sense? Yes... we realize the whole thing is based on a guess (G), and the assumption that people are eagerly awaiting the captcha every 20 minutes and responding to it near instantly. But let's add it together anyway, and see what we get.

We ran a script that scraped every 10 minutes, and ran it for 24 hours. We're continuing to run it, and we'll follow up on whether it changes the numbers much over the course of a full week. Weekends may be more or less busy (I suspect a lot less busy, I suspect "playing" this stuff at work is where it's at).

The end result was:

A = 3
X = 32
Y = 335
G = 909
G * 18000 = 16,366,849
Z = 491,005,479
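
For anyone who wants to replay the arithmetic, here is a minimal sketch of the formula. The function and parameter names are invented for illustration. Plugging in the rounded X and Y shown above gives totals slightly below the published ones; presumably the published figures came from unrounded averages, though that is an assumption.

```python
# Sketch of the estimate: G = (Y - X) * A, daily = G * 18000, Z = daily * 30.
MINUTES_PER_DAY = 60 * 24
SEARCHES_PER_MINUTE = 12.5  # per automated player, from the article


def automated_players(x, y, a=3):
    """G: estimated number of automated players across the scriptable games."""
    return (y - x) * a


def searches_per_day(g, per_minute=SEARCHES_PER_MINUTE):
    """G * 18000: automated searches per day if every bot runs nonstop."""
    return g * MINUTES_PER_DAY * per_minute


def searches_per_month(g, days=30):
    """Z: automated searches per 30-day month."""
    return searches_per_day(g) * days


g = automated_players(x=32, y=335)  # rounded averages from the article
print(g)                      # 909
print(searches_per_day(g))    # 16362000.0 (article reports 16,366,849)
print(searches_per_month(g))  # 490860000.0 (article reports 491,005,479)
```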

To summarize, that's an estimated average of nearly half a billion searches initiated by robots per month. Let's back up a little to the assumptions. There's an assumption that people are right on top of the captcha, and that there is minimal network lag, etc. Considering that, let's apply a modifier, just for kicks, and say that the actual results are 60% of the calculated results. In that scenario, we're still talking about roughly 300 million automated searches.

Putting the numbers in context, in January it was reported that Bing had 1.6 billion searches, accounting for some 10% of the search marketplace. While we can't account for how comScore gets its data, they claim to capture "toolbar, widget, affiliate, cross-channel and other search types to fully account for total search volume", which may or may not count this sort of automated traffic. Assuming that these stats somehow include this traffic, regardless of how accurate some of the assumptions are in the experiment, a sizable chunk of Bing searches are being pumped out by bots.
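
To put a rough number on "sizable", the sketch below divides the estimate by the reported January total. It assumes the 1.6 billion figure and the 30-day estimate cover comparable periods and that the reported total includes the automated traffic. Neither is confirmed, and the names are illustrative only.

```python
# Rough share of reported Bing searches that the estimate would represent.
REPORTED_MONTHLY_SEARCHES = 1_600_000_000  # January figure cited in the article
ESTIMATED_BOT_SEARCHES = 491_005_479       # Z from the calculation above
MODIFIER = 0.60                            # the "just for kicks" discount


def bot_share(bot_searches, total_searches=REPORTED_MONTHLY_SEARCHES):
    """Fraction of the reported total attributable to automated searches."""
    return bot_searches / total_searches


upper = bot_share(ESTIMATED_BOT_SEARCHES)
lower = bot_share(ESTIMATED_BOT_SEARCHES * MODIFIER)
print(f"undiscounted estimate: {upper:.1%}")
print(f"with 60% modifier:     {lower:.1%}")
```

Under those assumptions the share lands somewhere between roughly 18% and 31%.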

So what, right? Bing gets good numbers, marketing crud in circulation, and a tax deduction, while botters get schwag, and everyone lives happily ever after. Very true, except that Microsoft shareholders are being told that Bing's market share is rising and the competition is being told they are losing share. Investors might be making decisions based on this, advertisers are potentially being defrauded based on these search statistics, and as such this situation may not be as victimless as it seems.

To be fair, the Bing searches below the game do not appear to display advertisements, presumably because Microsoft realizes it would throw impression conversion statistics way off (since few humans are likely to be actually looking at those search results, fewer still would click). We scoured SEC filings for any claims of market share by Microsoft as it relates to Bing searches and found nothing. It seems that MS isn't drawing any estimates, potentially because it is well aware of how many searches generated on Bing are bogus. Instead, it seems third parties are the ones touting market share and growth.

In short, if the organizations that produce these market-share statistics are figuring in these searches, or if the number of searches is provided by Microsoft themselves, then some 20-30+ percent of the searches are automated (or possibly bogus) and more still are completely useless and not real active "searches". Even if the numbers and calculations above are irrelevant, and indeed are not figured into market-share data, the number of searches produced for little if any benefit to the player of these games is questionable and the number of searches produced by bots is borderline DDoS.

I'll end with an open letter:

Dear MS,

If the searches done automatically by Club Bing games aren't relevant or useful, and you know
that many aren't even seen by humans, then what exactly ARE you doing them for?


Dee Dee Winters
