[ISN] Ways Google is shaking the security world

Wed May 17 01:45:21 EDT 2006

http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9000540

Sarah D. Scalet
May 16, 2006
CSO

Ask Google anything--what's happening to GE's stock price, how to get
to 881 Seventh Ave. in New York, where Mission Impossible 3 is
showing, whatever happened to Brian W. after he moved away in the
ninth grade--and you'll get an answer. That's the power of this $6
billion search engine sensation, which is so good at what it does that
the company name became a verb.

That kind of power keeps Google on the front page of the news--and
sometimes under unfavorable scrutiny, as demonstrated by Google's
recent clashes with the U.S. Department of Justice and also with
critics displeased by the search giant's stance on Chinese government
censorship.

CSOs and CISOs have a different reason to think carefully about Google
and the implications of having so much information online, instantly
accessible by almost anyone. Although these issues relate to all
search engine companies, Google gets most of the attention--not only
because of its huge share of the Web search market but because of its
unabashed ambitions to catalog everything from images and libraries to
Earth, the moon and Mars.

"We always get enamored of a new technology, and it takes us a while
to understand the price of that technology," says Robert Garigue, vice
president of information integrity and chief security executive of
Bell Canada Enterprises in Montreal. For security pros, the price is
that Google can be used to dig up network vulnerabilities and
locations of sensitive facilities, to enable fraud and cause other
sorts of mayhem against the enterprise. Here, CSO examines the ways
Google is shaking the security world, and what companies can do about
them.

1. Google Hacking (strictly defined)

What it is: Using search engines to find system vulnerabilities.
Hackers can use carefully crafted searches to find things like open
ports, overly revealing error messages or even (egads) password files
on a target organization's computer systems. Any search engine can do
this; blame the popularity of the somewhat imprecise phrase "Google
hacking" on Johnny Long. The author of the well-read book Google
Hacking for Penetration Testers, Long hosts a virtual swap meet where
members exchange and rate intricately written Google searches.

How it works: The way Google works is by "crawling" the Web, indexing
everything it finds, caching the index information and using it to
create the answers when someone runs a Web search. Unfortunately,
sometimes organizations set up their systems in a way that allows
Google to index and save a lot more information than they intended. To
look for open ports on CSO's Web servers, for instance, a hacker could
search Google.com for INURL:WWW.CSOONLINE.COM:1, then
INURL:WWW.CSOONLINE.COM:2, and so on, to see if Google has indexed
port 1, port 2 and others. The researcher also might search for
phrases such as "Apache test page" or "error message", which can
reveal configuration details that are like hacker cheat sheets.  
Carefully crafted Google searches sometimes can even unearth links to
sloppily installed surveillance cameras or webcams that are not meant
to be public.
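
As a rough illustration of the port-by-port searches described above,
the Python sketch below only builds the query strings; it does not
contact Google or any other site. The function name port_queries and
the lowercase spelling of the inurl: operator are choices made for
this example, not something taken from Long's book.

```python
def port_queries(host, ports):
    """Build one search string per port, in the inurl:HOST:PORT form
    the article describes. Nothing is sent over the network."""
    return [f"inurl:{host}:{port}" for port in ports]


# Hypothetical usage: the first three ports on the article's example host.
for query in port_queries("www.csoonline.com", range(1, 4)):
    print(query)
```

A pen tester would paste each string into a search engine by hand (or
through whatever interface the engine permits) and note which ports,
if any, turn up in the index.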

Why it matters: Suppose someone is scanning all your ports. Normally,
this activity would show up in system logs and possibly set off an
intrusion detection system. But search engines like Google have Web
crawlers that are supposed to regularly read and index everything on
your Web servers. (If they didn't, let's face it--no one would ever
visit your website.) By searching those indices instead of the systems
themselves, "you can do penetration testing without actually touching
the victims' sites," points out consultant Nish Bhalla, founder of
Security Compass.

What to do: Beat hackers at their own game: Hold your own Google
hacking party (pizzas optional). Make Google and other search engines
part of your company's routine penetration testing process. Bhalla
recommends having techies focus on two things: which ports are open,
and which error messages are available.

When you find a problem, your first instinct may be to chase Google
off those parts of your property. There is a way to do this--sort
of--by using a commonly agreed-upon protocol called a "robots.txt"  
file. This file, which is placed in the root directory of a website,
contains instructions about files or folders that should not be
indexed by search engines. (For a notoriously long example, view the
White House's file at www.whitehouse.gov/robots.txt.) Many companies
that run search engines heed the instructions in this file.

Notice we said "many"? Some search engines ignore robots.txt requests
and simply index everything anyway. What's more, the robots.txt file
tips off hackers about which public parts of your Web servers you'd
prefer to keep quiet. Meanwhile, the information that your pen testers
found through Google is already out there. Sure, you can contact
search engines individually and ask them, pretty please, to remove the
information from their caches. (Visit www.google.com/webmasters for
instructions.) But you're better off making the information useless.

"The persistence of these caches is impossible to manage, so you have
to assume that if it's there, it's going to be there forever," says Ed
Amoroso, CISO of AT&T. His solution? Simple. "Let's say you found a
file with a bunch of passwords. Change those passwords."

Then, fix the underlying problem. Eliminate or hide information that
shouldn't be publicly available. Long term, you'll have to do the
heavy lifting too, by closing unnecessary ports or fixing poorly
written applications.
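
To make the robots.txt discussion above concrete, here is a minimal
sketch that uses Python's standard urllib.robotparser module to show
how a well-behaved crawler would read such a file. The file contents,
the folder names, the example.com URLs and the crawler name
"ExampleBot" are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to skip two folders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
Disallow: /reports/drafts/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if a crawler that honors robots.txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ExampleBot",
                 "http://www.example.com/internal/phones.html"))
print(is_allowed(ROBOTS_TXT, "ExampleBot",
                 "http://www.example.com/index.html"))
```

The check only models a crawler that chooses to obey the file. As
noted above, nothing forces a crawler to do so, and the Disallow lines
themselves are readable by anyone who requests the file.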

Shock waves: 4 (highest). It's up to you to make sure your company
isn't accidentally publishing instructions on how to hack its systems.

2. Google Hacking (loosely defined)

What it is: Using search engines to find intellectual property. It's
Google intel: The researcher uses targeted Web searches to find bits
and pieces of information that, when put together, form a picture of
an organization's strategy. Unlike, say, launching a SQL injection
attack, doing competitive intelligence using public sources is quite
legal (and may in fact be good business).

How it works: The researcher scours the Web for information that might
include research presented at academic conferences, comments made in
chat rooms, résumés or job openings. "Companies leave bread crumb
trails all over the place on the Web," says Leonard Fuld, founder of
Fuld & Co. and author of the forthcoming book The Secret Language of
Competitive Intelligence. One common tactic is using search queries
that reveal only specific file types, such as Microsoft Excel
spreadsheets (filetype:xls), Microsoft Word documents (filetype:doc)  
or Adobe PDFs (filetype:pdf). This kind of search filters out a lot of
noise. Say you want information about General Motors. Searching for
"GENERAL MOTORS" "FINANCIAL ANALYSIS" one day in February yielded
56,400 results. Searching for "GENERAL MOTORS" "FINANCIAL ANALYSIS"  
FILETYPE:XLS brought up only 34 documents. One of those documents was
a spreadsheet from a recruiting agency that contains the current jobs
and work history (though not the names) of executives at numerous
companies (including GM) who may be on the job market.

Another common approach is searching for phrases that may indicate
information that wasn't intended to be public. For this, keywords such
as "personal", "confidential" or "not for distribution" are
invaluable. These targeted searches don't always hit pay dirt, but
they can be fascinating. For instance, on that same day in February,
the top hit on a search for "GENERAL MOTORS" "NOT FOR DISTRIBUTION"  
was a PDF from a credit-rating company with poorly redacted
information that could be easily viewed by pasting the text into
another document. (Oops!)

A final tactic is to target the organization's site itself for
information, such as phone lists, that could be useful for social
engineering scams. Researchers might use the site search function and
look for the phrase "phone list" or "contact list". (An actual search
might be SITE:CSOONLINE.COM "PHONE LIST", and if you run that
particular search, you'll find stories CSO has published about why
your company's phone directory is better kept under wraps.)
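
The three tactics above lend themselves to a simple checklist of
searches to run against your own site. The Python sketch below just
assembles the query strings. Combining site: and filetype: in a single
query is an assumption of this example (the article shows them
separately), and the function name audit_queries is invented here.

```python
from itertools import product


def audit_queries(site, filetypes, phrases):
    """Build site-restricted searches for each sensitive phrase, first
    on its own and then narrowed to each file type."""
    queries = [f'site:{site} "{phrase}"' for phrase in phrases]
    queries += [
        f'site:{site} "{phrase}" filetype:{filetype}'
        for phrase, filetype in product(phrases, filetypes)
    ]
    return queries


# Phrases and file types taken from the article's own examples.
for query in audit_queries(
    "csoonline.com",
    filetypes=["xls", "doc", "pdf"],
    phrases=["phone list", "not for distribution"],
):
    print(query)
```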

Why it matters: "If it's on Google, it's all legal," says Ira Winkler,
information security consultant and author of Spies Among Us.  
Competitive intelligence of this sort is illegal espionage only when
it involves a trade secret--and if something is public enough to
appear in Google, can you really argue that it was protected like a
trade secret?

What to do: That Google hacking party we mentioned earlier should
involve a few site searches for sensitive files, such as financial
records and documents labeled "not for distribution." Beyond your own
borders, it's a good idea to know what people are saying about your
organization, even if there's little you can do about it. "Using
search engines to figure out what your public-facing view looks like
has become a de facto element in any corporate security program,"  
Amoroso says.

Brand protection companies such as MarkMonitor and Cyveillance will
work the beat for you, if you'd prefer. Creating (and enforcing) good
policies about employee blogging or the use of message boards and chat
rooms can also limit your exposure.

Shock waves: 3 (significant). This kind of competitive intelligence
has been going on forever, and it is damaging. The Web means more
information gets out, and it's easier to find.

3. Google Earth

What it is: A software download that provides highly navigable
satellite and aerial photography of the entire globe. (The same images
are also available through Google Maps at http://maps.google.com.) The
scope and resolution of the photos are eye-popping enough that Google
Earth drew ire even as a beta product in 2005. Some people feel
threatened that a photo of, say, their backyard is only a few clicks
away, and others fear that terrorists will use the images of landmarks
or pieces of the critical infrastructure to plot attacks.

How it works: After the user installs the software (the basic version
is free at http://earth.google.com), she can zoom to any spot on the
planet, often with enough detail to see driveways, if not cars. The
virtual globe can be overlaid with information on roads, train tracks,
coffee shops, hotels and more. Enterprising researchers are also
overlaying Google Maps with everything from locations of murders to
public rest rooms that have baby-changing tables. Images are up to
three years old and come from commercial and public sources, with
widely varying resolution.

Why it matters: The privacy implications of having this information so
readily available are certainly worth discussing as a society, but the
security risks to U.S.-based companies are low. Much of the
information was already available anyway. For instance, Microsoft
stitched together images from the U.S. Geological Survey a decade ago
with its Terraserver project. It just doesn't work as smoothly.

Not only have these types of images long been available online, but
they can also be easily purchased from government and private sources,
says John Pike, director of the military think tank
Globalsecurity.org. There are only a couple of legal restrictions.  
First, the images must be at least 24 hours old. Second, the U.S.  
military has what Pike calls "shutter control": the ability to tell
commercial satellite companies not to release imagery that might
compromise U.S. military operations. To the best of Pike's knowledge,
the U.S. military has never invoked this power, nor have the
regulations governing satellite imagery changed during the Bush
administration's war on terrorism.

"If Rummy's not worried about it," Pike says, referring to Secretary
of Defense Donald Rumsfeld, "it's hard for me to see how anyone can lose
much sleep over it."

What to do: If your organization's security plan is based on no one
being able to obtain aerial or satellite photography of a facility,
then it probably ain't much of a plan. "Anybody who has the capacity
to constitute a threat that rises much above graffiti is going to have
it in their power to get imagery of a facility," Pike says. "If
security managers have something that they don't want to be seen, they
need to put a roof on it."

Beyond that, be prepared for cocktail party banter about the risks and
rewards of Google Earth and Google Maps. At the U.S. Food and Drug
Administration, for instance, CISO Kevin Stine finds Google Earth
personally fascinating, and he likes to muse about its potential for
use in, say, disaster planning. "From a CISO perspective, I think we
need to be aware of these kinds of tools," he says. But for his
security group, the only impact he thinks Google Earth might
eventually have, if it begins to encompass more business applications,
is a drain on bandwidth. In other words, it's a concern about as big
as your lawn chairs seen from space.

Shock waves: 1 (minimal). Security by obscurity is so 20th century.  
Google Earth just illustrates why.

4. Click Fraud

What it is: The act of manipulating pay-per-click advertising.  
Perpetrators inflate the number of people who have legitimately
clicked an online ad, either to make money for themselves or to bleed
a competitor's advertising budget.

How it works: With pay-per-click advertising, an advertiser pays each
time someone clicks an ad hosted on a website. Google, Yahoo and other
search engine companies make their money by selling advertisers the
right to have their text-only ads appear when someone searches for a
particular keyword. There are two ways to manipulate pay-per-click
advertising: competitor click fraud and network click fraud.

First, the competitor variety: Let's suppose a company that sells life
insurance wants to advertise on Google. The company might bid for and
win rights to the phrase "life insurance". Then, when someone runs a
Google search for that exact phrase, the company's ad appears next to
the search results as a sponsored link. (How close to the top of the
list depends on both the price per click and the superpowered
algorithms that constitute Google's secret sauce.) Each time someone
clicks the sponsored link, Life Insurance Co. pays the agreed-upon
price to Google -- say $5. With competitor click fraud, an
unscrupulous competitor tries to run up Life Insurance Co.'s
advertising bill by clicking the link. A lot.

Network click fraud, on the other hand, cashes in on the fact that
Google isn't the only company that hosts Google advertising. Suppose
someone has a blog about insurance. She can sign up as a Google
advertising affiliate and have ads for insurance run on her site. If
Life Insurance Co. is paying Google $5 per click, Ms. Insurance
Blogger might pocket $1 for each click her site generates. Network
click fraud is when an affiliate generates fraudulent traffic in order
to boost its revenue.
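
The arithmetic behind both kinds of fraud is simple enough to sketch.
The Python snippet below uses the article's illustrative figures ($5
per click paid by the advertiser, $1 per click passed to the
affiliate); the count of 1,000 fraudulent clicks is a made-up number
chosen only to show the scale.

```python
def click_fraud_cost(fraudulent_clicks, price_per_click, affiliate_share=0):
    """Return (advertiser_loss, affiliate_gain) in dollars.

    affiliate_share is the per-click payout to the site hosting the ad;
    leave it at 0 for competitor click fraud, where nobody is paid out.
    """
    advertiser_loss = fraudulent_clicks * price_per_click
    affiliate_gain = fraudulent_clicks * affiliate_share
    return advertiser_loss, affiliate_gain


# Hypothetical volume: 1,000 fraudulent clicks at the article's prices.
loss, gain = click_fraud_cost(1000, price_per_click=5, affiliate_share=1)
print(loss, gain)
```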

Google insists it is trying to keep the problem in check. Shuman
Ghosmajumder, product manager for trust and safety at Google, says the
company monitors for all kinds of what it dubs "invalid clicks," and
that it routinely issues refunds to advertisers and closes down
fraudulent affiliates. In 2005, Google even won a lawsuit against an
affiliate it charged with click fraud. But some advertisers say that
Google isn't doing enough to prevent and monitor for fraud because it
profits from the fraud. Google faces a class-action lawsuit led by
AIT, a Web-hosting company, and is in the midst of reaching a $90
million settlement with Lane's Gifts & Collectibles, a mail-order
store. (At press time, the proposed settlement was before a judge.)

Why it matters: Click fraud is following a trajectory that will be
familiar to any CSO, and it's a telling example of how sophisticated
and profitable electronic crime has become. First, the good guys
started looking at server logs to find IP addresses in patterns that
indicated fraud. The bad guys responded by creating automated bots
that simulated different IP addresses and had varying time stamps.  
Then, the good guys improved their click-fraud detection tools, with a
cottage industry sprouting up that specializes in helping online
advertisers monitor for fraud. Cue "click farms," where the bad
guys hire people in other countries to do the clicking in a way that
looks more realistic. "It's a cat-and-mouse game," says Chris Sherman,
executive editor of SearchEngineWatch.com.
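
The first round of that game, scanning server logs for IP addresses
that click too often, might look something like the Python sketch
below. The record layout (an IP address paired with a timestamp), the
threshold of three clicks and the sample addresses are all assumptions
made for this example. As the paragraph above makes clear, bots and
click farms are designed to slip past exactly this kind of check.

```python
from collections import Counter


def suspicious_ips(clicks, max_clicks_per_ip=3):
    """Return, in sorted order, the IPs that clicked more often than
    the threshold. Each click is an (ip_address, timestamp) pair; the
    timestamp is ignored by this deliberately naive check."""
    counts = Counter(ip for ip, _timestamp in clicks)
    return sorted(ip for ip, n in counts.items() if n > max_clicks_per_ip)


# Hypothetical log extract using reserved documentation addresses.
log = [("192.0.2.10", second) for second in range(5)]
log.append(("198.51.100.7", 0))
print(suspicious_ips(log))
```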

What to do: The first step is to put tracking measures in place. In a
recent survey done by the Search Engine Marketing Professional
Organization (Sempo), a trade group, 42 percent of respondents said
they had been victims of click fraud, but nearly one-third of
respondents said they weren't actively tracking fraud. "The way you
monitor it is you look for something that doesn't make sense,"  
explains Kevin Lee, chair of the group's research committee. "If you
spent $100 every day last week, and then this week you spent $130
every day and didn't get any more conversions, or whatever your
success metrics are," then you might have a problem, he says.

"Usually the engines will catch the obvious fraud, and they won't even
bill you for it," Lee continues. But if you have a larger problem, you
may need to gather information about why you believe some of the
clicks are fraudulent and ask the company hosting the ads for a
refund. Ghosmajumder says Google devotes significant resources to a
team of investigators who proactively monitor for fraud and also do
research about possible fraud reported by advertisers. Google also has
engineers working on technical means to identify invalid clicks.  
According to the Sempo survey, 78 percent of advertisers that have
been victims of click fraud have received credit from a paid search
provider, and 40 percent of the time it was based on their request.

The question, of course, is whether to bother making a request. Who
better than the CSO to help the advertising department figure out
whether it would cost more for the company to clamp down on the problem
or simply to pay for the fraud?
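
Lee's rule of thumb (spending goes up, results do not) can be turned
into a simple weekly check. The Python sketch below is one way to do
it, not a Sempo or Google method. The 10 percent tolerance and the
conversion counts are invented for the example; the weekly totals
correspond to Lee's $100 and $130 per day.

```python
def spend_anomaly(prev_spend, prev_conversions,
                  curr_spend, curr_conversions, tolerance=0.10):
    """Return True when spend grew by more than `tolerance` (a fraction)
    while conversions did not grow at all."""
    if prev_spend <= 0:
        return False  # no baseline to compare against
    spend_growth = (curr_spend - prev_spend) / prev_spend
    return spend_growth > tolerance and curr_conversions <= prev_conversions


# $100/day last week vs. $130/day this week, conversions flat
# (the figure of 40 conversions is hypothetical).
print(spend_anomaly(700, 40, 910, 40))
```

A flag from a check like this is only a prompt to look closer. A price
change or a seasonal swing could produce the same pattern.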

Shock waves: 2 (moderate). For companies using pay-per-click, this is
one to watch. Click fraud has the potential to dramatically reduce the
effectiveness of online advertising. But with more than 90 percent of
Google's revenue coming from advertising, the company has a serious
incentive to keep the problem in check so that advertisers don't lose
faith in the pay-per-click model.

5. Google Desktop

What it is: A free tool offered by Google that allows users to quickly
search the contents of their hard drives. (Similar tools are offered
by MSN, Yahoo and others.) The latest version can also be used to
share files between computers.

How it works: After the user downloads the tool, it works in the
background to index everything on his hard drive, much like Google
indexes the Web. All fixed drives are indexed by default, but the user
can specify folders to exclude or extra drives to add. The software
can be set to return results on text files, spreadsheets, PDFs, Web
history, e-mail and more. Once the indexing is done, when the user
runs a Google search, items from his own computer appear at the top of
the results. Alternatively, he can use the tool by itself by opening it
on his desktop; he doesn't even need to be connected to the Web.
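
To show why the choice of excluded folders matters, here is a toy
Python sketch of the general idea: build a word-to-file index and skip
anything under an excluded folder. It is not how Google Desktop is
implemented. The file paths, the file contents and the function names
are all hypothetical, and the "files" are kept in a dictionary so that
the example touches no real disk.

```python
from collections import defaultdict


def build_index(documents, excluded_prefixes=()):
    """Map each lowercase word to the set of paths containing it,
    skipping any path that starts with an excluded folder prefix."""
    index = defaultdict(set)
    for path, text in documents.items():
        if any(path.startswith(prefix) for prefix in excluded_prefixes):
            continue
        for word in text.lower().split():
            index[word].add(path)
    return index


def search(index, word):
    """Return the sorted list of paths whose text contains the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical files standing in for a user's hard drive.
files = {
    "/home/user/notes.txt": "Budget meeting on Friday",
    "/home/user/hr/salaries.txt": "Salary budget for next year",
}
index = build_index(files, excluded_prefixes=("/home/user/hr/",))
print(search(index, "budget"))
print(search(index, "salary"))
```

With the HR folder excluded, a search for "salary" comes back empty.
Drop the exclusion and the same search points straight at the file.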

A new version also has a controversial feature that allows a user to
share files between computers. With this setting enabled, Google
indexes the files on one computer, pulls them up on its servers, then
pushes them down onto another computer (which is similarly configured
with the software). Then, a search done on one computer returns
results from both.

Why it matters: It's easy to see why people get all prickly about this
one. Once the tool is installed and files are indexed, a snoop needs
only a coffee break, rather than a lunch hour, to search someone's
hard drive for files about, say, Bob Jones's salary. To make matters
worse, freewheeling users may not pay attention or understand how to
make sure that sensitive documents aren't indexed.

To its credit, Google has tried to improve the standard configuration
of the tool. An early version automatically returned results with
password-protected files and secure HTTP pages; now, those types of
files aren't indexed unless the user changes a setting. "People
screamed about that, and Google changed it very quickly,"  
SearchEngineWatch.com's Sherman says. Even so, setting up appropriate
exclusions can get complicated. Some companies--as well as many
individuals who are concerned about their personal privacy--are also
leery of making so much information available to Google.

The new Search Across Computers feature only heightens these concerns.  
With this feature, Google says, copies of users' personal files can
sit on Google's servers for up to 30 days. Google downplays this time
frame. Says Matthew Glotzbach, product manager for Google Enterprise,
"If both of your computers are on and syncing, [the files are on
Google's servers] only a matter of minutes"--the time it takes for
Google to pull up the information and push it back down onto the
second computer.

But having the information saved on Google's servers at all is
troubling, given that search engine companies are routinely subpoenaed
by prosecutors. (Google's privacy policy states: "We may also share
information with third parties in limited circumstances, including
when complying with legal process, preventing fraud or imminent harm,
and ensuring the security of our network and services.") In one
especially charged case, Google fought a subpoena from the U.S.  
Department of Justice, which wanted search records to help defend the
Child Online Protection Act in court. A judge
reduced the amount of information Google must turn over, and the
ensuing debate raised awareness about the amount (and nature) of
information that Google has in its stores.

The fact that the software is relatively untested raises additional
questions. Last November, an Israeli researcher reported that he had
found a vulnerability in Microsoft Internet Explorer that allowed him
to illicitly access information in Google Desktop. Google fixed the
problem, but legitimate concerns linger. "Anytime you install software
from a third party directly on a hard drive of a particular machine,
you're potentially opening up holes in the security of that machine,"  
says Matt Brown, a Forrester senior analyst.

What to do: It's time to catch up--something that Brown says is
especially important given the fact that Sarbanes-Oxley requires
companies to keep tabs on where and how long their information is
retained. Consider whether your users actually need desktop search for
their jobs. If they do, you'll want to have a hand in how it's
configured and used. (Bonus points go to the CSO who makes sure that
users understand the privacy implications of all these tools, beyond
just telling them to read the privacy policy.)

At the FDA, Stine is in the early stages of looking at the tool.  
"There have been some requests [for desktop search] here and there,
but there hasn't been a user outcry," he says. If (or when) there
comes a point when a lot of users have a legitimate need for desktop
search, Stine says he'll look carefully at how the technology
identifies, indexes and presents information. "We'd have to ensure
that we still maintain complete control--at least as complete as
possible--over the information," he says.

Fortunately, he'd have plenty of options. Several companies have
enterprise desktop search tools that help CISOs keep tabs on the
information. Google Desktop 3 for Enterprise, currently in beta,
allows administrators to completely disable features such as the
Search Across Computers feature. Google says it is working to make future
versions of this tool easier to manage. "I don't think we anticipated
such a concerned or negative response," Glotzbach says. "We've taken
to heart the feedback on the Search Across Computers feature,
especially in the enterprise context, and we're actively working on
making it even easier for the companies to use" in a secure manner, he
says.

X1 Technologies, which has partnered with Yahoo, offers a competing
enterprise search tool that Brown says is more manageable from an IT
perspective. "Part of the problem with these technologies is they get
announced and people immediately start downloading," Brown says. "It
takes companies a little while to catch on to what's happening."

Shock waves: 4 (highest). Desktop search is an untested technology
with a wide potential for misuse. If your users don't need it, don't
let them use it; if they do need it, consider enterprise tools that
can be centrally managed and controlled.

Future Shocks

Google has shaken us, by holding up a mirror and forcing us to look at
what we've put online. "Google provides a lot of capability that can
do you harm as well as providing you search capabilities," Winkler
says. "What makes it its strength makes it its danger."

The future will make search technology only more dangerous. Bell
Canada's Garigue points out that search technology is still in its
very infancy, barely scratching the surface of what he calls the
shallow Web. "The shallow Web is everything that's public on Web
servers," he says. "The deep Web is what's hidden inside databases."  
From the Library of Congress to Lexis-Nexis' legal and news archives,
to Medline's medical databases, the great bulk of information that
people access online is still available only to subscribers, not to
Google. "Google is the first generation of tools," Garigue says. As
those tools get more sophisticated, the shock waves will only grow
stronger.

This story is reprinted from CSOonline.com, an online resource for
information executives. Story Copyright CXO Media Inc., 2006.