[Infowarrior] - Google adding search privacy protections

Richard Forno rforno at infowarrior.org
Thu Mar 15 01:57:29 UTC 2007


Google adding search privacy protections

By Elinor Mills
http://news.com.com/Google+adding+search+privacy+protections/2100-1038_3-6167333.html

Story last modified Wed Mar 14 17:08:15 PDT 2007

Google is changing its data retention practices to make it harder to
identify the specific computers used in searches.

Google's servers log information every time someone conducts a Web search,
keeping data such as the keywords used, the Internet Protocol address or
unique number assigned to that person's computer, and information from Web
cookies, which are small bits of data exchanged between a server and a Web
browser each time the browser accesses the server. Cookies are used to
authenticate the user and maintain information such as the user's site
preferences.

Currently, Google maintains the search data logs indefinitely. Under the new
policy announced on Wednesday, which Google expects to have fully
implemented by the end of the year, the company will anonymize the final
eight bits of the IP address and the cookie data after somewhere between 18
months and 24 months, unless legally required to retain the data for longer.
The information on specific searches will remain indefinitely, but it will
be much harder to tie the searches to specific individuals or computers.

"Logs anonymization does not guarantee that the government will not be able
to identify a specific computer or user, but it does add another layer of
privacy protection to our users' data," the company said.

The policy change will apply to future Web search data as well as archived
logs and all copies of the data stored on other servers, Google said. Users
will be able to opt out of the practice and request that their search data
be maintained indefinitely.

Privacy advocates in general said Google's policy change is a step in the
right direction but not nearly enough to really protect Web searchers from
overzealous law enforcers. Keeping the search histories could enable
investigators and governments to get to all sorts of personal information
about people, they argue.

"I don't think the Google proposal is adequate. This period is too long and
it's not in fact data destruction, it's more data de-identification, and
that should be happening in 18 to 24 hours, not months," said Marc
Rotenberg, executive director of the Electronic Privacy Information Center.
"I'm not persuaded that this isn't still a ticking time bomb for Google's
search engine."

Richard M. Smith, an Internet security and privacy consultant at Boston
Software Forensics, said Google should never be archiving the IP address and
cookies on servers. "Google should not be in the spy business," he said. "By
logging IP addresses and search strings they are running the largest
intelligence operation in the world."

Anonymizing the last eight bits of the IP address effectively would enable
investigators to narrow the IP address down to 256 possible computers or
users. That would be similar to obscuring the last digit in someone's street
address.
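
The arithmetic behind that figure can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: Google has not said how
it will alter the final eight bits, so zeroing them is a guess, and the function
name and the sample address (taken from a range reserved for documentation) are
hypothetical.

```python
import ipaddress


def anonymize_last_octet(ip: str) -> str:
    """Zero the final eight bits of an IPv4 address (assumed method)."""
    addr = ipaddress.IPv4Address(ip)
    # Keep the upper 24 bits, clear the lower 8.
    masked = int(addr) & 0xFFFFFF00
    return str(ipaddress.IPv4Address(masked))


def candidate_count(anonymized_ip: str) -> int:
    """Count the addresses that share the same first 24 bits."""
    network = ipaddress.ip_network(f"{anonymized_ip}/24")
    return network.num_addresses


masked = anonymize_last_octet("192.0.2.147")
print(masked)                   # 192.0.2.0
print(candidate_count(masked))  # 256
```

Every address from 192.0.2.0 through 192.0.2.255 collapses to the same logged
value, which is why a log entry could be narrowed to 256 candidates and no
further.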

"For most average consumers that is pretty much anonymous," because many
people connect to the Internet through large companies that dynamically
assign IP addresses, making it even harder to determine exactly which person
conducted a search, said Ari Schwartz, deputy director for the Center for
Democracy and Technology. "It is a risk, but it is better than what we have
today."

Kevin Bankston, staff attorney at the Electronic Frontier Foundation, said
he would like to see Google scrub the entire IP address within six months,
but praised Google for making this "positive first step."

"We hope other online service providers will heed this example and work to
minimize the amount of data they keep about their customers," Bankston said.

Yahoo and Microsoft have declined to disclose their exact data retention
policies with respect to Web searches. AOL saves personally-identifiable
search data for up to 30 days in a way that's visible to the user and uses
an encryption hashing technique to obscure it thereafter, said AOL spokesman
Andrew Weinstein.

"We do not keep any IP addresses in our search database, and we de-identify
any associated account information through an encryption algorithm," he
said. "We have also made a business decision not to keep any unique
identifiers (i.e. the hashed user ID) for longer than 13 months. ... That
said, it still might contain information of a personal nature, as the data
released last year clearly did."
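
A hashed user ID of the kind Weinstein describes might look roughly like the
Python sketch below. AOL has not disclosed its algorithm, so everything here is
an assumption made for illustration: the choice of a keyed SHA-256 hash (HMAC),
the secret key, the function name and the sample account names are all
hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held by the operator; not anything AOL has described.
SECRET_KEY = b"hypothetical-secret-key"


def pseudonymize_user_id(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace an account name with a keyed hash (assumed technique)."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# The same account always maps to the same token, so its searches stay
# linked to one another even though the account name is hidden.
token_a = pseudonymize_user_id("example_user")
token_b = pseudonymize_user_id("example_user")
token_c = pseudonymize_user_id("another_user")
print(token_a == token_b)  # True
print(token_a == token_c)  # False
```

Because the token is stable, a full search history remains grouped under one
identifier, which is consistent with Weinstein's caveat that the records can
still contain information of a personal nature.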

The risks associated with Web search data were highlighted last August when
AOL inadvertently exposed on the Internet the search history of more than
650,000 of its users. The exposure prompted widespread criticism from privacy
advocates and Congress, a complaint against AOL filed with the Federal Trade
Commission, and a class action lawsuit. It also led to the firing of two AOL
employees and the resignation of the company's chief technology officer.
The lawsuit was later dismissed because AOL's user agreement requires
lawsuits filed against it to be filed in its home state of Virginia, said
Bankston.

Google said it can't anonymize the entire IP address, delete it altogether
or anonymize any of it sooner than 18 months because it needs the data to
analyze usage patterns and diagnose system problems. For example, Google
uses the information for fraud detection and prevention and to combat denial
of service attacks that can temporarily cripple or shut down servers.
"Knowing what country a user is coming from helps us figure out whether or
not we are delivering the right search," said Nicole Wong, deputy general
counsel for Google.

Why wait 18 to 24 months?
The proposed timeframe for retaining the data in identifiable form was
chosen because data retention laws in Europe require communications service
providers to hold on to the information for that long, according to Wong.

If governments in various countries enact legislation to require
communications service providers to archive Internet traffic for specific
lengths of time including up to four years, as some are considering, Google
would then have to abide by the laws, the company said. Law enforcement will
still be able to subpoena server log data after it is anonymized, but every
request will be considered on its merits and Google's response will depend
on how narrow the request is and how relevant the data is to the
investigation, Google said.

However, EPIC's Rotenberg said two years is the "top end" of the timeframe
European governments are seeking for requiring communications service
providers to retain user data.

Google said it is making the changes in response to feedback from privacy
activists and others, including the Norwegian Data Protection Authority with
which Google executives met in January. A team of Google executives,
engineers and lawyers was involved in establishing the new policy, Wong
said.

Of all the major search engines, Google is the only one to publicly disclose
that it has fought government efforts for consumer data. A year ago, Google
challenged a subpoena from the U.S. Department of Justice that
ordered Google to hand over a random sample of a week's worth of search
terms and one million Web pages from its index to aid in the Bush
administration's defense of an Internet pornography law. One month later a
federal judge sided mostly with Google and ordered the company to provide
half of the amount of Web pages sought but none of the search queries.

The new policy would not have affected that situation because the government
was not asking for searches tied to individual IP addresses, Wong said.


Copyright ©1995-2007 CNET Networks, Inc. All rights reserved.



