[Infowarrior] - Data Mining: Focus on Patterns, not Individuals

Richard Forno rforno at infowarrior.org
Fri May 12 07:21:34 EDT 2006


U.S. PHONE-CALL DATABASE IGNITES PRIVACY UPROAR
DATA MINING: Commonly used in business to find patterns, it rarely focuses
on individuals
- Matthew B. Stannard, Chronicle Staff Writer
Friday, May 12, 2006
http://sfgate.com/cgi-bin/article.cgi?file=/c/a/2006/05/12/MNG0AIQRPR1.DTL

Somewhere in America, powerful computers ingest crumbs of data about your
personal life. Your income level. The kind of car you drive. Your home
address. Your credit rating. All input, assimilated and analyzed at
lightning speed.

The result: A piece of paper arrives in your mailbox offering you 10 percent
off an oil change at your local service station.

That, in a nutshell, is data mining as practiced for more than a decade by
companies around the world to target current and potential customers. The
methods have changed since the old days of reverse telephone directories and
mailing lists, but the basic objective is the same. And data mining of some
type, experts agree, is almost certainly what is behind the National
Security Agency's reportedly successful efforts to obtain the phone records
of tens of millions of Americans from private telecommunications companies.

President Bush, commenting on the program -- which administration officials
say is aimed at identifying and tracking suspected terrorists -- said that
the government is not "mining or trolling through the personal lives of
millions of innocent Americans." But security experts say the program
virtually fits the dictionary definition of data mining -- a technique for
analyzing large sets of data that American intelligence agencies have long
been developing. "The interest has been around for years and decades," said
Richard Forno, Principal Consultant for KRvW Associates, a Washington
security consultancy. "That's part of what NSA was chartered to do."

The fundamental use of data mining is to detect patterns -- in shopping
habits or the activities of the nation's enemies.

"Data mining is going through data from the past, historical data, and
predicting what is likely to happen in the future based on patterns in the
data," said Ken Bendix, president of North American operations at KXEN Inc.,
a company headquartered in San Francisco that develops data mining software
for business applications.

It is used by credit card companies to spot spending patterns that suggest a
card has been stolen and by marketing companies who use enormous databases
to target advertising.
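
As a rough illustration of the credit card case, the sketch below flags charges that fall far outside a cardholder's past spending. It is a minimal example under stated assumptions, not a description of any card issuer's actual system: the function name, the sample amounts and the three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev


def flag_unusual_charges(history, new_charges, threshold=3.0):
    """Return the new charges that sit more than `threshold` standard
    deviations above the mean of the cardholder's past charges.

    The threshold of 3.0 is an illustrative assumption, not an industry rule.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # Every past charge was identical, so anything larger stands out.
        return [c for c in new_charges if c > mu]
    return [c for c in new_charges if (c - mu) / sigma > threshold]


# Hypothetical past charges (in dollars) for one cardholder.
history = [22.50, 41.00, 18.75, 35.20, 27.90, 30.00, 24.60, 38.10]
# Hypothetical new charges to screen against that history.
new_charges = [29.99, 1450.00, 33.10]

flagged = flag_unusual_charges(history, new_charges)
print(flagged)
```

Only the 1,450-dollar charge is flagged here. Real fraud models would weigh far more than the amount, such as merchant, location and timing, but the principle of comparing new activity against an established pattern is the same.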

The technique has been gaining in popularity in the private sector thanks to
advancements in computing technology and the mathematics underlying the
software, Bendix said.

"The data is very rarely at the individual level," Bendix said. "When people
are doing these data mining analyses, they don't care that you are you. They
don't care what your name is or what your social security number is. All
they care about is what group you fit into and how you relate to everybody
else out there."

Government interest in data mining increased sharply after the Sept. 11
attacks. Unlike private-sector users, intelligence officials began exploring
ways to use the technique to identify and track individuals suspected of
terrorist links. In 2002, the Department of Defense, through the Defense
Advanced Research Projects Agency (DARPA), launched the "Total Information
Awareness" project -- later renamed "Terrorism Information Awareness"
(TIA) to counter the impression that the program would spy on U.S. citizens.

The goal of TIA, its now defunct Web site explained, was to link certain
transactions -- applications for passports, visas, work permits, driver's
licenses, automotive rentals, airline ticket purchases, receipts for
chemical purchases -- to arrests or suspicious activities.

The program, the brainchild of President Ronald Reagan's national security
adviser John Poindexter, collapsed under public and political criticism in
2003. But the idea lived on, said Forno, who lectured on information warfare
at the National Defense University from 2001 to 2003 and participated in the
2000 White House Office of Science and Technology Policy Information
Security Education Research Project.

"TIA may have died on paper," he said. "But it got parceled out to various
other agencies, including the NSA."

The NSA's interest in what are essentially copies of tens of millions of old
phone bills is not hard to understand, Forno and other analysts said.

In theory, a powerful computer could process all those numbers and find a
link between a phone in, say, Iowa to a phone in an al Qaeda training camp
on the Pakistan-Afghanistan border -- even by way of dozens of other phones,
linkages far too scattered for a human eye to notice. And the search
wouldn't necessarily stop there.

"You have these phone numbers, you might also at a minimum run them against
credit reporting companies," Forno said. "Local state DMV records. Tax
records. Business employment records. All those other resources might help
you narrow down your search."
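
The idea of linking one phone to another through a chain of intermediate numbers can be sketched as a shortest-path search over call records. The example below is an illustration only and assumes nothing about how the NSA actually processes its data: the record format, the phone labels and the function name are hypothetical, and breadth-first search is simply one standard way to find the shortest chain.

```python
from collections import deque


def shortest_call_chain(calls, source, target):
    """Return the shortest chain of phones linking `source` to `target`,
    or None if the call records contain no such chain.

    `calls` is a list of (caller, callee) pairs; each call is treated as
    an undirected link between two phones.
    """
    graph = {}
    for a, b in calls:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}  # phone -> phone we reached it from
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk backwards from the target to rebuild the chain.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in graph[current]:
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


# Hypothetical call records; the labels stand in for phone numbers.
calls = [
    ("phone-iowa", "phone-A"),
    ("phone-A", "phone-B"),
    ("phone-B", "phone-camp"),
    ("phone-iowa", "phone-C"),
    ("phone-C", "phone-D"),
]

chain = shortest_call_chain(calls, "phone-iowa", "phone-camp")
print(chain)
```

In this toy data the Iowa phone reaches the camp phone through two intermediaries. The same search run over tens of millions of real records would also surface many chains that mean nothing, which is the concern the critics quoted below raise.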

But while the program's defenders insist it is a crucial instrument in the
U.S. war on terror, some private security experts question its usefulness.

"We're looking for a needle in a haystack," said Bruce Schneier, a security
technologist and chief technology officer of Counterpane Internet Security
Inc. in Mountain View. "Dumping more hay on the pile doesn't necessarily get
you anywhere."

Even before Sept. 11, Forno noted, the NSA intercepted information
suggesting a terrorist attack was imminent -- but failed to connect the dots
in time. The New York Times reported in January that most of the leads
generated by NSA surveillance of phone calls in the months after Sept. 11
led nowhere.

In addition, said Forno, with multiple government agencies now using data
mining techniques, the temptation exists for them to use information
gathered to fight terror for completely unrelated criminal investigations.

"I don't want to see that data mission creep," he said. "I think that is a
very real potential problem."

In August, the Government Accountability Office reported that of five data
mining efforts used by federal agencies, none fully complied with Office of
Management and Budget guidance for assessing privacy impacts.

But data mining experts also say the technique can greatly benefit
government agencies -- if it is used correctly and the agencies are mindful
of privacy issues.

"I just concluded an audit of the Department of Homeland Security for the
Office of Inspector General," said Jesus Mena, an Alameda-based data mining
consultant. "We frankly found that the DHS is not doing enough (data
mining)."

While he was concerned by the kind of privacy compromise suggested by media
reports on the NSA program, Mena said, more and better use of data mining
could be especially useful for terrorism-related countermeasures like
monitoring shipboard cargo and border security.

"It would mean that there would be a safer environment, and I think we are
heading in that direction," he said. "It's just a matter of time."

But the problem with applying data mining techniques to terrorism, Schneier
argued, is that terrorism is so rare, and the databases being mined are so
large, that false positives are inevitable and often more common than truly
accurate results.

And unlike using data mining to spot credit card fraud, where at most a
false positive triggers a worried call from Visa to a cardholder and perhaps
a temporary suspension of the card's use, a false positive in a terror
investigation can put an innocent person in jail, he said.
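
Schneier's point about rare events and large databases can be checked with simple arithmetic. The figures in the sketch below are invented purely for illustration and do not come from the article or from any real program: a population of 200 million, 1,000 genuine targets, a system that catches 99 percent of them, and a false-alarm rate of one in a thousand.

```python
def screening_outcome(population, true_targets, detection_rate,
                      false_positive_rate):
    """Return (true_hits, false_alarms, precision) for a screening system.

    precision is the share of all flagged people who are genuine targets.
    """
    innocents = population - true_targets
    true_hits = true_targets * detection_rate
    false_alarms = innocents * false_positive_rate
    flagged = true_hits + false_alarms
    precision = true_hits / flagged if flagged else 0.0
    return true_hits, false_alarms, precision


# All four inputs are hypothetical numbers chosen for illustration.
true_hits, false_alarms, precision = screening_outcome(
    population=200_000_000,
    true_targets=1_000,
    detection_rate=0.99,
    false_positive_rate=0.001,
)

print(f"genuine targets flagged: {true_hits:,.0f}")
print(f"innocent people flagged: {false_alarms:,.0f}")
print(f"share of flags that are genuine: {precision:.2%}")
```

Even with these generous assumptions, roughly 200,000 innocent people are flagged alongside about 990 genuine targets, so fewer than one flag in a hundred points at a real suspect.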

"If you believe in this nonsense, the goal is to get everything," he said.
"They're looking for these fanciful connections. So if there's a bad guy who
walks down the street and 1,000 people walk next to them, are they all under
investigation?"

Despite administration assurances that the NSA program is both legal and
mindful of civil liberties, Schneier said he also fears the government may
at least be tempted to approach cell phone companies, credit card companies,
Internet service providers -- almost any industry with a major database.

"Because more and more of our daily lives are mediated by computers ... we
leave electronic footprints everywhere we go," he said. "What the government
is doing is sucking up all those footprints."

Despite the concerns and criticisms, data mining as a counterterrorism tool
is probably here to stay, said Steven Aftergood, who directs the Federation
of American Scientists' Project on Government Secrecy.

"I think it is a technology and an approach that has enormous potential, and
one that is likely to be a continuing part of the toolkit," he said. But, he
added, "Nothing is more important than preserving constitutional
protections. And it is disturbing to learn that the intelligence community
is far out in front of what the public has consented to."

How data mining works

Data mining is the process of collecting large amounts of data from
different sources and perspectives, then searching for patterns within the
data using computerized tools such as statistical analysis and modeling. The
goal is to identify significant relationships and predict future trends and
events.

Examples: Data mining is used in the private sector in a number of ways,
including tracking patterns in credit card activity that suggest fraud,
targeting mailed advertisements based on the recipient's past purchases,
income level and demographics, and suggesting purchases at an online store
based on a shopper's past purchases and Web browsing history. Government
uses of data mining include analyzing scientific and research information,
detecting criminal activity or patterns, and analyzing intelligence or
detecting terrorist activities.

E-mail Matthew B. Stannard at mstannard at sfchronicle.com.

Page A - 1