Will Data Mining Help NSA?

May 25, 2006 / 3:46 PM EDT / AP

There's a lot we still don't know — and may never know — about the National Security Agency's surveillance of Americans' phone calls. But one striking tidbit has emerged: that the agency is mining phone records for patterns of terrorist activity.

USA Today reported May 11 that the NSA was performing "social network analysis" to detect patterns of terrorist activity in its database of U.S. call records. In defending the program, Sen. Wayne Allard, R-Colo., confirmed that the White House had told him the NSA was probing calling patterns to "detect and track suspected terrorist activity."

But is that really possible?

The "tracking" part makes sense. Assuming that intelligence had sussed out suspected terrorists, certainly the vast database could be used to track whom those people had called.

The "detecting" part, however, is another story. Can terrorists be spotted simply by analyzing who calls whom and when — without any other leads?

There's reason to be skeptical.

That's because diverse kinds of human organizations share certain traits. If you and I and 17 other people are in a book club, we're likely to call each other often. Sometimes almost all of us would ring up just one person on the same day to ask, "Can I bring dessert to tonight's meeting?"

Viewed in silhouette, in the cold analysis of a computer, it might indeed be apparent from our phone records that the 19 of us frequently communicate to plan something. But further investigation would be necessary to determine just what we were up to.

Can the government dig deeper into all of these groups?

Fortunately for the stability of society, but somewhat unfortunately for intelligence analysts, there are vastly more groups of 19 people organizing soccer games and bake sales than there are teams like the 19 hijackers of Sept. 11.
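The arithmetic behind that imbalance can be sketched in a few lines. Every number below is hypothetical, chosen only for illustration and not drawn from any reporting: a million harmless groups, 10 hostile ones, and an imagined pattern detector that is right 99 percent of the time in both directions. The function name and parameters are likewise assumptions.

```python
def flagged_precision(n_benign, n_hostile, true_positive_rate, false_positive_rate):
    """Return the share of flagged groups that are actually hostile.

    This is Bayes' rule applied to raw counts: hostile groups correctly
    flagged, divided by everything the detector flags.
    """
    true_hits = n_hostile * true_positive_rate
    false_alarms = n_benign * false_positive_rate
    return true_hits / (true_hits + false_alarms)


# Hypothetical inputs, for illustration only.
precision = flagged_precision(
    n_benign=1_000_000,        # assumed count of harmless groups
    n_hostile=10,              # assumed count of hostile groups
    true_positive_rate=0.99,   # assumed: detector catches 99% of hostile groups
    false_positive_rate=0.01,  # assumed: detector wrongly flags 1% of harmless groups
)
print(f"Share of flagged groups that are hostile: {precision:.2%}")
```

Under those assumed numbers, only about one flagged group in a thousand would be hostile, which is why even an accurate-sounding detector could bury analysts in false alarms.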

"Those patterns that we leave out there when we do things are going to look the same no matter what we're doing, and 99 percent of the time we're not going to be doing anything illegal," said Valdis Krebs, who consults with companies on the organizational insights they can glean from social network analysis. "There probably isn't a pattern that's different from doing something bad vs. doing something good or something neutral."

The Pentagon apparently isn't certain of that. It has funded research into a field known as "scalable social network analysis" that aims to identify whether terrorist plotting indeed leaves different organizational patterns from planning a bake sale. But Krebs doubts that enough terrorist cells have been mapped to provide a statistically significant sample of what those patterns are.

The main point of social network analysis is to produce a map of how people in an organization tend to interact.

By analyzing e-mail traffic or interviewing members of a group, for example, network analysts can reveal the strength of ties between people in an organization, and who the key hubs are.
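A minimal sketch of that idea, assuming the input is simply a list of (sender, recipient) pairs with made-up names: tie strength is taken here as the number of messages exchanged between two people, and a "hub" as whoever has the most distinct contacts. Real analyses use richer measures; the function names below are illustrative, not from any particular product.

```python
from collections import Counter, defaultdict


def tie_strengths(messages):
    """Count messages exchanged between each unordered pair of people."""
    return Counter(tuple(sorted(pair)) for pair in messages if pair[0] != pair[1])


def contact_degrees(messages):
    """Map each person to the number of distinct people they exchanged messages with."""
    neighbors = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            neighbors[sender].add(recipient)
            neighbors[recipient].add(sender)
    return {person: len(contacts) for person, contacts in neighbors.items()}


def top_hubs(messages, k=1):
    """Return the k people with the most distinct contacts (ties broken by name)."""
    degrees = contact_degrees(messages)
    return sorted(degrees, key=lambda person: (-degrees[person], person))[:k]


# Hypothetical e-mail log, for illustration only.
messages = [
    ("ana", "ben"), ("ben", "ana"), ("ana", "cy"),
    ("ana", "dee"), ("ben", "cy"), ("dee", "eli"),
]
hubs = top_hubs(messages)
strengths = tie_strengths(messages)
print(hubs, strengths[("ana", "ben")])
```

In this toy log "ana" comes out as the hub because she is in touch with three different people, and the ana-ben tie is the strongest because they exchanged two messages.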

Sometimes that can explain who really deserves a raise. Or companies can buy social networking software that trolls through e-mail to determine who has the best contacts for a particular customer call.

Of course, these kinds of analyses benefit tremendously from the fact that organizational boundaries are openly available. Analysts know a company exists. Its employees will fill out surveys to say whether that guy in marketing is a quiet leader or a quiet malingerer.

"It helps you understand trends, but I don't know of companies that are using social network analysis to discover bad guys without an entry point, just looking at the network structure," said Jeff Jonas, founder of Systems Research and Development, a company whose software analyzed records to tip Las Vegas casinos when people barred from gambling had associates working on staff. The company attracted investment from the CIA's venture unit even before Sept. 11 and last year was acquired by IBM Corp.

"If you're trying to root out a few bad apples using data-mining to look for anomalies, it's not clear to me that this would be productive without a starting point," said Jonas, who is now chief scientist in IBM's "entity analytics" unit.

To put Jonas' point in other words: Merely mapping whom Americans call likely wouldn't uncloak a terrorist cell. The necessary "entry point" would be a call between someone in the United States and a number already suspected of being affiliated with U.S. enemies.
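The entry-point approach is simple to sketch, assuming call records are just (caller, callee) pairs and the list of suspected numbers is already known. The phone numbers and function name below are invented for illustration; nothing here reflects how any agency's system actually works.

```python
def contacts_of_suspects(call_records, suspected_numbers):
    """Return numbers that called, or were called by, an already-suspected number."""
    suspected = set(suspected_numbers)
    contacts = set()
    for caller, callee in call_records:
        if caller in suspected and callee not in suspected:
            contacts.add(callee)
        elif callee in suspected and caller not in suspected:
            contacts.add(caller)
    return contacts


# Hypothetical records, using fictional 555 numbers.
call_records = [
    ("555-0101", "555-0199"),
    ("555-0102", "555-0103"),
    ("555-0199", "555-0104"),
]
linked = contacts_of_suspects(call_records, {"555-0199"})
print(sorted(linked))
```

Without the seed set of suspected numbers the function has nothing to match against, which is the point of the "entry point": the database narrows a search but does not start one.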

Jonas cites a chilling example of the process in action. In the '90s, reports emerged from Cali, Colombia, that a drug ring had identified and executed informants by getting Cali's phone records, then using a mainframe computer to compare the numbers dialed with those held by narcotics agents. It wouldn't have worked without the entry point of knowing which numbers belonged to the drug cops.

Following this chain of reasoning, another entry point could come if a group had been infiltrated somehow — whether through a spy or by a tap providing the content of phone calls or e-mails.

The New York Times reported in December that the NSA was indeed eavesdropping, without warrants, on communications between suspected al Qaeda members overseas and associates in the United States. A federal lawsuit in San Francisco claims the NSA gained access to AT&T Inc. communications traffic through a secret switching room.

But while Bush administration officials haven't discussed details of the NSA database described by USA Today, they have insisted that conversations themselves aren't being broadly monitored.

These sparse details leave questions as to the extent of government data-mining efforts. They could include cross-referencing the phone database with property, court and credit files sold by private database vendors. Or they could be part of grander sweeps like the one envisioned by the Pentagon's Total Information Awareness program. That program was technically shuttered in a privacy uproar but is generally assumed to be continuing in various forms.

Such uncertainty makes George Washington University law professor Daniel Solove disregard surveys such as the May 12 Washington Post-ABC News poll that found 63 percent of Americans supporting the call database as an anti-terrorism tactic.

"No one is asked in the polls, 'Would you approve of anything the government can do with your information? Is it OK that the government engages in various forms of snooping into your life that you would not be told or informed about?'" said Solove, author of "The Digital Person." "It's hard to really opine on something you don't know all the details of."

One reason that social network analysis has gained prominence in recent years is that Krebs and other researchers applied the method to publicly available information after the Sept. 11 attack to map how the hijackers operated. Mohammed Atta, for example, was clearly a hub of the network.

Krebs wonders whether those kinds of analyses raised expectations that the method could be insightful even in advance of an attack — that the right dots could be connected if investigators just could gather enough dots.

"The intelligence community is dying to find the silver bullet that will prevent the next terrorist attack. Unfortunately there's plenty of vendors that will lead them on and claim that they have found it," Krebs said. "I think it's alchemy of the 21st century to be able to predict the future, whether it's terrorism or the stock market or anything."