How a Facebook app scraped millions of people's personal data

Cambridge Analytica, the data firm hired by President Trump's 2016 campaign, was involved in the harvesting of personal data from over 50 million Facebook users, according to news outlets including The New York Times, The Guardian's Observer and Channel 4 News in the U.K. And it gathered that data in a matter of months.

The firm, which Facebook suspended over the weekend after alleging it lied about deleting illicit data, leaned on a third-party firm to create quizzes and surveys inside the social network designed to engage users, then used artificial intelligence systems to build "psychographic profiles" of voters, co-founder and head of product Matt Oczkowski said in a 2017 interview with CBS Interactive's TechRepublic.

"We most certainly do build [AI] tech in house," Oczkowski told TechRepublic. He said that the so-called psychographic profiles combined Facebook data with information gathered from other "top commercial data providers," and that the data included specific information about voter "demographics, geographics, purchase history, and interests."

When a Facebook user engaged with the survey, he said, the user implicitly gave the company access to a broad spectrum of personal data that Facebook provides to advertisers. The app reached further, giving the company access to profile data from people in the user's broader circle of Facebook friends who had, wittingly or otherwise, left their security settings relatively weak. Though the company likely did not use a traditional "hack" to access user information, it may have violated Facebook's terms of use by bulk harvesting and repurposing user data.

When asked how the company chose what to include in Facebook surveys, Oczkowski would say only that it collects data from "exclusive relationships" with data vendors "and through direct response projects."

Cambridge Analytica pushed back Saturday against the notion that it harvested any data, insisting it had contracted a firm in the U.K. to do research. Cambridge Analytica insisted that when it learned that it had been sold data it shouldn't have, the firm deleted the data.

"Inner Demons"

Getting users to interact with the survey was key to unlocking the data. A source close to the 2016 Trump campaign and familiar with Cambridge Analytica's tactics explained that the company's surveys used inflammatory language designed to provoke the "worst tendencies" in Facebook users, stirring up the kind of emotion that would prompt an interaction. Cambridge Analytica said Saturday it did not use any Facebook data for the campaign. Mr. Trump's 2016 digital guru, Brad Parscale, told "60 Minutes" last year that the campaign did not use Cambridge Analytica's controversial practice of psychographics.

Cambridge Analytica specializes in psychographics, which microtargets ads based on personality. In the words of whistleblower Christopher Wylie: "We exploited Facebook to harvest millions of profiles. And built models to exploit that and target their inner demons."

Reaching into users' friend networks allowed the company to scale up its data gathering rapidly. Wired reporter Issie Lapowsky told CBSN Saturday that by convincing 270,000 people to interact with the app, GSR (Global Science Research), the third-party firm behind it, managed to tap into the data of over 50 million users.
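A quick back-of-the-envelope calculation shows how large that multiplier is. The sketch below uses only the two figures reported here (270,000 app users and "over 50 million" profiles, treated as a lower bound); the function and variable names are illustrative, and the result is a simple average that ignores overlap between friend lists.

```python
# Figures reported in the article; the names below are illustrative only.
APP_USERS = 270_000             # people who interacted with the app
PROFILES_REACHED = 50_000_000   # "over 50 million", treated as a lower bound


def amplification_factor(profiles_reached: int, app_users: int) -> float:
    """Average number of profiles reached per direct app user."""
    if app_users <= 0:
        raise ValueError("app_users must be positive")
    return profiles_reached / app_users


factor = amplification_factor(PROFILES_REACHED, APP_USERS)
print(f"Roughly {factor:.0f} profiles reached per app user")
```

In other words, each person who took the survey exposed, on average, at least about 185 profiles.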

CEO Alexander Nix described the firm as "fundamentally politically agnostic and an apolitical organization," and even tweeted Saturday that President Obama's 2012 campaign "pioneered microtargeting."

But Oczkowski contradicted that assertion, and explained that "we served as the data agency of record, but our role quickly evolved as the cycle progressed."

The company developed "three pillars" of exclusive technology, Oczkowski said, including data science and analytics, digital marketing designed for persuasion and get-out-the-vote (GOTV), and polling research. "Having a large amount of control and input into each of these three areas allowed us to be extremely efficient and reactive," he said. "It also allowed us to easily integrate with the staff at Giles-Parscale and the RNC."

When asked specifically how the company leveraged Facebook for other data, Oczkowski explained that the company's data analytics prowess assisted the Trump campaign with everything from resource allocation and calculating the most efficient candidate travel stops to advertising, the language used in surrogate speeches, and even "personalizing messaging to the individual voter."

"Huge exaggeration"

Within GOP data circles, however, the company's extravagant claims about the effectiveness of Facebook data and artificial intelligence are heavily criticized. "I'm not saying they lied," said a former Trumpworld staffer shortly after the election, "but for Cambridge Analytica to run victory laps and claim they won the election for Trump is a huge exaggeration. Data can do a lot of things, but there's a limit to how effective it is. Cambridge Analytica's claims went far beyond that limit."

Cambridge Analytica is not alone in its use of Facebook and social media to target consumers with advertising. Machine learning is one of the fastest-growing sectors of business technology, and a number of firms have adopted similar tactics. Amazon, Google, Twitter, and a host of other cloud-based Internet giants are hiring AI experts and building algorithms that automate advertising.

It's no surprise that cloud giants are actively using big data analysis models to aid private industry in everything from cybersecurity to marketing. Oczkowski described the applications of actionable data scraped from Facebook users as potentially "endless."


Dan Patterson is a senior reporter with TechRepublic.com. Follow him on Twitter @danpatterson