The cars were supposed to collect the locations of Wi-Fi access points. But Google also recorded the street addresses and unique identifiers of computers and other devices using those wireless networks and then made the data publicly available through Google.com until a few weeks ago.
The French data protection authority, known as the Commission Nationale de l'Informatique et des Libertés (CNIL), recently contacted CNET and said its investigation confirmed that Street View cars collected these unique hardware IDs. In March, CNIL's probe resulted in a fine of 100,000 euros, about $143,000.
The confirmation comes as concerns about location privacy appear to be growing. Apple came under fire in April for recording logs of approximate location data on iPhones, and eventually released a fix. That controversy sparked a series of disclosures about other companies' location privacy practices, questions and complaints from congressmen, a pair of U.S. Senate hearings, and the now-inevitable lawsuits seeking class action status.
A previous CNET article, published June 15 and triggered by the research of security consultant Ashkan Soltani, was the first to report that Google made these unique hardware IDs--called MAC addresses--publicly available through a Web interface. Google curbed the practice about a week later.
But it was unclear at the time whether Google's location database included the hardware IDs of only access points and wireless routers, or those of client devices such as computers and mobile phones as well.
Anecdotal evidence suggested that client devices had been swept up. Alissa Cooper, chief computer scientist at the Center for Democracy and Technology and co-chair of an Internet Engineering Task Force on geolocation, said her 2009 home address was listed in Google's location database. Nick Doty, a lecturer at the University of California at Berkeley who co-teaches the Technology and Policy Lab, found that Google listed his former home in the Capitol Hill neighborhood in Seattle.
"It would be helpful to have some clarity about why and how (a hardware address) got in there so people can act accordingly," says Soltani, the security researcher.
Google declined repeated requests for comment for this article over a period of more than a week. In a statement last month, the search company said only that "we collect the publicly broadcast MAC addresses of Wi-Fi access points," which addressed only current and not past practices.
Google does not provide any method, sometimes called an opt-out mechanism, that would allow people who don't want their unique hardware IDs in the database to remove them. Instead of using Street View cars, Google now "crowdsources" its location database by using Android phones.
The most likely explanation of how the Wi-Fi devices were included is the simplest: Just as an accident of programming led to Street View cars collecting (in relatively few cases) the contents of unencrypted wireless communications, client hardware addresses were also vacuumed up. Then they were added to Google's geolocation database, which was publicly available without access restrictions until late June.
Wi-Fi-enabled devices, including PCs, iPhones, iPads, and Android phones, transmit a unique hardware identifier to anyone within a radius of approximately 100 to 200 feet. If someone captured that unique address, or already knew it because they had access to the device, Google's application programming interface, or API, revealed where that device was located. That can expose personal information, including home or work addresses or even the addresses of restaurants frequented.
To be sure, it's not always easy to learn a target's MAC address. It's generally not transmitted over the Internet. But anyone within Wi-Fi range can record it, and it's easy to narrow down which MAC addresses correspond to which manufacturer. Someone, such as a suspicious spouse, who can navigate to the About screen on an iPhone can obtain it that way too.
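The manufacturer link mentioned above comes from the structure of the address itself: the first three bytes of a standard MAC address form a prefix that the IEEE assigns to a vendor. A minimal Python sketch of that lookup follows. The function names and the two-entry registry are hypothetical placeholders for illustration, not real vendor assignments; an actual lookup would use the IEEE's published registry.

```python
def normalize_mac(mac: str) -> str:
    """Return a MAC address as six upper-case, colon-separated octets."""
    hex_digits = mac.replace(":", "").replace("-", "").replace(".", "").upper()
    if len(hex_digits) != 12 or any(c not in "0123456789ABCDEF" for c in hex_digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))


def oui_prefix(mac: str) -> str:
    """Return the first three octets, the vendor-assigned prefix."""
    return normalize_mac(mac)[:8]


# Hypothetical registry: these prefixes and vendor names are made up.
HYPOTHETICAL_REGISTRY = {
    "AA:BB:CC": "Example Vendor A",
    "DE:AD:BE": "Example Vendor B",
}


def guess_vendor(mac: str, registry: dict) -> str:
    """Look up the prefix in a caller-supplied registry."""
    return registry.get(oui_prefix(mac), "unknown")
```

Calling `guess_vendor("aa-bb-cc-00-11-22", HYPOTHETICAL_REGISTRY)` returns the placeholder name for that prefix, and any prefix missing from the table comes back as "unknown". The lookup narrows a device down to a manufacturer only; it says nothing about where the device is.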
Kim Cameron, Microsoft's chief identity architect until earlier this year, had long suspected that Street View cars vacuumed up the hardware addresses of devices using a Wi-Fi connection. In a June 2010 essay that analyzed an independent report (PDF) of Street View data collection, Cameron said he believed that Google recorded the locations and MAC addresses of far more than just fixed Wi-Fi access points.
Marc Rotenberg, head of the Electronic Privacy Information Center in Washington, D.C., said he has concerns about the legality of intercepting the hardware addresses of devices using Wi-Fi connections.
"The fact that other companies such as Skyhook may have engaged in this behavior, which seems to be Google's best defense, doesn't make it lawful," Rotenberg said. "What it does suggest is that there's more to the investigation of Street View."
In the U.S., the Federal Trade Commission ended its investigation of Street View's accidentally broad data collection last October without levying a fine.
Disclosure: McCullagh is married to a Google employee not involved in this issue.
Starting in April, CNET posed a series of questions to Google that have gone unanswered. A Google spokesman said that "we have talked extensively about geolocation privacy, before two senate panels and elsewhere," but declined to elaborate. Here's an abbreviated list of our questions:
- How was this database of client MAC addresses assembled -- Street View cars or Android phones, or both?
- Are you still adding to this database of client MAC addresses, or was its collection halted as part of the response to Street View concerns? If you did stop, when?
- Are you planning to continue to offer this geolocation data? If not, will you irrevocably delete it?
- For what countries have you collected geolocation data of client MAC addresses?
- How many client MAC addresses do you have in your geolocation database?
- The 802.11/WiFi header allows client devices to be differentiated from access points (APs are reported in the BSSID field of the WiFi header). Why did Google record and make available client device geolocation information rather than access point geolocation info?
- Can you tell me how many client (non-AP) MAC addresses you have in your database?
- It appears that iPhones in AP mode use addresses in the unassigned range of B2:27:E4, which is not the normal Wi-Fi client MAC address space. Why don't you filter that range?
- Is Street View the only mechanism through which client (non-AP) MAC addresses were added to your geolocation database?
- When did you cease adding client MAC addresses to your database?
- Why did you not scrub client MAC addresses, which aren't useful for geolocation, from your database after the CNIL's investigation?
- Have you permanently deleted client MAC addresses from your database after my initial CNET article appeared?
- Why doesn't Google randomize those two 16-byte strings (let's call them the device ID) on an hourly or daily basis? [Ed. Note: This is a reference to two 16-byte strings representing a device ID unique to each phone.]
- Given a street address or pair of GPS coordinates, is Google able to produce the complete location logs associated with that device ID, if legally required to do so?
- Given a device ID, is Google able to produce the complete location logs associated with it, if legally required to do so?
- Given a MAC address of an access point, is Google able to produce the device IDs and location data associated with it, if legally required to do so?
- How long are these location logs and device ID logs kept?
- If they are partially anonymized after a certain time, how is that done, and can those records be restored from a backup if Google is legally required to do so?
- How many law enforcement requests or forms of compulsory process have you received for access to any portion of this database?
- How are the device ID strings calculated?
- If Google knows that a Gmail user is connecting from a home network IP address every evening, it would be trivial to link that with an Android phone's device ID that also connects via that IP address. Does Google do that?
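The question above about the B2:27:E4 range turns on a detail of the 48-bit MAC address format. In the first byte, one bit marks an address as "locally administered" (chosen by software rather than assigned to a manufacturer) and another marks it as a group address. The Python sketch below shows how a filter of the kind the question describes could test those bits. The function names are hypothetical, and nothing here is meant to describe how Google's database actually works.

```python
def first_octet(mac: str) -> int:
    """Parse the first octet of a colon- or hyphen-separated MAC address."""
    return int(mac.replace("-", ":").split(":")[0], 16)


def is_locally_administered(mac: str) -> bool:
    """True if the locally-administered bit (0x02 of the first octet) is set."""
    return bool(first_octet(mac) & 0x02)


def is_group_address(mac: str) -> bool:
    """True if the group (multicast) bit (0x01 of the first octet) is set."""
    return bool(first_octet(mac) & 0x01)


def looks_vendor_assigned(mac: str) -> bool:
    """Hypothetical filter: keep only individual, globally administered addresses."""
    return not is_locally_administered(mac) and not is_group_address(mac)
```

For an address beginning B2:27:E4, the first byte is 0xB2, which has the 0x02 bit set, so `is_locally_administered` returns True and `looks_vendor_assigned` returns False. A check like this flags addresses outside the vendor-assigned space; it cannot by itself tell a client device from an access point.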