Microsoft Data Collection Proposal: Don't Trust Us? Trust Someone Else!
Privacy has become a major concern for many individuals and businesses -- to the aggravation of many software companies. Vendors of operating systems, antivirus products, and security tools want data from users' machines to better understand trends and patterns. But many people dislike being personally identified, so they won't let software phone home. A Microsoft (MSFT) patent application filed in September 2008 and made public last week, "Collecting Anonymous and Traceable Telemetry," suggests that a third-party certificate mechanism could assure user anonymity during data collection:
A method implemented at least in part by a computer, the method comprising: establishing a trust relationship with an escrow certificate issuer operable to issue certificates for use in providing telemetry data; receiving a certificate from a telemetry source, the certificate including information usable to verify that the certificate is valid but not usable to determine an entity that controls the telemetry source without additional data not included in the certificate; and determining whether the certificate is valid.

According to the application, certificates would ensure user privacy while giving vendors access to the information they need. A third party could protect user identities in "software crash dump collection, software quality metrics collection, virus and attack detection statistics, reputation telemetry that includes URLs and IP addresses associated with attackers, and the like." (Microsoft uses the term telemetry for remote data acquisition.)
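To make the idea concrete, here is a minimal sketch of an escrow issuer -- my own illustration, not the patent's actual protocol. The `EscrowIssuer` class, its key handling, and the identities are all hypothetical; the certificate carries only a random token and a signature, while the token-to-identity mapping (the "additional data not included in the certificate") stays with the issuer. A real system would use public-key signatures so vendors could verify certificates locally; the HMAC here is just a stand-in.

```python
import hashlib
import hmac
import secrets


class EscrowIssuer:
    """Hypothetical escrow certificate issuer (illustration only)."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # signing key; a real design would use a key pair
        self._escrow = {}                    # token -> identity, held only by the issuer

    def issue(self, identity):
        token = secrets.token_hex(16)        # random ID; carries no identity information
        self._escrow[token] = identity       # the escrow record never leaves the issuer
        sig = hmac.new(self._key, token.encode(), hashlib.sha256).hexdigest()
        return {"token": token, "sig": sig}  # the certificate the telemetry source presents

    def is_valid(self, cert):
        expected = hmac.new(self._key, cert["token"].encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, cert["sig"])


issuer = EscrowIssuer()
cert = issuer.issue("alice@example.com")
assert issuer.is_valid(cert)        # the vendor can confirm the certificate is genuine...
assert "alice" not in str(cert)     # ...but the certificate itself names no one
```

The privacy claim rests entirely on the issuer keeping `_escrow` to itself -- which is exactly the trust question the rest of this piece raises.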
It's an interesting approach, but I see four problems. The first is trust: if people don't trust the vendors, why would they necessarily trust the certificate provider? Who would issue the certificates? Microsoft itself? If users don't trust the certificate issuer, they still won't let information leave their computers.
Second, there is a strong potential for anonymity to be compromised. Only a few years ago a researcher showed that the combination of gender, zip code, and birth date uniquely identified 87 percent of the nation's populace. Given the amount of data available on purchases, a specific collection of software and hardware configurations married with an IP address or even an email address might let the certificate issuer identify the user by name, even if a specific vendor couldn't. Are users simply supposed to trust that such re-identification is impossible until proven otherwise? (A side note: notice how many companies ask for "only" gender, zip code, and birth date? Ever wonder why?)
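A back-of-envelope calculation shows why those three fields are so identifying. The figures below are rough assumptions, not census data, but the conclusion is robust: there are far more combinations of gender, zip code, and birth date than there are people, so the average combination matches fewer than one person.

```python
# Rough anonymity-set arithmetic for {gender, zip code, birth date}.
population = 300_000_000   # approximate U.S. population (assumption)
genders = 2
zip_codes = 42_000         # roughly the number of U.S. zip codes (assumption)
birth_dates = 100 * 365    # distinct birth dates across ~100 years of ages

combinations = genders * zip_codes * birth_dates
avg_bucket = population / combinations
print(f"{combinations:,} combinations, ~{avg_bucket:.2f} people per combination")
# → 3,066,000,000 combinations, ~0.10 people per combination
```

With an average of about a tenth of a person per combination, most combinations that occur at all occur exactly once -- which is what makes the triple a near-unique identifier.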
Third -- with a nod to the horse led to water -- even if this did guarantee user anonymity, why should anyone bother? There is inherently nothing in it for the user, and getting something in return would probably require some form of self-identification to claim the reward, negating the privacy guarantee.
Finally, Microsoft could find it difficult to obtain a patent on the application. Identity verification and security mechanisms often use unique certificates obtained by a user from a trusted issuer. The only difference here is that the vendor doesn't get to associate users with certificates. Seems like it should fail the obviousness test.
Images: RGBStock.com users arinas74 and lusi, site standard license.