How to Protect Privacy And Grow the Genomics Industry

Last Updated Apr 14, 2010 8:26 PM EDT

Privacy is one of the key issues looming over the nascent genomics industry. Fortunately, a new framework proposes a reasonable way to balance the need of businesses to share data with consumers' desire to control how their genetic information is used.

The ability of companies to share patients' genome data -- which often includes biographical information, family medical history, and current diagnoses -- is vital to the development of the next generation of personalized medicine products and services. But the big fear is that insurance companies might use such data to raise premiums. New research published this week in the Proceedings of the National Academy of Sciences proposes an immediate and practical privacy solution.

According to Daniel Vorhaus of the Genomics Law Report, "when it comes to associating genes with medical conditions, researchers rely on International Classification of Disease (ICD) codes to categorize individual patients by disease type and search for shared genetic variations that might play a causal role. This generally means removing any links to identifying biographical information." Even so, the biographical information often gets out. That's a problem, because a new survey covered in the Wall Street Journal found that 68% of respondents reported some degree of worry about what happens to their personal information once it's stored in a doctor's computer. Fifteen percent of the 1,849 adults surveyed said they'd conceal information from a physician if "the doctor had an electronic medical record system" that could share that info with other groups. Another 33% would "consider hiding information."

The new research paper, "Anonymization of electronic medical records for validating genome-wide association studies," proposes an interesting technological solution. The report explains:

"Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt's University Medical Center verify that our approach generates data that eliminate the threat of individual re-identification, while supporting GWAS validation and clinical case analysis tasks."

Basically, the program disguises the parts of a person's medical history that are not relevant to a geneticist's research question, using an algorithm that combs through health records to retrieve only the data that bears on that question. For example, if a scientist wants to examine links between certain genes and a person's history of heart disease, only the parts of the medical record that pertain to heart disease remain intact. The algorithm changes the medical codes for other diseases, as well as identifying biographical information.
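
To make the idea concrete, here is a rough Python sketch of that kind of filtering. It is a simplification for illustration, not the method from the paper: the record fields, the `disguise` function, and the choice to coarsen unrelated codes to their three-character ICD-9 category are all assumptions. In particular, it skips the step the paper describes of checking that the remaining features can no longer be linked to a small number of patients.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    # Hypothetical fields, chosen only for this illustration.
    name: str
    birth_date: str
    genome_id: str
    icd_codes: list[str] = field(default_factory=list)


def generalize(code: str) -> str:
    """Coarsen an ICD-9-style code to its category, e.g. '250.01' -> '250'."""
    return code.split(".")[0]


def disguise(record: PatientRecord, study_categories: set[str]) -> dict:
    """Keep study-relevant codes intact, coarsen the rest, drop biographical fields."""
    study_codes = []
    other_categories = set()
    for code in record.icd_codes:
        if generalize(code) in study_categories:
            study_codes.append(code)                 # relevant: left untouched
        else:
            other_categories.add(generalize(code))   # unrelated: detail removed
    # Name and birth date are deliberately left out of the released record.
    return {
        "genome_id": record.genome_id,
        "study_codes": study_codes,
        "other_codes": sorted(other_categories),
    }


# Example: a heart disease study that only needs category 410
# (acute myocardial infarction in ICD-9).
record = PatientRecord(
    name="Jane Doe",
    birth_date="1960-02-03",
    genome_id="G-001",
    icd_codes=["410.11", "250.01", "250.02", "493.90"],
)
result = disguise(record, {"410"})
```

In this example the heart attack code survives in full, while the diabetes and asthma entries are reduced to broad categories and the name and birth date never leave the source system. A real system would also have to confirm that the released combination of codes is shared by enough patients to prevent linking.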

While there is no foolproof way to protect consumer privacy, this solution strikes a balance that's good for business, because it gives companies access to vital and useful data. It also addresses consumers' concern that they could be linked to compromising genetic test results. Now, which IT vendor will put theory into practice by actually building such a system?
