TransCelerate’s Data Transparency Initiative

August 6, 2015
Moe Alsumidaie

Applied Clinical Trials

Ben Rotz, Director of Medical Transparency at Eli Lilly & Company, discusses TransCelerate’s Data Transparency Initiative and its position paper on patient data de-identification and anonymization.

TransCelerate recently developed a position paper, Data De-Identification and Anonymization of Individual Patient Data in Clinical Studies, to establish consistency in how clinical trial data are structured for researchers and the scientific community. In this interview, Ben Rotz, Director of Medical Transparency at Eli Lilly & Company, discusses TransCelerate’s Data Transparency Initiative and the position paper.

 

Moe Alsumidaie: Why did TransCelerate develop the Data De-Identification and Anonymization of Individual Patient Data in Clinical Studies Position Paper?

Ben Rotz: We recognize, as an industry, that there is an unmet need for a consistent way to ensure that privacy is protected not only in clinical trial and individual patient-level data, but also in clinical documents, such as clinical study reports (CSRs), that are shared publicly. We created a model approach to data protection through TransCelerate to drive efficiencies within member companies and for clinical researchers, and ultimately to increase transparency for patients. The Clinical Data Transparency Initiative aims to develop a consistent approach to redacting private information in CSRs and to de-identifying or anonymizing patient-level data for the broader healthcare community.

Discussion around these topics could be attributed to […]. We really want researchers to have a consistent way to look at anonymized data from numerous companies, because this approach allows them to take data from one or more companies and perform additional analyses. We want researchers to get the most out of clinical trial data, as patients, investigators, and researchers have put so much of their lives and effort into these studies.

MA: The Data De-Identification Position Paper indicates that site, investigator and patient records must be associated with a randomly generated number (and in some instances are irreversibly destroyed), and that PII and PHI must be redacted.  Can you describe the process and what technologies will be commonly used to efficiently randomize, redact and destroy information?

BR: Some companies use macros that may tie in with statistical software; these macros generate a new random number that is assigned to every patient record across the entire dataset. Throughout this process, the data are kept separate from the original datasets. After all the appropriate quality checks have been performed, the key code is destroyed for anonymized data or stored in a secure environment for de-identified data. The same types of macros have offset or relative-day mechanisms for dates; some datasets, depending on the company, are already set up in relative days, which makes processing with the macros easier and more consistent. Other items, such as country or age, are de-identified or anonymized as needed.
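The process described above can be illustrated with a minimal Python sketch. It is not taken from the position paper or from any company's macros; the helper names (`build_key`, `relative_day`, `pseudonymize`), the record fields, and the choice of each subject's earliest visit as the reference date are all assumptions made for illustration. The sketch assigns each subject a new random identifier, replaces calendar dates with relative study days, and either discards the key (anonymized data) or returns it for secure storage (de-identified data).

```python
import secrets
from datetime import date


def build_key(subject_ids):
    """Map each original subject ID to a new random ID (hypothetical helper)."""
    key, used = {}, set()
    for sid in sorted(set(subject_ids)):
        new_id = secrets.token_hex(8)
        while new_id in used:  # regenerate on the unlikely event of a collision
            new_id = secrets.token_hex(8)
        used.add(new_id)
        key[sid] = new_id
    return key


def relative_day(event_date, reference_date):
    """Study day relative to a reference date, using a convention with no day 0."""
    delta = (event_date - reference_date).days
    return delta + 1 if delta >= 0 else delta


def pseudonymize(records, keep_key):
    """Return (new_records, key); key is None when keep_key is False."""
    key = build_key(r["subject_id"] for r in records)

    # Assumption: the reference date is each subject's earliest visit.
    reference = {}
    for r in records:
        sid = r["subject_id"]
        reference[sid] = min(reference.get(sid, r["visit_date"]), r["visit_date"])

    # The output is a new list, kept separate from the original records.
    out = [
        {
            "subject_id": key[r["subject_id"]],
            "study_day": relative_day(r["visit_date"], reference[r["subject_id"]]),
            "result": r["result"],
        }
        for r in records
    ]
    # Anonymized: the key is dropped. De-identified: the caller stores it securely.
    return out, (key if keep_key else None)


# Illustrative records only; the values are invented for this example.
records = [
    {"subject_id": "1001", "visit_date": date(2015, 1, 5), "result": 7.1},
    {"subject_id": "1001", "visit_date": date(2015, 1, 19), "result": 6.8},
    {"subject_id": "1002", "visit_date": date(2015, 2, 2), "result": 7.4},
]
shared, key = pseudonymize(records, keep_key=False)
```

Calling `pseudonymize(records, keep_key=True)` instead returns the mapping so that it could be held in a secure environment. A real workflow would also run quality checks before release and handle further fields such as country or age.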

With regard to other technologies, we haven’t seen many that conduct appropriate quality checks as part of the anonymization process; in evaluating some current technologies, we feel that these systems are not validated, are inconsistent, and pose risks of private information breaches. I think that as more and more companies convert to the Study Data Tabulation Model (SDTM) for collecting clinical trial data, there may be an opportunity for more consistency and efficiency, especially if CDISC data standards are utilized.

MA: What types of enterprises in the biopharmaceutical industry does this position paper aim to address?  Who will be utilizing this position paper?

BR: Within a biopharmaceutical company, the main users will be data management, individuals associated with statistical software coding, and people responsible for privacy protection categorization. We anticipate that academic researchers will also use this position paper; the Institute of Medicine’s (IOM) Report on Sharing Clinical Trial Data is agnostic to the sponsor of a clinical trial, so whether research is done in an academic, industry, or government setting, IOM’s recommendation is that clinical trial data should be made more available for additional research.

As you can imagine, researchers in academic settings may not have the resources available to do this type of work, so we, as an industry, can support academia by providing a framework to assist academic researchers with making their data available for sharing.

MA: What outcomes can the industry gain from making clinical trial data available?

BR: It’s not necessarily about what the industry gains; it’s more about what the scientific community gains. We hope that clinical trial data can be utilized to its fullest extent. Additionally, we hope that new information will become available that further defines the best way to treat a patient with any sort of disease. We, as an industry, and as people concerned about the healthcare of individuals, want patients to get the treatment option that demonstrates the greatest efficacy and the fewest side effects.

MA: What is the benefit of partaking in this initiative?  How can companies protect their intellectual property (IP) with clinical trial data sharing?

BR: From an industry perspective, the need to share clinical trial data is here to stay and has been seen through various programs. For example, the European Medicines Agency announced almost three years ago that it will proactively publish clinical trial data and enable access to full data sets by interested parties. Overall, the biopharmaceutical industry accepts and agrees that sharing data is the right thing to do, based on patient feedback and our desire to give back to the scientific community and advance collaboratively.

For enterprises not already sharing data, why would they start from scratch? Why would you not want to take something that’s been put together by many different companies, utilize it and have consistency in analyzing the anonymized data? 

If you are a member of BIO, PhRMA, or EFPIA, you are committed to data sharing because every member of these organizations has committed to Responsible Sharing of Clinical Trial Data; TransCelerate’s position paper is one aspect of that, a mechanism for it to occur moving forward.

Sharing clinical trial data occurs after you’ve had approval to market medical products from both U.S. and EU regulatory authorities. Sponsor companies are individually responsible for protecting their own IP during the approval process, as well as for ensuring that they have the appropriate data protection in place. From my perspective, clinical data sharing is not about protecting IP; it is about increasing transparency to benefit research, and ultimately patients.

From a processing standpoint, the most efficient way to anonymize data may be at database lock; as you lock the data, you can create an anonymized dataset at the same time. Another method that many companies are adopting is to go back and anonymize older datasets.

MA: What motivates and drives you to do this sort of work?

BR: It is my philosophy that patients who have dedicated effort, time, and parts of their lives to the clinical trials process should see the fullest possible outcomes from those contributions. Making the most of those contributions through data sharing is the right thing to do in recognition of everything that these patients have put into clinical trials.