Key takeaways
- Tokenization enhances real-world evidence integration: Tokenization enables researchers to securely link clinical trial data with real-world data sources such as EHRs, claims data, and registries, providing a more comprehensive view of patient journeys without compromising privacy.
- Implementation requires careful planning and compliance: Successful tokenization demands robust governance, informed patient consent, interoperability across data platforms, and strict adherence to global data privacy regulations such as HIPAA and GDPR.
- Potential to transform clinical research efficiency and quality: Tokenization can simplify long-term follow-up, enable the creation of external control arms, improve real-world outcomes analysis, and support decentralized trial models, ultimately accelerating trial timelines and improving the relevance of study findings.
In an era of digital health, the intersection of data privacy and innovation has become increasingly relevant. More and more, clinical researchers are looking to real-world evidence (RWE) for a more comprehensive view of the patient journey and a better understanding of patient outcomes, treatment patterns, and the long-term impact of therapies.
Since the launch of the FDA Real-World Evidence Program (a response to the 21st Century Cures Act mandate), the use of RWE to support safety and efficacy evaluations in regulatory drug applications has grown globally.
Of course, the introduction of RWE into clinical research necessitates strict adherence to privacy laws in order to protect sensitive patient information. Tokenization, a process that replaces patient identifiers with unique, encrypted codes, has become vital for linking data from different sources while maintaining privacy.
“Tokenization is a foundational tool we can leverage to unlock the full value of real-world data (RWD),” said Mike D’Ambrosio, senior vice president and global head of real-world research at Parexel.
Tokenization allows researchers to connect clinical trial data with external sources such as electronic health records (EHRs) and to tap into their trove of potential insights.
Implementing tokenization effectively, however, entails several challenges and considerations, including managing consent, ensuring regulatory compliance, and maintaining data security and interoperability.
How tokenization works
Tokenization transforms personally identifiable information (PII), both direct identifiers (such as name, Social Security number, address, and birthdate) and indirect identifiers (such as race, gender, age, and location), into a unique identifier. These “token” identifiers have no meaningful value of their own; they de-identify PII in a uniform fashion, ensuring it remains inaccessible to unauthorized parties. Because a token is generated deterministically from available PII, such as patient name, birthdate, and address, the same patient yields the same token across systems, which enables the linkage of disparate de-identified datasets.
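As a simplified illustration, the sketch below derives a deterministic token by applying a keyed hash (HMAC-SHA256) to normalized PII fields. This is a conceptual sketch only: commercial tokenization engines use their own certified, proprietary schemes, and the key, field choices, and function names here are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; real tokenization services rely on managed
# keys and certified de-identification, never a hard-coded value.
SECRET_KEY = b"illustrative-key-do-not-use"

def normalize(value: str) -> str:
    """Normalize a PII field so the same person always hashes the same way."""
    return " ".join(value.strip().lower().split())

def generate_token(first_name: str, last_name: str, birthdate: str) -> str:
    """Derive a deterministic, irreversible token from selected PII fields."""
    pii = "|".join(normalize(v) for v in (first_name, last_name, birthdate))
    return hmac.new(SECRET_KEY, pii.encode("utf-8"), hashlib.sha256).hexdigest()

# Identical inputs map to identical tokens even with formatting noise,
# so datasets tokenized under the same key can later be linked.
assert generate_token("Jane", "Doe", "1980-04-12") == \
       generate_token("  JANE ", "doe", "1980-04-12")
```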
By matching tokens, researchers can link data sources such as clinical trial data, EHRs, claims data, and registries, creating a more comprehensive view of patient journeys without ever exposing the patient's PII.
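Continuing the sketch above, linkage then reduces to intersecting token sets: each party holds only de-identified records keyed by token, and the tokens and fields below are illustrative placeholders.

```python
# Each source holds only de-identified records keyed by token.
trial_data = {
    "tok_001": {"arm": "treatment", "responder": True},
    "tok_002": {"arm": "control", "responder": False},
}
claims_data = {
    "tok_001": {"er_visits_12mo": 1},
    "tok_003": {"er_visits_12mo": 4},
}

# Intersecting the token sets joins the two sources patient by patient,
# without either side ever exchanging raw identifiers.
linked = {
    tok: {**trial_data[tok], **claims_data[tok]}
    for tok in trial_data.keys() & claims_data.keys()
}
print(linked)
# {'tok_001': {'arm': 'treatment', 'responder': True, 'er_visits_12mo': 1}}
```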
Ryan Moog, head of solutions at Datavant, works to connect clinical data and RWD for life sciences companies. “By replacing sensitive personally identifiable information (PII) and protected health information (PHI) with encrypted tokens, tokenization allows organizations to unlock the full potential of their data while preserving patient privacy,” said Moog.
Unlocking insights, streamlining research
The potential benefits of tokenization are significant. By linking data from different sources, tokenization provides a more complete picture of patient outcomes and treatment effectiveness, which can improve study designs, speed drug development, and shorten the time-to-evidence cycle for payer discussions.
- Simplify long-term follow-up: Tokenization enables researchers to track patients' health information over time without requiring constant site visits, reducing the burden on both patients and sites, and providing valuable insights into long-term safety and efficacy.
- Optimize study design: Linked RWD can inform and optimize future clinical trial design and site selection by providing insights into patient cohorts, recruitment strategies, and relevant endpoints.
- Generate external control arms (ECAs): In rare diseases and oncology, tokenization can help create external control arms from RWD, potentially reducing trial size and accelerating timelines.
- Improve understanding of real-world outcomes: Researchers can analyze how treatments perform in real-world clinical practice, going beyond the controlled environment of a clinical trial.
“There are multiple advantages to tokenizing trial participants, from gaining a better understanding of how clinical interactions outside the trial impact outcomes, to capturing a therapy’s long-term value, to accelerating health economics and outcomes research (HEOR) studies,” said Joe Walsh, vice president at Komodo Health.
Tokenization can also contribute to improved interpretability and generalizability in clinical research.
For example, tokenization can help remediate missing data and losses to follow-up, extend follow-up beyond trial closeout, and characterize the applicability of trial results to under-represented groups. These enriched datasets provide more context for comorbidities, adherence, and real-world outcomes.
In practice, tokenization and the integration of diverse datasets have been shown to significantly improve understanding of treatment pathways and patient outcomes.
“While not a silver bullet, tokenization can provide a safety net to either back-fill missing follow-up information or indeed enhance or enrich follow-up data, leading to improved longitudinal understanding of safety, efficacy, and patient benefit,” said D’Ambrosio of Parexel.
Considerations and challenges
The successful integration of RWD into clinical research requires thoughtful consideration—as well as technical expertise and experience.
A fundamental requirement for navigating this complex landscape is consent. Researchers must secure comprehensive stakeholder support through transparency about the study’s objectives and benefits, ensure that everyone understands the value of linking data, and integrate data-linking considerations into the study design from the outset.
Compliance is a second factor that must be prioritized in any tokenization effort. Organizations must comply with data privacy regulations such as HIPAA in the US and GDPR in Europe, as well as guidance from the FDA. Naturally, this entails a thorough and current understanding of all relevant regulatory requirements.
Successful tokenization also hinges on technical expertise and data governance. Robust data governance frameworks are necessary to ensure the security, privacy, and integrity of tokenized data.
A related consideration is interoperability and suitability. Data sources must be compatible for tokenization to be effective, and researchers should assess the relevance and reliability of RWD sources to confirm they are fit for purpose, i.e., suitable for the study’s objectives.
Looking to the future
Integrating clinical trial data and RWD through tokenization is a significant advance in clinical research. By maintaining patient privacy while linking diverse datasets, researchers can gain invaluable insights into patient journeys, treatment outcomes, and healthcare utilization.
"The insights this enhanced data provides are helping the industry reach a synergy between R&D and commercial throughout the development lifecycle,” said Robert Zambon, vice president, product, at ConcertAI. “It’s accelerating timelines, laying the groundwork for new assets, and bringing much needed therapies to market while providing unique value to both sponsors and patients."
The convergence of tokenization with advanced data analytics, artificial intelligence, and cloud computing has transformative potential across a range of areas, including:
- Precision medicine: By securely linking tokenized genomic data with clinical outcomes, researchers can develop personalized therapies and treatment pathways.
- Public health surveillance: Tokenized datasets can support real-time disease tracking and outbreak prediction without violating patient privacy.
- Post-market surveillance: RWD enriched through tokenization can help monitor drug safety and effectiveness after approval, identifying adverse events in real-world settings.
- Decentralized clinical trials: Tokenization facilitates secure remote participation by allowing patient data to be collected, linked, and analyzed without geographic or institutional constraints.
With all the potential benefits of tokenization, how should research organizations proceed?
Mariah Baltezegar, vice president and general manager, Thermo Fisher Scientific, advises implementing tokenization at the beginning of studies to realize cost and time savings.
“It’s critical to tokenize early—even if you’re unsure whether you’ll link with future data,” said Baltezegar. “Building that foundation upfront creates the potential for future connections without placing additional burdens on sites or patients later. Tokenize early, tokenize often. Once a trial concludes, going back to reconsent and collect the necessary PII is far more complex and disruptive.”
In a digital age that prioritizes patient privacy, tokenization makes innovative clinical research possible. By decoupling personal identifiers from health data, tokenization can be an essential tool to help researchers satisfy regulatory requirements while opening the door to powerful RWE-based insight. As healthcare systems increasingly adopt RWD and RWE approaches, tokenization will serve as both a shield for privacy and a bridge to innovation.
When implemented with rigor, transparency, and interoperability in mind, tokenization has the potential to revolutionize how we conduct research, deliver care, and improve public health.
Brad Cunningham, head of strategic partnerships at Verana Health