
Using De-Identified Data to Rethink Trial Design and Site Strategy
Explore how large-scale, de-identified real-world datasets enable more representative trial design, improve site selection, and support patient identification beyond the limits of traditional clinical study populations.
In a recent video interview with Applied Clinical Trials, Jen Lamppa, vice president of commercial strategy at Inovalon, discussed the clinical operations impact of the FDA’s evolving guidance on real-world evidence submissions using de-identified patient data. Lamppa explained the critical distinction between pseudonymized and anonymized data and outlined how large, de-identified datasets are reshaping trial design, site strategy, and patient selection. She described where real-world evidence most effectively complements traditional trials, particularly in observational and post-market settings, while highlighting the operational, data governance, and methodological hurdles that still limit broader regulatory adoption. Lamppa concluded by explaining how real-world evidence is poised to augment, rather than replace, traditional trials by enabling smarter, more efficient, and more representative evidence generation.
Editor's note: This transcript is a lightly edited rendering of the original audio/video content. It may contain errors, informal language, or omissions as spoken in the original recording.
ACT: How might large, de-identified datasets change trial design, site strategy, or patient selection?
Lamppa: What’s great about large de-identified datasets is they allow you to see beyond the subset of patients that you would have historically recruited and studied for clinical research.
When you think about a traditional clinical study, you decide on the patient population you want to look at. There are usually inclusion and exclusion criteria applied with varying degrees of rigor, depending on the intended purpose of the study. Then you select specific research sites, onboard patients, screen them, enroll them in the study, and follow them through. That process usually results in a very powerful dataset, but on a fairly narrow subset of a total patient population.
De-identified real-world datasets are kind of the opposite end of that spectrum. You get a very broad view and a broad capture of a patient population. That’s obviously very powerful for multiple reasons.
Sponsors and CROs have already developed competency and are leveraging tools that allow them to inform more realistic and representative inclusion and exclusion criteria, optimize protocols based on real-world treatment patterns, identify sites with enrollable patients, and understand underlying disease demographics and outcomes to ensure studies deliver evidence that represents patients who need treatment.
All of those use cases are already in play. The area that I’m most excited about—and that many of us are most excited about—is the application of these de-identified datasets to the actual evidence package.



