AI and ML are Transforming Clinical Research Practice

Three case studies showcase the ability of AI and ML to overcome challenges with data, resourcing, and more.

Clinical trials serve a key function in drug development, ensuring the safety and efficacy of new treatments. In pursuit of faster discovery to inform evidence-based care, clinical researchers are exploring new digital solutions that provide reliable insights through real-world data (RWD). These insights can inform and facilitate the design of clinical studies with input from provider collaborators, straight from the source.1

While AI and ML have broad-ranging applications, from a practical and operational perspective several specific methods have emerged as opportunities to improve clinical trial efficiency and success:

  • Determination of site-study feasibility and/or understanding a patient population
  • Patient matching for in-flight studies
  • Rescue of failing studies by patient finding

The use of AI and ML to accelerate medical research is well supported and is now itself evidence based, as several successful real-world case studies have emerged. This article looks at three such cases in which traditional process concerns in clinical research practice have been addressed. Despite a range of issues, including resourcing, systems compatibility, and incomplete and unstructured data, AI and ML improved on currently available methods.

Case study 1: Feasibility

Sites, sponsors and CROs alike struggle with consistency and communication in their approach to assessing trial feasibility, a key step in site selection and study acceptance. There are numerous platforms and methods, all of which are resource intensive and inefficient. Both parties need access to multiple, decentralized systems, which leads to incongruity in the data. Without system standardization, the same questions, often reflecting outdated understandings of industry regulations, are posed to centers repeatedly, complicating an already time-consuming process.1

The typical site feasibility assessment questionnaire can take three hours of touch time and over a week of turnaround time to complete and return. This process is limited by a lack of consistent resources and patient population estimations, which can be inaccurate or out of date. Additionally, there is a lack of transparency in sponsor processes, leading to overpromising and underdelivering and harming site-sponsor relationships.


In one instance, a US regional, community-based practice consisting of 12 locations and 35 providers, with a catchment area covering over 80% of its state population, had a consistent stream of clinical research study opportunities. The practice partnered with a vendor providing technology that allowed it to conduct study selection more efficiently and accurately. The vendor evaluated these centers to accelerate the selection of clinical trials that could be made available as a care option. This resulted in the activation of trials that allowed clinicians to offer appropriate clinical studies as part of the palette of treatment options.

An initial-state assessment revealed a largely manual process, quantified as requiring an average of 2.5 hours of manager time to collate data and complete each feasibility survey. Given the volume of trials offered to this site, this work monopolized a significant portion of an FTE.

The solution was to deploy a combination of the vendor’s software with supplemental concierge support services. Using the vendor’s proprietary patient-finding tool, the software cut the data collection time down to minutes, significantly reducing the time allocated to completing surveys. Evaluating the time spent after applying this AI/ML, the site recognized a 90% time savings on feasibility survey completions. The technology permitted a more accurate and efficient evaluation of the current patient population, allowing quicker yes/no decisions on acceptance of a study. The improved accuracy also served to improve the site’s bottom line, simply by selecting studies that were truly feasible for enrollment. By eliminating cost-intensive studies with no realistic patient population to accrue, the site could reallocate resources to additional clinical research operational duties. This change in work assignment gave staff more time for patient interaction, which had a small but real retention benefit during a time of industry-wide staffing struggles. Lastly, implementation of this technology can help to address the loss of trust in relationships with sponsors, which will facilitate future growth in a trial portfolio. In this instance, the center established a favorable reputation with the trial sponsor as a reliable, efficient and active enroller, which is critical given the data-driven site selection process used by CROs and sponsors.

Case study 2: Optimization of enrollment efforts

Over 400,000 active studies are currently being conducted across many therapeutic areas, and patients should be able to find a trial for nearly any condition or diagnosis.2 However, sites report that patient recruitment represents the lion’s share of their administrative and operational burden.3 Further, the discovery of molecular and genomic elements in disease diagnosis and treatment has only increased the complexity of trial conduct.

In this next case, a Southwestern US-based oncology clinic was experiencing enrollment concerns for a study that had been active for five months and involved a particular segment of breast cancer patients for whom treatment options had failed. The center was falling behind its accrual goals and simply could not find the right patients to whom it could offer this treatment trial. A tech-enabled solutions provider partnered with the center to reveal patients who met study eligibility criteria and were already seeking treatment within that practice’s physician network but who had not been identified using traditional screening methods.

The AI-enhanced platform ingested and analyzed clinically relevant data for local patients, including unstructured clinical data within the electronic medical record (EMR), and combined this deeper clinical data with the eligibility criteria to streamline the patient-matching process. Over the next three months, the site identified 10 eligible patients for the study, increasing its enrollment rate by over 200%. This increase was recognized despite the trial being conducted during the initial months of the COVID-19 pandemic, when many studies were closing due to slow or absent accrual. As the tool was adopted throughout the site, the success continued, with the additive machine learning identifying an additional 20 participants who enrolled over the following year. Overall, the site exceeded patient enrollment goals while providing this study as a treatment option to additional breast cancer patients.
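To make the matching step concrete, here is a minimal sketch of how structured EMR fields and unstructured notes might be combined against eligibility criteria. It is not the vendor's method, which the case study does not describe. The `Patient` fields, the criteria, the sample records and the keyword pattern are all hypothetical, and a simple regular expression stands in for whatever language processing the real platform applies to free text.

```python
import re
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    diagnosis: str  # structured EMR field (hypothetical)
    note: str       # unstructured free-text clinical note (hypothetical)


# Hypothetical criteria, invented for illustration only.
REQUIRED_DIAGNOSIS = "breast cancer"
# Crude stand-in for NLP: evidence in the free text that prior treatment failed.
FAILURE_PATTERN = re.compile(
    r"\b(progress(ed|ion)|treatment failure)\b", re.IGNORECASE
)


def is_candidate(patient: Patient) -> bool:
    """True when the structured and unstructured data both match the criteria."""
    has_diagnosis = patient.diagnosis.lower() == REQUIRED_DIAGNOSIS
    note_shows_failure = bool(FAILURE_PATTERN.search(patient.note))
    return has_diagnosis and note_shows_failure


def find_candidates(patients: list[Patient]) -> list[str]:
    """Return the IDs of patients worth a manual eligibility review."""
    return [p.patient_id for p in patients if is_candidate(p)]


# Made-up records, not data from the case study.
SAMPLE_PATIENTS = [
    Patient("P001", "breast cancer", "Disease progressed on second-line therapy."),
    Patient("P002", "breast cancer", "Responding well to current regimen."),
    Patient("P003", "lung cancer", "Progression noted on imaging."),
    Patient("P004", "Breast Cancer", "Documented treatment failure after two regimens."),
]

if __name__ == "__main__":
    print(find_candidates(SAMPLE_PATIENTS))
```

Keyword matching of this kind ignores negation ("no progression") and context, so it would only produce a shortlist for staff to review, not an eligibility decision.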

Case study 3: Clinical research as a rescue operation

Even with the most meticulous methodology, unexpected barriers can arise during the conduct of a clinical research study. A trial costs roughly $20-120 million, with an average cost of approximately $50,000 per patient.5 These figures highlight the importance of using data and technology to run trials more efficiently, which could help mitigate the risk of failed trials and minimize costs.

The final case study examines a healthcare center in the US Midwest with 50 physicians in 12 practice locations, including a rapidly growing clinical research program with a focus on rare patient populations. As a result, the center is a trusted clinical research site for sponsors who invest their R&D dollars in pursuing interventions for traditionally untapped diseases. The center opened these trials in earnest hope of making study interventions available to its patients with limited treatment options. In this case, despite the depth of therapeutic expertise, the patient population for two studies proved elusive. The sponsor, having already sunk significant administrative expenses, was not willing to continue funding the non-performing trials at a growing cost. Hence, a technology solutions provider was sought in an attempt to rescue the clinical trials from imminent closure.

By applying AI algorithms, site staff were able to identify patients with the appropriate diagnosis who were not yet at a point in their therapy where they met the restrictive inclusion criteria. Through the platform applications, these patients were placed on a “watch list” while receiving first-line treatments. As patients moved through the expected course of the disease, including progression on those first-line options, they became eligible for the trial. Once clinically relevant data confirmed that the available treatments were failing, the center was able to refer patients to the waiting investigational studies. This had the two-fold benefit of bringing therapy access to patients and making further treatment options available to clinicians. The two rare-disease studies in question each enrolled two subjects who would otherwise have had no further treatment options. Additionally, by salvaging these two commercially sponsored studies, the center kept its credibility as an enroller, preserving a strong relationship with the sponsor as well as revenue streams to the center by meeting the terms of the clinical trial agreements and budgets. This maintenance of the bottom line also permits reinvestment in other rare disease treatments and studies for the longer term.
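The watch-list logic described here can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's implementation: the `PatientState` fields, the `Status` values and the single "progressed on first-line treatment" rule are hypothetical simplifications of the restrictive inclusion criteria the case study mentions.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NOT_A_MATCH = "not a match"
    WATCH_LIST = "watch list"
    ELIGIBLE = "eligible"


@dataclass
class PatientState:
    patient_id: str
    has_target_diagnosis: bool       # hypothetical flag derived from the EMR
    progressed_on_first_line: bool   # hypothetical flag updated as new data arrives


def classify(patient: PatientState) -> Status:
    """Assign a status from the latest data (simplified, assumed rule)."""
    if not patient.has_target_diagnosis:
        return Status.NOT_A_MATCH
    if patient.progressed_on_first_line:
        # First-line options have failed, so the patient can be referred.
        return Status.ELIGIBLE
    # Right diagnosis, but too early in therapy: keep watching.
    return Status.WATCH_LIST


def ready_for_referral(patients: list[PatientState]) -> list[str]:
    """Return the IDs of watched patients who have become eligible."""
    return [p.patient_id for p in patients if classify(p) is Status.ELIGIBLE]


if __name__ == "__main__":
    patient = PatientState("R-01", has_target_diagnosis=True,
                           progressed_on_first_line=False)
    print(classify(patient).value)   # still on first-line treatment

    patient.progressed_on_first_line = True  # new clinical data arrives
    print(classify(patient).value)   # status is re-evaluated
```

Re-running the classification whenever the record changes is what turns a one-time screen into a watch list: the same patient is re-evaluated as their disease course unfolds.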

Future focus

These case studies are a small sample of the many achievements of clinical research centers that are using new technology to support their operations. Currently, interventional and regulatory decisions are largely based on study and observational RWD. The mission to rapidly bring novel interventions to market and, ultimately, to the patient is aided by new methods that apply ML and AI technologies to real-time real-world evidence (RWE) from electronic medical records when evaluating inclusion criteria. These same tools can also support specific site selection, in addition to pre-identifying highly viable participants for study offerings.

RWD-based study simulation has recently been adopted as a rational way to inspire clinical trial models, informing the development of inclusion and exclusion criteria and the estimation of endpoint evaluations.6 Study control arm simulation via ML takes the use of RWD a step further by possibly reducing the necessity of placebo or control arm randomization. The emergence and impact of real-world application of technology in biopharma and academia has led to the development of massive data lakes via aggregation of relevant information from EMRs and other clinical databases. This has created an environment for exploration and innovation, in which AI and ML are applied to analyze and interpret meaningful patterns of health and disease.4 These lakes also provide the capability to observe longitudinal patient data sets. Use of real-world and longitudinal data via EMR could reduce or eliminate the need for many conventionally conducted phase IV trials, instead providing an FDA-approved agent to the patient and simply following their response in real time.

Additionally, novel adaptive, basket and umbrella trial designs are evolving, all of which utilize emerging AI and ML capabilities. These trials identify participants based on NGS or -omics profiling, consume RWD, and select studies involving agents that match the pathology implied by those profiles. There are opportunities to use AI algorithms to identify previously unidentified patients or to predict the populations that would benefit most from trials. ML analysis of aggregated, clinically relevant data sets can be further deployed to support source data verification and/or to provide safety data monitoring, resulting in a reduction of costly on-site study monitor visits. There is also an emerging focus on advancing and expanding clinical decision support applications, which supply schematics that seek to optimize therapeutic interventions. Eventually, EMR-to-EDC evolution may permit an uninterrupted assessment validation loop.

Betsy Wagner, director, data analytics at IQVIA


  1. Cognizant. Case Study: Life Sciences. Roche cuts feasibility process by 36%.
  2. Johns Hopkins Bloomberg School of Public Health. Cost of Clinical Trials For New Drug FDA Approval Are Fraction of Total Tab. (2018).
  3. FDA. Framework for FDA’s Real-World Evidence Program. (2018).
  4. Herrett, E. et al. Data resource profile: clinical practice research datalink (CPRD). Int J. Epidemiol. 44, 827–836 (2015).
  5. Applied Clinical Trials. New Research Emerges to Challenge Steep Costs of Clinical Trials. (2020).
  6. FDA. Framework for FDA’s Real-World Evidence Program. (2018).