Surmounting eClinical Data Volume and Diversity


Applied Clinical Trials

Applied Clinical Trials, 03-01-2018
Volume 27
Issue 3

Amid industry feedback that the growing volume and diversity of eClinical data collected for studies are taxing cycle times, two studies highlight the need to optimize protocol design and executional complexity to overcome these data management burdens.

Pair of studies spotlight the critical need to optimize protocol design and executional complexity


Ken Getz

The adverse impact of rising clinical trial complexity is manifest in inefficiencies and poorer performance observed across multiple scientific and operating functions supporting drug development activity. Our latest research at the Tufts Center for the Study of Drug Development (Tufts CSDD) characterizes the impact of protocol design and executional complexity on clinical data management.

The study findings are based on responses from 257 distinct companies (198 small, medium, and large pharmaceutical and biotechnology companies, and 59 contract research organizations, or CROs) and demonstrate strongly that the growing volume and diversity of data collected for a given clinical study are taxing cycle times from database build through to database lock.

Study respondents also indicate that data volume and diversity are presenting integration, compatibility, loading, and interoperability challenges that must be overcome to optimize drug development performance. Moreover, given their high and growing exposure to a range of sponsor study requirements, CROs are delivering clinical data management speed advantages that point to opportunities and management insights.

Managing volume and diversity 

The typical Phase III protocol now collects more than one million data points, double the level observed 10 years ago. And that data is coming from a far more diverse collection of applications, including electronic clinical and patient-reported outcomes assessments, wearable and mobile devices, electronic health and medical records, social media, and, yes, paper.

On average, companies report using six unique applications to support each clinical study. All study respondents report using electronic data capture (EDC) applications in clinical trials. Approximately three-quarters report using applications to manage randomization and trial supply management, safety and pharmacovigilance, and electronic trial master file data. One out of four (26%) sponsors and 52% of CROs report that they still use paper case report forms (CRFs) to collect clinical study data. Higher use of paper among CROs likely reflects the diversity of client company sophistication and intra-company system incompatibility.

Disparities are also observed between sponsors and CROs in the use of electronic source data capture applications. One-third (32%) of CRO companies report using eSource compared with only 14% of pharma and biotech companies. Sponsor companies report higher usage of electronic master file (72% compared with 64% of CROs) and safety/pharmacovigilance (75% vs. 63%) applications. 

Sponsors and CROs are using their primary EDC application to capture traditional, but not newer, data types. Integration challenges rise as the diversity of data grows, and data is increasingly captured and managed by multiple applications.

All sponsors and CROs report managing eCRF data in their primary EDC, with eCRF data representing more than three-quarters (78%) of the information managed by that application. Only one out of five sponsors and CROs report managing electronic clinical outcomes assessment (eCOA) and medical imaging data in their primary EDC. Fewer than one in 10 (9.7%) report collecting mobile health and genomic data, and virtually none of that data is captured in the primary EDC.

Quantifying data management burden 

Contrary to commonly held notions, and despite the myriad practices and solutions implemented over the past two decades, data management cycle times are longer today than they were a decade ago. Tufts CSDD found that the cycle time from last patient last visit (LPLV) to database lock was an average of 36.1 days in 2017, up from 33.4 days in 2007. These longer cycle times are no doubt due in large part to the rapid growth in eClinical data volume and the diversity of data captured.

Three cycles were assessed in this study: (1) the average time to build and release the study database; (2) the average time between a study volunteer’s visit and when that patient’s data was entered into the study database; and (3) the time from LPLV to database lock. CROs typically offer faster average durations across all three cycles with less variance, suggesting more consistent performance from study to study.
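To make the three cycles concrete, the short Python sketch below derives each one from a set of milestone dates for a single study. The dates, the dictionary keys, and the use of calendar days are all assumptions made for illustration; the article does not say how the survey respondents recorded or counted these intervals.

```python
from datetime import date


def days_between(start: date, end: date) -> int:
    """Elapsed calendar days from start to end."""
    return (end - start).days


# Hypothetical milestone dates for one study (illustrative only, not survey data).
milestones = {
    "build_start": date(2017, 1, 9),
    "database_release": date(2017, 3, 18),
    "patient_visit": date(2017, 6, 1),
    "data_entered": date(2017, 6, 9),
    "lplv": date(2017, 11, 1),
    "database_lock": date(2017, 12, 7),
}

# The three cycles assessed in the study, computed for this one hypothetical trial.
cycle_times = {
    "build_to_release": days_between(
        milestones["build_start"], milestones["database_release"]
    ),
    "visit_to_entry": days_between(
        milestones["patient_visit"], milestones["data_entered"]
    ),
    "lplv_to_lock": days_between(milestones["lplv"], milestones["database_lock"]),
}

for name, days in cycle_times.items():
    print(f"{name}: {days} days")
```

Tracking the three intervals separately, rather than a single start-to-lock figure, is what allows delays to be attributed to the build, entry, or close-out stage.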

For Phase II and III clinical trials, the average time to build the study database and to enter study volunteer data following that volunteer’s visit was 68 days (nearly 14 business weeks) and eight days (nearly two business weeks), respectively, with very wide variation observed between companies (>90% coefficient of variation). CROs report building and locking study databases 20 days faster and 11 days faster, respectively. In discussions about these results, many clinical research professionals report experiencing substantially longer cycle times than the averages that we captured.   
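The coefficient of variation cited above is the standard deviation expressed as a share of the mean. The sketch below shows how a handful of widely scattered build times can produce a value above 90%. The five build times are invented for illustration and are not the survey data, and the use of the sample standard deviation is an assumption, since the article does not say which form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100


# Hypothetical database-build times in days for five companies
# (illustrative only; chosen so the mean matches the reported 68 days).
build_days = [10, 20, 50, 100, 160]

cv = coefficient_of_variation(build_days)
print(f"mean = {mean(build_days):.0f} days, CV = {cv:.0f}%")
```

A coefficient of variation this high means the standard deviation is nearly as large as the average itself, so the 68-day figure says little about what any one company should expect.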

Causes and impact 

The top cited cause of database-build delays was protocol design changes, with nearly half (45%) of study respondents indicating so. Distant secondary causes for database-build delays included user-acceptance testing and database design functionality issues.  

Those companies citing protocol changes, on average, achieved LPLV-to-database lock five days faster than the overall average, suggesting that protocol changes did not lead to downstream data management cycle time delays. By contrast, database design functionality was cited by only one out of six companies as a top cause of build delays, yet this cause was associated with an LPLV-to-database lock cycle time that was 39% longer than the overall average (i.e., 50 days compared to 36 days).
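The percentages in this comparison follow from a simple relative-difference calculation, sketched below in Python. The function and variable names are hypothetical, and the 31-day figure for the protocol-change group is derived here from "five days faster" than the rounded 36-day average rather than reported directly in the study.

```python
def percent_difference(value: float, baseline: float) -> float:
    """Difference of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100


overall_average_days = 36  # LPLV-to-lock average, rounded as in the text
design_issue_days = 50     # companies citing database design functionality
protocol_change_days = overall_average_days - 5  # derived: "five days faster"

design_pct = percent_difference(design_issue_days, overall_average_days)
protocol_pct = percent_difference(protocol_change_days, overall_average_days)

print(f"Design functionality issues: {design_pct:+.0f}%")
print(f"Protocol design changes:     {protocol_pct:+.0f}%")
```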

Facing challenges in building study databases, a very high percentage of sponsors and CROs (85%) report releasing the final study database after the clinical trial had already been initiated. Release of the study database after starting patient enrollment (first patient first visit, or FPFV) is associated with longer downstream data management cycle times, including time to enter data after patient visits and time from LPLV to database lock. Companies that reported always releasing the study database after FPFV experienced significantly longer data management cycle times (54 days) compared to those that reported never doing so (31 days).  

Longer cycle times may be due to lower investigative site personnel motivation, lower levels of study staff trust and confidence in the data management system, and ongoing database functionality issues.

The introduction of EDC more than two decades ago heralded the promise of significantly faster study close-out time frames. This latest study shows that we are farther from, not closer to, realizing that promise. Nearly 80% of companies now report facing technical challenges in loading the data into their EDC system, as well as problems stemming from the limitations of the system.

The imperative to manage complexity 

The Tufts CSDD study characterizes the broad impact of scientific and operating complexity on clinical data management performance. A recent study conducted by Medidata Solutions (MDS) looked at data management cycle-time performance stratified by clinical study complexity primarily for large pharma and biotech companies. 

MDS found that the cycle time to design a study database for low complexity clinical trials took 14 weeks (98 days); medium complexity trials took 17 weeks (119 days); and high complexity clinical trials took 19 weeks (133 days, or 36% longer than the low complexity cohort). The cycle time from patient visit to data entry for low, medium, and high complexity studies was two, three, and four days, respectively. And MDS reported that the cycle time from LPLV to database lock was 48, 49, and 53 days, respectively, according to ascending complexity. 
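The MDS figures can be laid side by side to see how much each cycle stretches as complexity rises. The sketch below converts the design cycle from weeks to days and expresses each tier relative to the low-complexity cohort. The input numbers are those quoted above; the dictionary layout and helper name are assumptions made for illustration.

```python
DAYS_PER_WEEK = 7

# Figures quoted from the MDS study, keyed by complexity tier.
design_weeks = {"low": 14, "medium": 17, "high": 19}
lplv_to_lock_days = {"low": 48, "medium": 49, "high": 53}

design_days = {tier: weeks * DAYS_PER_WEEK for tier, weeks in design_weeks.items()}


def pct_over_low(values):
    """Percent increase of each tier over the low-complexity tier, rounded."""
    low = values["low"]
    return {tier: round((v - low) / low * 100) for tier, v in values.items()}


design_increase = pct_over_low(design_days)
lock_increase = pct_over_low(lplv_to_lock_days)

for tier in design_weeks:
    print(
        f"{tier:>6}: design {design_days[tier]} days (+{design_increase[tier]}%), "
        f"LPLV to lock {lplv_to_lock_days[tier]} days (+{lock_increase[tier]}%)"
    )
```

On these numbers, complexity weighs far more heavily on database design than on the close-out cycle, which grows by roughly a tenth between the low and high cohorts.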

The results of the Tufts CSDD and MDS studies demonstrate the critical need to optimize protocol design and executional complexity, both to improve drug development performance overall and to ease the burdens encountered by clinical data management specifically. Sponsors and CROs are using and evaluating numerous approaches and initiatives to simplify protocol designs and improve executional feasibility, including protocol authoring templates; protocol challenge and feasibility review committees; and professional advisory boards and protocol simulations. A large percentage of sponsor companies also now report using patient advisory boards to solicit input on a variety of factors, including the expected impact of protocol designs on participation convenience and burden.

The results of the Tufts CSDD study should give pause to sponsors and CROs compelled to collect more data from diverse sources. The major challenges associated with data integration, coordination, accessibility, and compatibility must be confronted if companies hope to meet their ambitious protocol demands and to leverage the value and promise of robust, predictive analytics and machine learning to support clinical development strategy and performance.


Ken Getz, MBA, is the Director of Sponsored Research at the Tufts CSDD and Chairman of CISCRP, both based in Boston, MA.
