Enabling Digital Transformation: Managing External Clinical Data Sources to Advance Drug Development


With an increasing amount of diverse data that must now be collected and analyzed, the industry is faced with increasingly complex studies that present new challenges in data management.

Data has truly become the currency of our everyday lives, and the volume and speed at which it is being collected, processed and exchanged are staggering. For instance, every minute, 188 million emails are sent, 4.49 million Google searches are conducted, and 4.5 million videos are watched on YouTube [1]. This proliferation of information can also be seen in the world of clinical research, where the amount of clinical data per trial has increased by 183% in the last decade [2].

The growing volume of clinical data illustrates one significant way in which the drug development landscape is evolving. As clinical trials are becoming more decentralized, the variety of data types and sources, including genomics, biomarkers, wearables and images, that are being used in drug development is also expanding. With the amount of diverse data that must now be collected and analyzed, the life sciences industry is faced with increasingly complex studies that present new challenges in clinical data management.

Study background

From October 4 to December 30, 2019, eClinical Solutions and the Tufts Center for the Study of Drug Development (Tufts CSDD) conducted a survey to understand trends in clinical data management and assess how sponsors are responding to current challenges. A total of 149 individuals in senior-level data management, data sciences, biostatistics, information technology and clinical research operations roles participated in the study. Of the respondents, 48% were one of several individuals within an organization responsible for clinical data, 33% were primarily responsible for clinical data and 19% were frequent users of clinical data systems.

The majority of respondents came from pharmaceutical (41%) and biopharmaceutical (37%) sponsors, though CROs (10%) and independent contractors (6%) were also represented. Respondents’ organizations ranged in size from small sponsors (30%), which had initiated an average of 2.9 trials yearly, to medium sponsors (29%), which had initiated an average of 23.8 trials yearly, and large sponsors (17%), which had initiated an average of 112.4 trials yearly.

Challenges in data management

Growth in data volume and diversity

Sponsors are collecting more data types than ever. Non-CRF data is the most commonly used in clinical trials, reported by 87% of respondents, but a variety of other data types and sources are also being used. Nearly two-thirds of sponsors are incorporating direct data capture, apps, devices and medical images as other data sources, while smaller shares are using electronic health records (43.4%) and omics data (35.5%).

Clinical Data Source Incorporation

Over two-thirds of all sponsors are also using or piloting at least four data types in clinical trials.

Number of Data Sources Incorporated

Percentage Who Are Using or Piloting Different Sources of Data

Data sources contributing to significant delays

The growing number of external data sources is associated with longer database lock cycle times, with more than half of the study respondents citing non-CRF data as a primary contributing factor. Since 2017, the time from Last Patient Last Visit (LPLV) to database lock has increased by 40%. Large pharmaceutical companies using four or more sources, specifically, are experiencing a three-week increase in the time from LPLV to database lock.

Last Patient Last Visit (LPLV) to Database Lock Cycle Times

Labor-intensive data management activities

Although the availability of richer data sets is essential for gaining deeper insights to advance drug development, sponsors face challenges in all data management activities related to organizing and reviewing the immense volume of data. Among all participants in the Tufts-eClinical Solutions study, analyzing and consuming data was rated as the least challenging of these activities, though over 50% of respondents still rated this activity as somewhat or extremely difficult. Of the seven key data management tasks covered in the survey, participants identified initiating relationships with data providers as the most time-consuming and labor-intensive. This points to improved vendor relationships as a crucial element in easing difficult data management duties.

Data management activity labor intensiveness

Unsophisticated tools to centralize and standardize data

Even though the variety and volume of data continue to evolve, making data management activities more difficult and time-consuming, the tools used for data integration and analytics remain relatively unchanged. Three-quarters (75%) of companies are still using SAS as the primary tool to convert, explore and analyze data.

Percentage of Respondents

Solving data management challenges

With the changing landscape of clinical research presenting challenges in how companies organize and review disparate data types from a multitude of sources, all participants in the Tufts-eClinical Solutions study recognized the need to find solutions. Potential areas for improvement that were highlighted included handling more external data sources, decreasing cycle times through streamlined processes, encouraging patient-centered data acquisition strategies and developing better opportunities for real-world data uses. Life sciences organizations identified several ways in which they are responding to evolving clinical data management requirements in these specific areas.

Establishing a defined data strategy

When an organization outlines a formal data strategy to determine how data will be collected, processed and analyzed, there are clear benefits to clinical research. The greatest impact is on database lock times. Sponsors that have implemented a data strategy reach database lock 10 days faster than those that have not. Since large companies are managing more data sources, they are also more likely to have implemented a data strategy, especially given that 53% of organizations using five to six data sources experience longer times to database lock.

Though there are significant benefits, only one-third of the survey respondents have implemented a defined data strategy. Almost half of the respondents were either still examining or planning a data management initiative in the next few months. This means that there is still a great opportunity for all life sciences companies to leverage the benefits of a formal data strategy in reducing trial delays.

Implementing new technologies

To keep pace with the growing number of disparate data sources, organizations are turning to more advanced tools. The survey findings show that clinical data hubs or repositories are being used more often than other systems, including electronic data capture (EDC) and SAS-based infrastructure. However, EDC is still used most often specifically for integrating and organizing direct data capture. Tools like clinical data platforms with robust capabilities to help data managers import, transform and review data will become powerful assets as data sources continue to increase.

Sponsors using a data hub or lake also had more confidence in their analytics capabilities in areas associated with analytics dashboards, interactive visual exploration, the ability to publish and collaborate on analytic content, and augmented data discovery.

Enabling digital transformation

Integrating new technology into current systems is just one element in establishing mature analytical capabilities. The survey also revealed a strong association between these capabilities and the implementation of data strategies. Over 50% of respondents who have a defined data strategy noted that their analytical capabilities are more developed.

Another benefit of having defined data strategies can be seen in the areas of artificial intelligence (AI) and data science competencies. Organizations that have implemented a data strategy are more mature in terms of researching or implementing AI, machine learning and deep learning capabilities. These companies are also more likely to be developing new data sciences functions or specific data sciences roles. In fact, 47.2% of organizations that have implemented a data strategy have also established a data sciences function.

Developing analytical and AI capabilities is crucial to preparing for the digital transformation that is sweeping across the drug development industry. Life sciences companies must now meet the increased demand for clinical data and real-time insights from a variety of distinct sources. By automating time-consuming and labor-intensive data tasks, sponsors are able to support greater access to and deeper insights into data to drive innovation and accelerate drug submissions and discoveries.

Benefits to defined data strategies

Respondents to the Tufts-eClinical Solutions survey agreed that there are tangible benefits to laying the foundation for AI and data science through formal data strategies. Of all respondents, 85% rated the following as somewhat or very beneficial:

  • Ability to leverage clinical data assets
  • Increased visibility of data to stakeholders for faster decision making
  • Decreased cycle times through more collaboration and shared analytics
  • Cost-savings through resource optimization and reduced rework
  • Faster acquisition and benefits from new real-world data sources that contribute to development

Conclusion: a new era of clinical research

In this modern era of drug development, clinical data has become one of the most valuable assets for life sciences organizations in powering digital transformation. As the Tufts-eClinical Solutions study demonstrates, delays in clinical trials, measured through database lock cycle times and through the manual effort and time-consuming processes associated with assembling data sets for review and insights, will continue to increase without the implementation of new strategies. The diversity and volume of clinical data are also growing rapidly, and the tools for managing them must improve to keep pace.

Industry leaders are recognizing that evolving their approach to the research process is critical. By adopting formal data strategies, implementing modern technologies like clinical data platforms and expanding data science competencies, analytics and AI capabilities, sponsors can gain more control over the research process and drive actionable insights from new data streams to continue to push and accelerate drug development.

Raj Indupuri is the CEO and Sheila Rocchio is the Chief Marketing Officer of eClinical Solutions. Ken Getz is the Deputy Director and Professor, Beth Harper is a Senior Consultant and Michael Wilkinson is a Project Manager, all of Tufts CSDD.


  1. Domo, Data Never Sleeps 7.0, 2019
  2. Getz, K, Anticipating the Impact of the Patient Engagement Movement on Clinical Operations, Presentation at CROWN conference, Slide #9, January 202