Data Integration: Past & Future


Applied Clinical Trials, 06-01-2007

Emerging technologies are moving the industry closer to true IT solutions harmonization.

Although the biopharmaceutical market now boasts a multitude of electronic clinical solutions to help simplify the clinical trials process, integrating the data collected through each of these solutions remains a major point of frustration. Individually, these solutions provide benefits for specific aspects of the clinical trial process. They often affect common processes and use overlapping information, yet they are generally used in isolation. As the number of clinical systems and the demand for efficiency increase, there is a growing need to share data. By cost-effectively integrating technologies and automating processes, clinical trial sponsors can eliminate redundant tasks and accelerate the flow of critical information to key stakeholders.


In addition to the daily costs of clinical trial operations, delays in bringing blockbuster drugs to market, unnecessary steps in a trial process, and pointless duplication of documentary data all contribute to the rising costs of clinical studies. A properly applied trial integration solution could significantly reduce the impact of these costs on clinical trial budgets. The different electronic solutions used in a clinical trial share common data points, so the same information, such as patient demographics, has to be entered into several systems. For example, an investigative site coordinator may enter patient demographics into an Interactive Voice Response (IVR) database and later enter the same information into an Electronic Data Capture (EDC) system. Any data-entry errors or discrepancies between the two systems must then be resolved when cleaning data, prior to locking the databases, which takes time and resources. An integrated solution would eliminate these redundant steps, allowing sponsors to cut costs, save time, and reduce the risk of data-entry errors.
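The reconciliation burden described above can be illustrated with a minimal Python sketch. It assumes each system can export demographics as simple key-value records under a shared patient key; the record layout, field names, and patient key are hypothetical and are not taken from any real IVR or EDC product.

```python
from typing import Dict, List, Tuple

# Hypothetical demographics records keyed by a shared patient key.
# Field names are illustrative assumptions, not a vendor format.
Record = Dict[str, str]


def find_discrepancies(
    ivr: Dict[str, Record], edc: Dict[str, Record]
) -> List[Tuple[str, str, str, str]]:
    """Return (patient_key, field, ivr_value, edc_value) for every mismatch."""
    issues = []
    for patient_key in sorted(set(ivr) | set(edc)):
        ivr_rec = ivr.get(patient_key, {})
        edc_rec = edc.get(patient_key, {})
        for field in sorted(set(ivr_rec) | set(edc_rec)):
            ivr_value = ivr_rec.get(field, "<missing>")
            edc_value = edc_rec.get(field, "<missing>")
            if ivr_value != edc_value:
                issues.append((patient_key, field, ivr_value, edc_value))
    return issues


# The same patient keyed twice by hand; the EDC entry has transposed digits.
ivr_data = {"P-001": {"sex": "F", "birth_year": "1961"}}
edc_data = {"P-001": {"sex": "F", "birth_year": "1916"}}

discrepancies = find_discrepancies(ivr_data, edc_data)
for issue in discrepancies:
    print(issue)
```

Every tuple this function returns represents a query that someone must resolve before database lock. Entering the value once and sharing it removes that class of discrepancy.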

Over the past few years, clinical trial sponsors have begun to recognize the importance of integration and have started to implement systems in order to facilitate this. However, as technology such as EDC, which is designed on a trial-by-trial basis, becomes more widely adopted, traditional point-to-point integration methods are rapidly becoming outdated. Instead, the pharmaceutical industry is looking to more flexible interchange platform systems, which are already widely used in other mainstream industries such as manufacturing, refining, and supply chain management.

This article will discuss the progress of integration over the past few years and what the future holds in terms of new integration technology.

Current integration challenges

Every clinical trial has data automation, integration, and presentation needs, and very often one has a significant impact on the others. A carefully considered approach to data, system, and process integration can benefit every stakeholder, from the investigator to the pharmaceutical executive. Data integration can reveal new nuances in clinical studies, such as why so many patients fail screening, and pose new questions about the dynamics of current workflows. More importantly, the right trial integration solution will mitigate risk, automate tasks, ensure consistency, and speed the trial timeline while reducing costs.

Most pharmaceutical companies currently rely on a familiar line-up of information technology solutions to shorten clinical trial timelines, manage data, and contain costs. Clinical Data Management Systems (CDMS), Clinical Trial Management Systems (CTMS), EDC, Drug Supply Management Systems (DSMS), and IVR systems (IVRS) are all now commonplace in clinical trials. Widely touted logistical and commercial benefits have helped drive the adoption of such programs, and pharmaceutical companies have reaped benefits after implementation. However, these solutions would be more effective and efficient if they could work together, which leaves many sponsors asking what it would take to integrate the data, automate tasks, and aggregate data points so that each system could share information with the others.

Despite vendor promises of solutions that provide this kind of enterprise-wide systems integration, most organizations continue to struggle with the integration challenge. Currently, data cleaning and integration still account for 25% to 33% of all systems implementation budgets [1].

For years, drug and medical device developers have dropped information gathered by case report forms (CRFs) into data silos—database applications designed for only one trial. Their focus has been strictly on completing the specific trial's analysis so they can submit their new products for FDA approval as quickly as possible. There has been little concern for the future usefulness of the raw data.

A number of software vendors have developed so-called integration or interface engines that allow heterogeneous health information systems within a hospital or region to exchange information via standardized messages [2]. Such integration engines provide a useful way of solving the basic communication problems between systems, but they do nothing to address true interoperability and integration of information. This approach has worked well and has been effective, but when the number of possible interactions between systems increases, as happens with shared care, the limitations of scalability become apparent.

Tools of this kind are collectively known as integration technologies. By linking disparate applications within and across companies, integration technologies help firms realize the promised, yet often still unrealized, benefits of enterprise solutions, including increased productivity, efficiency, and customer satisfaction.

Point-to-point integration method

An enterprise using point-to-point for data integration and business automation will endure increased trial costs and timelines, more data synchronization errors, lower flexibility, and incomplete decision support ability. Point-to-point can, in most cases, create more challenges than it solves because:

  • It forces the two systems to be tightly coupled to each other, making it impossible to revise one without affecting the other or the custom point-to-point connection.

  • It requires a highly technical labor force to build, test, maintain, and support the custom connections.

  • It forces external vendors and partners to adapt their systems to the proprietary needs of the point-to-point connection when data is to be shared outside the CRO or pharmaceutical company.

Point-to-point fails to meet most of the long-term strategic goals of enterprise application integration (see Figure 1).

Figure 1. The complexity and limitations of point-to-point systems may introduce more problems than solutions.
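The scalability concern can be made concrete with a short calculation. The sketch below assumes the worst case, in which every system must exchange data directly with every other system, and compares it with a design in which each system connects once to a shared intermediary (the platform approach discussed later in this article). The function names are illustrative.

```python
def point_to_point_links(n_systems: int) -> int:
    """Connections needed when every system is wired directly to every other."""
    return n_systems * (n_systems - 1) // 2


def hub_links(n_systems: int) -> int:
    """Connections needed when every system talks only to a central platform."""
    return n_systems


for n in (3, 5, 10):
    print(n, point_to_point_links(n), hub_links(n))
# prints:
# 3 3 3
# 5 10 5
# 10 45 10
```

With three systems the two designs need the same number of connections, but the point-to-point count grows roughly with the square of the number of systems. Each of those connections is custom code to build, test, and maintain.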

Returning to the earlier example of sharing patient data among clinical trial systems, a point-to-point approach would require the EDC system to export details of the patient to be randomized, 4321-AAC, to a local file share; require changes to the IVR system so that it imports the data file from the EDC system and then exports the patient randomization details under its own identifier, 014321; and require changes to the electronic patient-reported outcomes (ePRO) application so that it imports the screened patient details from the EDC system using Web services over the Internet.

Changes to all three systems may not be possible if they were purchased from separate vendors, and security between each point must be managed through custom code. Finally, point-to-point forces a synchronous workflow that relies on all three systems being available to send and receive data through a local file share and the Internet, and it requires the EDC system to track the different synonyms representing the patient in each system. This solution is limited and does little to represent the overall clinical trial processes or shared data at the decision support level.

Obviously, these integration challenges are not unique to the life sciences industry and manifest themselves in any industry that must connect two or more applications for the purpose of automating processes or sharing data. As such, the high-tech industry has responded by providing solutions designed to solve the shortcomings of manual data entry and point-to-point approaches. One such solution promoted as an integration panacea is Web services.

Web services method

Web services and related technologies are the hottest, but least mature, set of integration technologies. Web services are standards, protocols, and directory services that enable Web-based clinical applications to share data among themselves regardless of their underlying programming languages or platforms.

Figure 2. Integration between applications both within and outside a CRO is easier with Web services, which use XML.

Web services are a tremendously promising technology because they enable easier integration between applications both within and outside the CRO or pharmaceutical company. They also may act as separate mini-applications, enabling simple tasks, such as online randomization or site certification, over the Web. Web services are ideal for asynchronous applications with low transaction volumes. Some proponents of Web services argue that they will enable a new world of flexible, on-the-fly integration of multiple distributed applications. It is more likely that Web services will be used largely for integration purposes, as well as for supplementing core enterprise solution functions, such as a central lab load, on an as-needed basis.

Web services use XML, which is an open standard for describing data over the Internet that enables applications within or between firms to exchange information with agreed-upon meaning without otherwise having to understand anything about one another.
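As a rough illustration of how an agreed-upon XML message lets two applications exchange information without knowing anything else about each other, the following sketch builds and reads a small message with Python's standard `xml.etree.ElementTree` module. The element names (`PatientScreened`, `PatientId`, `Site`) and the site value are hypothetical and are not drawn from CDISC or any other published schema.

```python
import xml.etree.ElementTree as ET


def build_screening_message(patient_id: str, site: str) -> str:
    """Sender side: serialize a screening event as XML text.

    Element names are illustrative assumptions, not a published standard.
    """
    root = ET.Element("PatientScreened")
    ET.SubElement(root, "PatientId").text = patient_id
    ET.SubElement(root, "Site").text = site
    return ET.tostring(root, encoding="unicode")


def read_patient_id(xml_text: str) -> str:
    """Receiver side: needs only the agreed element names, not the sender's code."""
    root = ET.fromstring(xml_text)
    return root.findtext("PatientId", default="")


message = build_screening_message("4321-AAC", "Site-07")
print(message)
print(read_patient_id(message))
```

The receiver depends only on the agreed vocabulary, which is why competing vocabularies undermine the benefit.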

Although XML was originally intended to enable easier information sharing between publishers over the Web, clinical trial integrators have incorporated XML into their products to enable easier internal integration as well. But as XML has been incorporated into many vendors' products, XML standards have proliferated, and competing standards impede integration unless a single one is agreed upon. In our industry, the emergent standard comes from the Clinical Data Interchange Standards Consortium (CDISC). In addition, XML currently lacks many industry-specific definitions, as well as the security, verification, and confirmation functions necessary for interenterprise communication. Also, XML can't achieve integration alone; it's just one piece of a comprehensive Web-based integration architecture.

While Web services introduce data sharing standards using XML, security models, and a common application program interface (API) to link loosely coupled disparate systems, they still require integrators to modify existing IT applications and manage key data point synonyms. These technology standards allow trial applications to share data more easily because less proprietary custom coding is used and a single transport mechanism is provided through a standard API to the application. In the clinical trial integration example used earlier, a Web services approach would:

  • Require the EDC and IVR system to be modified to support a Web services application program interface.

  • Retrieve details of the screened patient 4321-AAC from the EDC Web service to be randomized in the IVR system.

  • Retrieve the randomization details of patient 014321 from the IVR system to be applied to patient 4321-AAC in EDC.

  • Change the ePRO application to import the screened patient details from the EDC system using Web services over the Internet.

Again, changes to all three systems may not be possible if they were purchased from separate vendors, but security and data standards are enforced by the Web services, eliminating the need for proprietary custom coding. Finally, Web services still rely on all three systems being available to send and receive data and still require the EDC system to track the different synonyms representing the patient in each system. This solution is also limited and does little to represent the overall clinical trial processes or shared data at the decision support level.
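The availability dependency shared by the point-to-point and Web services approaches can be sketched as follows. This is a simulation only: `SimulatedSystem` is a hypothetical stand-in for a remote EDC, IVR, or ePRO endpoint, and no real network calls or vendor APIs are involved.

```python
class SystemUnavailable(Exception):
    """Raised when a simulated trial system cannot be reached."""


class SimulatedSystem:
    # Hypothetical stand-in for a remote endpoint; no network is used.
    def __init__(self, name: str, available: bool = True):
        self.name = name
        self.available = available

    def call(self, request: str) -> str:
        if not self.available:
            raise SystemUnavailable(self.name)
        return f"{self.name} completed: {request}"


def screen_and_randomize(edc, ivr, epro, patient_id: str) -> list:
    """Each call blocks until the remote system answers, in strict order."""
    return [
        edc.call(f"fetch screened patient {patient_id}"),
        ivr.call(f"randomize {patient_id}"),
        epro.call(f"import {patient_id}"),
    ]


edc = SimulatedSystem("EDC")
ivr_down = SimulatedSystem("IVR", available=False)
epro = SimulatedSystem("ePRO")

try:
    screen_and_randomize(edc, ivr_down, epro, "4321-AAC")
    outcome = "completed"
except SystemUnavailable as exc:
    outcome = f"workflow halted: {exc} unavailable"
print(outcome)  # prints "workflow halted: IVR unavailable"
```

Because the calls are chained, one unavailable system stops the whole workflow, including the ePRO step that did not depend on the randomization result.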

Clinical trial interchange platform

At its core, a clinical trial interchange platform (CTIP) allows two or more trial systems to communicate asynchronously by sending and receiving messages. In the same way that today's email systems allow communication between two or more people, messaging allows communication among two or more applications, without requiring human intervention.

One of the most fundamental aspects of messaging is its asynchronous nature—the sender of the message does not have to wait for the recipient to receive the information. Going back to the email analogy, once an email has been sent, the sender does not wait for a reply—he/she is free to continue working while waiting for the recipient to respond. Sending applications are free to generate messages at an appropriate speed, handling peak periods as they occur, without having to wait for recipients to deal with the requests. There are instances, however, where synchronous communication is required—for example, awaiting the result of a complex mathematical randomization calculation before making a decision. In these instances, it is essential that the interchange platform can handle this type of communication as well as the asynchronous form. An integration solution that uses messaging technologies requires little or no change to the underlying applications in order to guarantee communication among them. This decoupling of applications allows the different business units to keep operating through the clinical trial life cycle.
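The asynchronous behavior described here can be sketched with Python's standard `queue` and `threading` modules. This is an in-process illustration under simplifying assumptions, not a production messaging system. The patient identifiers are made up, and the recipient is deliberately started only after the sender has finished, to show that the sender never waited for it.

```python
import queue
import threading


def run_demo(patient_ids):
    inbox = queue.Queue()  # unbounded, so put() does not block the sender
    processed = []

    def recipient():
        while True:
            message = inbox.get()
            if message is None:  # sentinel: no more messages
                break
            processed.append(f"randomized {message}")

    # Sender: queue every request and move on without waiting for replies.
    for patient_id in patient_ids:
        inbox.put(patient_id)
    pending_when_sender_done = inbox.qsize()
    inbox.put(None)

    # The recipient starts only now, after the sender has already finished.
    worker = threading.Thread(target=recipient)
    worker.start()
    worker.join()
    return pending_when_sender_done, processed


pending, results = run_demo(["P-001", "P-002", "P-003"])
print(pending)  # 3 requests were queued before any was handled
print(results)
```

The queue absorbs peaks in demand: the sender's pace is set by its own workload, and the recipient drains messages in order whenever it is available.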

A CTIP can break down operational silos and deliver a holistic solution with a single view of all trial-related systems and application dependencies, along with server configuration and security information. A portal would allow users to authenticate once and gain access to the resources of multiple software systems. The CRO or pharmaceutical company can then access, search, and organize all study-related data and resources in one central location that catalogs the synonyms used in each connected component. Users have a consistent mechanism to navigate and find relevant information. Default repository settings can be modified to add workflow, define retention policies, and add new import/export templates and resource types. Systems exchange messages through the platform, allowing any system in any software version to trigger events and exchange data. The interchange platform automates the task of setting and enforcing regulatory and best practice standards and delivers a system-generated audit trail for regulatory compliance (see Figure 3).

Figure 3. A CTIP can facilitate communications between different applications, expediting the clinical trial process.

From the patient data-sharing example given earlier, a CTIP solution would:

  • Send a message to the IVR system to randomize patient 4321-AAC when an alerting message is sent from EDC that this patient was screened.

  • Allow the EDC system to continue screening patients. When the IVR system has randomized the patient, the platform catalogs the IVR patient data as 014321 and alerts the EDC system to store the randomization details against patient 4321-AAC.

  • Send a message to the eDiary system alerting it to add the new patient and then catalog the ePRO patient as 4321 in the master resource catalog.
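The three steps above can be sketched as a toy message hub with a master resource catalog. Everything here is a simplified assumption for illustration: the class name, topic names, handler functions, and the master key `patient-1` are hypothetical, and a real platform would add security, persistence, and error handling.

```python
from collections import defaultdict, deque


class InterchangePlatform:
    """Toy message hub with a master catalog of per-system patient IDs."""

    def __init__(self):
        self.catalog = defaultdict(dict)   # master key -> {system: local ID}
        self.handlers = defaultdict(list)  # topic -> subscribed callables
        self.pending = deque()             # queued (topic, master key) pairs

    def register(self, master, system, local_id):
        self.catalog[master][system] = local_id

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, master):
        self.pending.append((topic, master))  # the sender returns immediately

    def deliver_all(self):
        while self.pending:
            topic, master = self.pending.popleft()
            for handler in self.handlers[topic]:
                handler(master)


platform = InterchangePlatform()
edc_randomizations = {}  # stands in for randomization data stored in EDC
ediary_patients = []     # stands in for the eDiary patient list


def ivr_randomize(master):
    platform.register(master, "IVR", "014321")  # IVR's own identifier
    platform.publish("patient.randomized", master)


def edc_store_randomization(master):
    ids = platform.catalog[master]
    edc_randomizations[ids["EDC"]] = ids["IVR"]


def ediary_add_patient(master):
    platform.register(master, "ePRO", "4321")
    ediary_patients.append("4321")


platform.subscribe("patient.screened", ivr_randomize)
platform.subscribe("patient.screened", ediary_add_patient)
platform.subscribe("patient.randomized", edc_store_randomization)

# EDC announces the screened patient and carries on with its own work.
platform.register("patient-1", "EDC", "4321-AAC")
platform.publish("patient.screened", "patient-1")
platform.deliver_all()

print(dict(platform.catalog["patient-1"]))
print(edc_randomizations)
```

None of the three simulated systems knows another system's identifier for the patient. The catalog holds the synonyms, so the EDC system no longer has to track them itself.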

Cross-trial integration

Recently, life science companies have started to realize the strategic benefits of integrating clinical data across all trials. When properly collated, this collection of raw data—now sitting in those aforementioned silos—becomes an asset worth billions in cost savings and additional revenue. However, formatting data across trials that have not been previously designed for integration can be an extremely labor-intensive task, with a mass of information needing to be rekeyed across a number of systems.

The biggest question in clinical trial integration at present is whether to build integration capabilities into the point-to-point or Web services methods, or build a more flexible CTIP solution. When trying to answer this question, the largest area to consider is cross-trial integration.

With traditional point-to-point, single-trial integration systems, every change (e.g., to the version of a system such as EDC) reduces the effectiveness of the integration by potentially upsetting and delaying much of the work that has already been completed. With a platform integration approach, by contrast, it does not matter which version of a system or application is used, because the version has no impact on the integration capabilities of this model. A middleware product built onto a trial as a platform enables data to be imported and exported regardless of its original location. Under this strategy data has only one place to go, through the interchange platform, so study staff are able to see all activities at a glance.

A further integration challenge arises when sponsors want to review data across multiple system versions. A CTIP system makes it easier to share data among different trials and makes the version of a system being used (such as EDC) irrelevant. However, building a CTIP solution can be a challenge.

Other industries, such as manufacturing, have already pioneered the notion of universal data models using a platform approach that expresses the interrelation of business components with their applications. They have established the pros and cons of this approach, and the pharmaceutical industry is in the enviable position of using their model as a template of what works and what does not. The pharmaceutical industry can easily achieve this model-type as, with few exceptions, every trial involves the same links between notions such as protocol, patient, treatment group, and visit. Recently, data modelers in the pharmaceutical industry have expressed a realization of this by publishing suggested schemas aimed at preventing needless variation in the basic designs between trials.
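A shared model of this kind might look like the following sketch, which uses Python dataclasses to link protocol, patient, treatment group, and visit. The class and field names are illustrative assumptions and do not reproduce any published schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Visit:
    name: str
    study_day: int


@dataclass
class Patient:
    patient_id: str
    treatment_group: str
    visits: List[Visit] = field(default_factory=list)


@dataclass
class Protocol:
    protocol_id: str
    treatment_groups: List[str]
    patients: List[Patient] = field(default_factory=list)

    def enroll(self, patient: Patient) -> None:
        # Reject patients assigned to a group this protocol does not define.
        if patient.treatment_group not in self.treatment_groups:
            raise ValueError(f"unknown group: {patient.treatment_group}")
        self.patients.append(patient)


def patients_per_group(protocols: List[Protocol]) -> Dict[str, int]:
    """Cross-trial query that works because every trial shares one structure."""
    counts: Dict[str, int] = {}
    for protocol in protocols:
        for patient in protocol.patients:
            group = patient.treatment_group
            counts[group] = counts.get(group, 0) + 1
    return counts


trial_a = Protocol("TRIAL-A", ["active", "placebo"])
trial_a.enroll(Patient("A-1", "active", [Visit("screening", 0)]))
trial_a.enroll(Patient("A-2", "placebo"))

trial_b = Protocol("TRIAL-B", ["active", "placebo"])
trial_b.enroll(Patient("B-1", "active"))

group_counts = patients_per_group([trial_a, trial_b])
print(group_counts)
```

Because both trials use the same structure, the cross-trial query needs no per-trial mapping code, which is the practical payoff of preventing needless variation between trial designs.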


Although CDISC standards exist, there are no compulsory industry regulations governing the standardization of integration or exchange of information. Instead, CDISC standards work as a recommended approach, leaving sponsors and vendors to decide whether they want to adopt them or use a different method. Presently, no generic approach to integration has been adopted by the pharmaceutical industry. Integration is based on one or more systems seeing what the other can accept and transferring data accordingly. However, as CDISC becomes more widely adopted and the industry gets to the point where APIs are published, integration will become much more routine. This will ultimately lead to an increase in the number of companies adopting an integrated data point and using a platform integration system to facilitate cross-trial integration.

Implementing a clinical trial on a CTIP can also have a positive impact on the FDA approval process. With a CTIP approach, every event carried out during the trial is automatically captured and stored. Without this feature, study staff would have to go back through the trial retrospectively and code each event for the FDA audit, a very time-consuming and labor-intensive process.
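One way to picture automatic event capture is a platform that records an audit entry as a side effect of every message it handles. The sketch below is a minimal assumption of how that could work; the class names and event strings are hypothetical, and it does not claim to meet any specific regulatory requirement.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    source: str
    event: str


class AuditedPlatform:
    """Every message passing through send() is logged; callers cannot opt out."""

    def __init__(self) -> None:
        self._trail: List[AuditEntry] = []

    def send(self, source: str, event: str) -> None:
        # Capture happens here, at the single point all traffic passes through.
        self._trail.append(AuditEntry(datetime.now(timezone.utc), source, event))
        # Delivery to the receiving system would follow; omitted in this sketch.

    @property
    def trail(self) -> Tuple[AuditEntry, ...]:
        return tuple(self._trail)  # read-only snapshot for reviewers


audited = AuditedPlatform()
audited.send("EDC", "patient 4321-AAC screened")
audited.send("IVR", "patient 014321 randomized")

for entry in audited.trail:
    print(entry.timestamp.isoformat(), entry.source, entry.event)
```

Since all traffic flows through one place, the trail is produced during the trial and does not have to be reconstructed afterward.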


Most pharmaceutical and biotech companies are already looking into the benefits of integrating clinical trial data—whether it be single or cross-trial—with approximately one-third already having some sort of experience. In fact, some companies are now mandating integrated solutions and will not proceed with a vendor without one. However, these companies need to now recognize the next stage in clinical trial integration—which is the ability to integrate data, automate processes, and report status in real time, not only across different systems but also across different trials.

A CTIP approach has to be the next phase in clinical trial integration if pharmaceutical and biotech companies want to move forward. The rewards of such a system are significant and include cost savings, reduced time to market, and improved R&D impact due to the ability to view data across all previous trials. It is also important to consider how implementing such a system can ultimately make products more competitive. For example, the ability to integrate data across trials can help sponsors research correlations between efficacy and product characteristics outside the narrow focus of one single trial, ultimately revealing new advantages to existing products.


1. J.G. Harris and S. Cantrell, "Enterprise Solutions Integration Technology," Next Generation Enterprise Solutions, Issue 3, March 2002.

2. F. Ferrara, W. Grimson, and P.A. Sottile, "The Holistic Architectural Approach to Integrating the Healthcare Record in the Overall Information System," in Proceedings MIE '99, IOS Press, 847–852.

Solomon Shacter is director, software engineering, ClinPhone, University Research Park, 101 Academy, Suite 250, Irvine, CA 92617.
