eSource Records in Clinical Research

Figure 1: The Problem


There has been a recent thrust within the pharmaceutical industry to promote the transformation of the clinical trial process. In 2008, the Clinical Trials Transformation Initiative (CTTI), a public-private partnership with the FDA and Duke University, was established to identify practices that, through broad adoption, will increase the quality and efficiency of clinical trials. A major CTTI publication on monitoring practices has helped to transform the monitoring of clinical trials.1

According to the Food and Drug Administration (FDA) eSource Guidance of 2013: “Electronic source data are data initially recorded in electronic format. They can include information in original records and certified copies of original records of clinical findings, observations, or other activities captured prior to or during a clinical investigation used for reconstructing and evaluating the investigation.”2 Saying it differently, eSource data are original subject data that are collected digitally without having to record the data on a piece of paper first, and then transcribe it to an electronic data capture (EDC) system.

Many have shown that changes made to the study database as a result of source document verification (SDV) affect neither study results nor the interpretation of those results.3,4,5 Therefore, despite the support of regulators, including the FDA2,6 and the European Medicines Agency (EMA)7, it is difficult to explain 1) why many of those managing clinical research operations still respond to a perceived risk instead of adopting systematic and scientifically substantiated approaches for data management and cleaning, and 2) why there has been tepid adoption of eSource solutions coupled with “intelligent” monitoring of clinical trials. Perhaps failure to appreciate the wisdom contained in the quotes below explains why those doing clinical research spend a lot of their time on non-productive activities.

  • Confucius: “Life is really simple, but we insist on making it complicated”
  • Leonardo da Vinci: “Simplicity is the ultimate sophistication”
  • William James: “The art of being wise is the art of knowing what to overlook”

There is broad agreement across the pharmaceutical industry that it no longer makes sense to collect clinical trial data by writing results and observations first on a piece of paper, then to transcribe those data into an EDC system, and then to save that paper so that a clinical research associate (CRA) can check how precisely a person transcribed information from a piece of paper into the EDC system. Sadly, we continue to see residual, frequent emphasis on checking these transcription activities when electronic systems can easily eliminate this highly inefficient process. Doing the math, moving to electronic systems can provide a dramatic return on investment by 1) reducing labor costs at the clinical research sites and sponsoring companies, 2) reducing travel costs, and 3) providing improvements in data quality.

In 2006, the Electronic Source Data Interchange (eSDI) Group within CDISC issued a paper entitled “Leveraging the CDISC Standards to Facilitate the use of Electronic Source Data within Clinical Trials.”8 The eSDI document was the product of the CDISC eSDI Initiative, the purpose of which was “to investigate the use of electronic technology in the context of existing regulations for the collection of eSource data (including that from eDiaries, EHR, EDC) in clinical trials for regulatory submission by leveraging the power of the CDISC standards, in particular the Operational Data Model (ODM).” The overall goal of the initiative was to make it easier to collect data once in an industry standard format.

Eight years later, on Jan. 29, 2014, FDA addressed these issues in a webinar reviewing the final guidance for industry titled “Electronic Source Data in Clinical Investigations.”6 Presenters included Leonard V. Sacks, associate director, Office of Medical Policy (CDER); Ron Fitzmartin, Office of Strategic Programs (CBER); and Jonathan S. Helfgott, associate director for risk science (acting), Office of Scientific Investigations (CDER). Clearly, FDA is supporting efforts by industry to adopt eSource solutions.

Even earlier, however, in 2010, the EMA GCP Inspectors Working Group eSource subteam released the “EMA Reflection Paper On Expectations for Electronic Source Data and Data Transcribed to Electronic Data Collection Tools in Clinical Trials.”7 The reflection paper—which is very similar in nature to an FDA guidance—was followed up by direct exchanges with industry representatives (the Society for Clinical Data Management and the eClinical Forum, among others). Those discussions are captured in a series of questions & answers that are frequently updated on the EMA website.  

The requirement by EMA and ICH E6 (ref. 9) of an independent contemporaneous investigator copy (ICIC) of the case report form (CRF) has led to numerous and passionate debates among the professional and industry associations. The basic concept of the ICIC is still far from being consistently understood and implemented in the different geographies. Regardless, the ICIC has had a considerable impact and has brought about an extension of the concept of ALCOA: source data should be not only attributable, legible, contemporaneous, original, and accurate, but also complete, consistent, enduring, and available when needed. Because regulators have not reached a consensus on how to optimally generate and review electronic source records, apprehension and uncertainty remain among solutions vendors, contract research organizations, and pharmaceutical company sponsors.


Direct access to source data

Section 6.10 of ICH E6 Good Clinical Practice (ref. 9) states that “The sponsor should ensure that it is specified in the protocol or other written agreement that the investigator(s)/institution(s) will permit trial-related monitoring, audits, IRB/IEC review, and regulatory inspection(s) by providing direct access to source data/documents.” There are no statements that paper records must be kept, just that there needs to be direct access to source data/documents.

The FDA, in September 2013, issued a final “Guidance for Industry: Electronic Source Data in Clinical Investigations.”2 This guidance is consistent with the EMA “Reflection Paper on Expectations for Electronic Source Data and Data Transcribed to Electronic Data Collection Tools in Clinical Trials.”7

Regulators seem ahead of the game in supporting the transformation of how the pharmaceutical industry manages the performance of clinical trials. The following summarizes aspects of the perspectives of FDA and EMA.


FDA perspective

According to the FDA Guidance, source data include all findings, observations, or other activities in original and certified copies of original records, which are used by regulators to reconstruct and evaluate a clinical trial. All interested parties need to review source data and ensure adequate protection of human clinical trial subjects and the quality and integrity of the clinical data. Source data should be ALCOA and must meet the regulatory requirements for recordkeeping. FDA has proposed that electronically capturing and transmitting source data to the eCRF should:

  1. Encourage entering source data during a subject’s visit, where appropriate
  2. Eliminate unnecessary duplication of data entry and transcription errors
  3. Facilitate remote monitoring of data and promote real-time access for data review
  4. Facilitate the collection of accurate and complete data

According to the FDA, data can be entered into the eCRF either manually or electronically using “direct entry of data into the eCRF.”  For example, there are many data elements in a clinical investigation, such as blood pressure, that can be entered directly into the eCRF at the time of the office visit. FDA also indicated that “data elements originating in an electronic medical record (EMR) can be automatically transmitted directly into the eCRF.  Unlike a direct transmission to an eCRF from instruments or medical devices, EMRs can use intervening processes (e.g., algorithms for the selection of the appropriate data elements).” In cases where the EMR serves as the source, the pertinent data for the subjects in the clinical study should be made available for review during regulatory inspections and monitoring visits.

And although many may know this, the FDA added that the eCRF should be capable of recording metadata (e.g., data about the data identifying who entered or generated the data and when the data were entered or generated).  Also, changes to the data must not obscure the original entry, and must record who made the change, when, and why. Data element identifiers should be attached to each data element as it is entered or transmitted by the originator into the eCRF. Data element identifiers for each subject should identify the:

  1. Originators of the data element, whether entered manually (e.g., by the clinical investigator(s)) or automatically by devices
  2. Date and time that the data element was entered into the eCRF

These data element identifiers allow sponsors, FDA, and other authorized parties to examine the audit trail of the eCRF data and to provide information that will allow FDA to reconstruct and evaluate the clinical investigation.
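The requirements described above (originator and timestamp on every data element, and changes that never obscure the original entry) can be illustrated with a minimal sketch. The class and field names below are hypothetical and are not taken from the FDA guidance or from any particular EDC product; the sketch only shows one way an append-only history could satisfy those points.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One immutable version of a data element, with its identifiers."""
    value: str
    originator: str          # person or device that entered the value
    entered_at: datetime     # date and time the value entered the eCRF
    reason: Optional[str]    # why the value changed; None for the first entry


@dataclass
class DataElement:
    """Hypothetical eCRF field that keeps every version ever entered."""
    name: str
    history: List[AuditEntry] = field(default_factory=list)

    def enter(self, value: str, originator: str) -> None:
        if self.history:
            raise ValueError("element already has an entry; use change()")
        self._append(value, originator, None)

    def change(self, value: str, originator: str, reason: str) -> None:
        if not self.history:
            raise ValueError("nothing to change; use enter() first")
        if not reason.strip():
            raise ValueError("a change must record why it was made")
        self._append(value, originator, reason)

    def _append(self, value: str, originator: str, reason: Optional[str]) -> None:
        # Versions are only ever appended, so the original entry stays visible.
        stamp = datetime.now(timezone.utc)
        self.history.append(AuditEntry(value, originator, stamp, reason))

    @property
    def current(self) -> str:
        return self.history[-1].value

    @property
    def original(self) -> str:
        return self.history[0].value


bp = DataElement("systolic_bp_mmHg")
bp.enter("182", originator="investigator_01")
bp.change("128", originator="investigator_01", reason="transposed digits at entry")
```

After the correction, `bp.current` holds the new value while `bp.original` and the full `bp.history` remain available for anyone reconstructing the investigation.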


EMA perspective

The following are from the EMA Reflection Paper7:

In order to meet the requirements of maintenance of an independent contemporaneous copy of source records, a certified copy of the data should be created before the transfer to the sponsor and retained at the investigator site. The method of transfer should be validated.

Contemporaneous means: “The recording of a clinical observation is made at the same time as when the observation occurred. If this is not possible the chronology of events should be recorded. An acceptable amount of delay should be defined and justified prior to trial recruitment.”

The source data and their respective capture methods should be clearly defined prior to trial recruitment (i.e., in the protocol or study-specific source data agreement). The sponsor should describe which data will be transferred, the origin and destination of the data, the parties with access to the transferred data, the timing of the transfer, and any actions that may be triggered by real-time review of those data. There should only be one source defined at any time for any data element.

The EMA has focused on the issue of source data control and specifically stated that “the investigator should maintain the original source document or a certified copy (Requirement 5, ICH GCP 2.11, 5.15.1); that source data should only be modified with the knowledge or approval of the investigator (Requirement 6, ICH GCP 4.9.3, 4.9.4 and chapter 8); and the sponsor should not have exclusive control of a source document. (Requirement 10, ICH GCP 8.3.13).”

The EMA also requires that “all data generated in a clinical trial relevant to patient care must be made available to the investigator at all times during and after the trial, and all data held by the sponsor that has been generated in a clinical trial should be verifiable to a copy not held by the sponsor.”  The EMA also stated that “the requirements above are not met if data are captured in an electronic system and the data are stored on a central server under the sole control of the sponsor.” The EMA requires that a “contemporaneous certified copy of the data should be retained at the investigator site in addition to the record maintained on a central server.”

In addition, the business requirements for the provider of any such electronic system can be summarized as follows:

  • The provider should be a separate legal entity from the sponsor and from the investigator, with a detailed contract defining the duties and responsibilities of each party
  • An assured mechanism for investigators to be informed about, and/or to approve of, modifications to the data should be clearly established, and this control by the investigator should be demonstrable
  • Any changes to the data should be captured by the audit trail

Industry and CRO perspective 

The variations in regulatory understanding have confounded many stakeholders. In order to find a practical way forward, industry and professional associations should concentrate efforts on identifying and addressing concerns around clinical data quality. Specific provisions associated with quality assurance and the transparency of data flow should be present in the protocol, protocol-associated plans, and any contractual agreements with the vendors and study sites. Failure to create, implement, and document such risk mitigation associated with source data would be a major shortcoming, as it could:

  • Hinder the scientific interpretability of the data and render any clinical decision support impossible
  • Violate fundamental patient rights

Not surprisingly, the EMA has directly warned sponsors and contract research organizations (CROs) that documentation, procedural, and systemic shortcomings will compromise (or even block) application package submission to the Committee for Medicinal Products for Human Use (CHMP). Similar reactions to a blatant lack of compliance are to be expected in the case of NDAs (CDER), BLAs (CBER), and PMAs (CDRH).



Source records during development process

Despite the philosophical and historical differences explained earlier, it is possible to implement a clinical data acquisition environment that meets all recognized regulatory expectations. Source records, for example, can be maintained in many formats, including but not limited to paper records and electronic records (e.g., data files, XML files, and PDF files). Regardless of the medium or the format, compliant maintenance of source records:

  • Assures the ability of independent auditors to confirm the accuracy of data presented to regulatory agencies by pharmaceutical and device companies
  • Allows for the reconstruction of study data in the event of catastrophic loss of data by the sponsor

Source data should not be the same “version” of the data that the sponsor (or CRO) “controls.” Confirmation of the accuracy of the data submitted to regulatory agencies can be accomplished by reviewing the validation of electronic systems used as original data sources, by manually reviewing paper records and comparing them with the results submitted by sponsoring companies, and/or by performing a “data compare” of electronic source records against data stored in locked databases. Ideally, evaluating extracts of datasets, or entire datasets, against CRF data elements could provide corroborating evidence, and ultimately a measure of confidence, as to the accuracy of reported assessments and endpoints. This, however, would require access to the electronic source data from EMRs, machines used in the clinical trial process (e.g., raw MRI records), and EDC databases prior to transfer to the sponsor of the clinical trial. While all of these tasks can be used to assure precise data (i.e., data elements agree between databases), these processes do not holistically address data accuracy. Sensitivity analyses done on real clinical trial datasets have repeatedly shown that database changes resulting from SDV and source data review (SDR) have little or no impact on final means and standard deviations (SD).3,4
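As an illustration of the “data compare” mentioned above, the following is a minimal sketch that checks electronic source records against a locked study database, keyed by subject and visit. The data layout, field names, and values are invented for the example. Note that agreement between the two datasets demonstrates precision in the sense used above (the data elements agree between databases), not that the recorded values are clinically accurate.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]                    # field name -> recorded value
Dataset = Dict[Tuple[str, str], Record]    # (subject_id, visit) -> record


def data_compare(source: Dataset, database: Dataset) -> List[str]:
    """List every disagreement between source records and the study database."""
    findings: List[str] = []
    for key in sorted(set(source) | set(database)):
        subject, visit = key
        label = f"subject {subject}, visit {visit}"
        if key not in database:
            findings.append(f"{label}: missing from study database")
            continue
        if key not in source:
            findings.append(f"{label}: no source record")
            continue
        src, db = source[key], database[key]
        # Compare field by field, including fields present on only one side.
        for name in sorted(set(src) | set(db)):
            if src.get(name) != db.get(name):
                findings.append(
                    f"{label}, {name}: source={src.get(name)!r} "
                    f"database={db.get(name)!r}"
                )
    return findings


# Hypothetical example data
source_records = {
    ("001", "V1"): {"sbp": "128", "dbp": "82"},
    ("002", "V1"): {"sbp": "135", "dbp": "88"},
}
locked_database = {
    ("001", "V1"): {"sbp": "128", "dbp": "82"},
    ("002", "V1"): {"sbp": "153", "dbp": "88"},
}
findings = data_compare(source_records, locked_database)
```

With the example data, the only finding is the systolic blood pressure for subject 002, where the database value differs from the source record.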

Therefore, key areas of focus during monitoring, audits or inspections are the following:

  1. Were the clinical site personnel qualified, as documented by education and experience, to perform the activities for which they were responsible?
  2. Were the clinical site personnel properly trained on the protocol and the electronic systems used for the trial (e.g., EDC, eSource storage, etc.)?
  3. Did the clinical site understand and follow the protocol?
  4. Were there any differences in outcomes between sites, suggesting lack of understanding of the protocol or possible fraud?
  5. Were basic regulations followed that assure patient safety (e.g., proper informed consent and safety monitoring)?

The above is certainly not exhaustive—but it is an example of the strict minimums that assure quality.


Formats of source records

Sites may maintain source records in many formats, including electronic data stored by an independent third party prior to the source data arriving at the EDC database. 

Source records could come from direct data entry at the time of the study visit, from the EMR, from patient reported outcomes (PROs), and from output from machines. The sponsor should have no more access to electronic source records than if given access to a site’s paper source records. One advantage of the source record existing as a PDF file in the cloud, with read-only system privileges, is that it is impossible for registered users to manipulate the file. It is the same as viewing a document on the FDA or EMA website, where one can view but cannot modify the file. PDF files can also be uploaded into most EMR systems and searched if saved as “readable” files. Generating electronic source as .XML can further enhance the ability to search.

One way to create an ICIC of the site’s submitted study data is to adopt a data flow such as that depicted in Figure 2. In this model, at the time of the subject visit, rather than create a paper record of a study-related observation—if they have the means—the site staff can enter the information directly into a properly configured EDC system. As depicted in the figure, the system then takes a snapshot of the entered information (e.g., pdf, xml), transmits the snapshot to a repository controlled by the site (step 3), receives acknowledgment that the data were received without error (step 4), and then transmits the data to the EDC database (step 5). This approach is consistent with regulatory concerns in that sites at all times maintain control over original data. This process also reduces the site’s work burden by eliminating maintenance of paper copies and subsequent transcription to EDC systems, and provides sponsors with real-time data with which to make timely decisions. In today’s world of cloud-based computing, there is a growing recognition of the benefits of capturing source records using databases and web applications, vs. local data storage devices.
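The sequence described above (snapshot, transmission to the site-controlled repository, acknowledgment, then transmission to the EDC database) can be sketched as follows. This is an illustrative outline only: the class names are hypothetical, JSON stands in for the PDF or XML snapshot, and using a SHA-256 digest as the acknowledgment is an assumption of the sketch rather than something the article specifies.

```python
import hashlib
import json
from typing import Dict, List


class SiteRepository:
    """Hypothetical repository controlled by the site; holds the snapshots."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, bytes] = {}

    def store(self, snapshot: bytes) -> str:
        # The digest of what was actually received serves as the acknowledgment.
        digest = hashlib.sha256(snapshot).hexdigest()
        self._snapshots[digest] = snapshot
        return digest

    def get(self, digest: str) -> bytes:
        return self._snapshots[digest]


class EdcDatabase:
    """Hypothetical study database that later goes to the sponsor."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def insert(self, form: dict) -> None:
        self.rows.append(form)


def submit_form(form: dict, repo: SiteRepository, edc: EdcDatabase) -> str:
    # Take a snapshot of the entered information.
    snapshot = json.dumps(form, sort_keys=True).encode("utf-8")
    expected = hashlib.sha256(snapshot).hexdigest()

    # Step 3: transmit the snapshot to the site-controlled repository.
    ack = repo.store(snapshot)

    # Step 4: confirm the repository received the data without error.
    if ack != expected:
        raise RuntimeError("site repository did not acknowledge an intact copy")

    # Step 5: only now transmit the data to the EDC database.
    edc.insert(form)
    return ack


repo, edc = SiteRepository(), EdcDatabase()
visit_form = {"subject": "001", "visit": "V1", "sbp": "128", "dbp": "82"}
receipt = submit_form(visit_form, repo, edc)
```

Because the EDC insert happens only after the acknowledgment check, the site's copy always exists before, or at the same moment as, the sponsor-facing record.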


Figure 2. eSource Data Flow

Maintaining/hosting source records

There is no legislation or regulatory guidance that prohibits the investigator from delegating the hosting and maintenance of the site’s ICIC of original data via contractual agreement to an outside vendor or CRO. This applies even if the system is set up and financed by the sponsor. At the same time, it is the responsibility of the sponsor to have control of the study database and the data management process. This task is often delegated to a CRO, via a written agreement. These two tasks can very well be delegated to two independent CROs. However, when the same CRO takes on both tasks in the same study, independent systems must be in place to control access by employees of the CRO to the data management and eSource software products. If the appropriate controls of systems are not in place, this could indicate a conflict of interest where the independence of the CRO may be questioned.

A key question is whether it is reasonable for the EMA, FDA, and other regulatory bodies to view independent CROs performing EDC and concurrent data management services as independent of sponsors. Clearly, no one other than the clinical site should enter or modify data and all transactions must be subject to an audit trail. So the fact that a CRO is performing data management should not disqualify it from hosting eSource data as long as:

  1. The protocol defines that eSource records are being used
  2. The eSource records are independent of the study database to be transferred to the sponsor
  3. There are agreements with the study sites to allow the CRO to host the eSource records generated by EDC systems and other sources such as electronic patient reported outcomes (ePROs)
  4. There are proper controls for access to the eSource records
  5. If there is a need to perform a software fix, it must be performed under formal change control procedures
  6. There is clear segregation of roles and responsibilities, so that the same person or functional area is not a delegate on behalf of both the site(s) and the sponsor
  7. It is all clearly documented within policies and standard operating procedures (SOPs) that the operations associated with the hosting of eSource records employ staff members who are independent of the study conduct teams

To comply with regulatory requirements and good business practices, a detailed contract should be in place defining the duties and responsibilities of the service provider, enabling the sponsor and/or investigator to transfer some of their tasks, but to retain control of their responsibilities. The contracts/agreements with the sponsor and with the investigator need to be clear as to the role of the service provider and to what extent this relates to the specific responsibilities of the investigator and those of the sponsor. The method used to ensure that the investigator is informed about, and/or approves of, modifications to the data should be clearly established, and this control by the investigator should be clearly demonstrable. 

Companies providing EDC and data management services should be viewed, in our opinion, not as sponsors but just like a central lab that hosts source data. In terms of hosting the eSource data, the contractual, procedural, and physical provisions should be instituted by the investigator or at least on behalf of the investigator, assuring that the sponsor cannot obtain control over the site’s source records. As a result, the following rules should be considered when assessing the potential hosting of eSource data by an independent third party:


  1. The specific rules concerning control of eSource records should be clearly stated in a contract with the study site and also cited in the protocol
  2. The eSource record, including the complete audit trail of changes, should reside in the independent third-party database prior to, or at least concurrently with if technically feasible, the data element entering the study database.
  3. Access to eSource data should be limited to “view only.”
  4. The eSource database should not be involved in any data modification activities on behalf of the sponsor, to be considered sufficiently independent.
  5. Any contractual agreements should be made on behalf of the investigator, explicitly stating that the sponsor cannot obtain control over the site’s source records.




Impact on EDC users

The stakeholders and their respective roles and responsibilities in conducting clinical research are well understood. Industry recognizes three essential stakeholders: 1) sponsors, 2) clinical sites, and 3) third-party service providers (see Figure 3).

Figure 3: Roles and Responsibilities When Using eSource Processes

The good news is that adopting an eSource methodology, such as that described in Figure 2, imposes no substantive changes in roles and responsibilities vs. EDC systems, as outlined in Figure 3:

Stakeholders continue to adhere to good clinical practices (GCP) and as part of study setup, there now must be agreements (step 1) describing how all study information will be shared among investigational sites and third parties (steps 2, 3).  

As with any EDC study, appropriate documents must be on file, and testing results and sign-off must be in place prior to data collection (steps 4-6). During study conduct, the clinical trial database (step 7) is configured to report only the data specified in the protocol. Once a patient consents to participate in the trial, trained site staff can then begin to collect patient data (steps 7-11). 

As is current practice, these data can be transmitted to either the EDC server at a CRO or other service provider (step 12). Alternatively, eSource data can be transmitted to a dedicated eSource Storage System (step 13). 

Site staff, regulators, and CRAs can then access the eSource system when permission is granted by the study site, which is the current practice when there are paper source records.
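A minimal sketch of this access model, combined with the earlier “view only” rule, is shown below. The class, method names, and user identifiers are hypothetical; the sketch only illustrates that records are filed once, that nobody can overwrite them, and that only the study site can extend viewing rights to others such as CRAs or inspectors.

```python
from typing import Dict, Set


class ESourceStore:
    """Hypothetical view-only store whose access list is managed by the site."""

    def __init__(self, site_staff: Set[str]) -> None:
        self._site_staff = set(site_staff)
        self._viewers: Set[str] = set(site_staff)
        self._records: Dict[str, str] = {}

    def file_record(self, record_id: str, content: str) -> None:
        # A record is written once, when the snapshot arrives, and never updated.
        if record_id in self._records:
            raise PermissionError("eSource records cannot be overwritten")
        self._records[record_id] = content

    def grant_view(self, granted_by: str, user: str) -> None:
        if granted_by not in self._site_staff:
            raise PermissionError("only the study site can grant access")
        self._viewers.add(user)

    def view(self, user: str, record_id: str) -> str:
        if user not in self._viewers:
            raise PermissionError(f"{user} has not been granted access by the site")
        return self._records[record_id]


store = ESourceStore(site_staff={"coordinator_01"})
store.file_record("001-V1", "sbp=128; dbp=82")
store.grant_view(granted_by="coordinator_01", user="cra_07")
cra_copy = store.view("cra_07", "001-V1")
```

In this sketch a sponsor user who has not been granted access by the site cannot read the record, and no user, including site staff, can replace a filed record.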

The fundamental impact of adopting eSource solutions has to do with raising awareness of the role of the clinical sites in managing access to their electronic study records vs. the paper records that sites have historically generated and maintained.


Discussion and conclusion

Transformative innovations often 1) require new competencies, 2) disrupt existing organizational competencies, 3) require a change in business models, and 4) result in social and systemic change. While these innovations lead to widespread productivity enhancements, they also challenge the status quo and often meet resistance to change.

Just digitizing a paper process can be viewed as an incremental innovation. Organizations, particularly large and successful ones, are strongly tempted to view modifying and building on existing processes, and improving on existing skill sets, as preferable to implementing transformational changes. Substantive change is always difficult. The larger the organization, the more difficult it becomes to incorporate such changes. To bravely undertake a quantum performance leap requires a fundamental rethinking of the issues and a willingness to make the kinds of changes required to take advantage of the benefits inherent in newly available technology. Such is the case with the transition from paper to electronic records.

Paper-based record keeping in clinical trials has a long and storied history and has served the community well for many years. It persists, in part, due to the interrelated factors of risk aversion, lack of knowledge, regulatory misperceptions, and perceived ambiguities. A variety of digital technologies, as well as increasingly clear guidance from Western regulatory agencies, present the opportunity for change. This change may have considerable positive ramifications in terms of decreased burden on sites and sponsors, and ultimately the community at large, as a significant cost burden in the conduct of clinical trials may potentially be reduced or eliminated.

In this paper, we have demonstrated ways to satisfy regulatory concerns about data integrity and investigator control over source records when incorporating 21st-century tools and processes as standard elements of clinical development operations. These solutions include 1) data entered directly into EDC systems at the time of the clinic visit, and 2) seamless integration of the EDC study database with data residing in EMR and ePRO systems. Integrating eSource solutions with the EDC database will result in the ability 1) to review data in real time, 2) for monitors and clinical sites to rapidly issue and respond to data queries, and thus improve quality, and 3) to reduce time and materials spending associated with data acquisition and review.



Jules T. Mitchel* is President, Target Health Inc., email: [email protected]; Jonathan Helfgott is Director of Regulatory Affairs, Stage 2 Innovations; Tom Haag is Data Integrity Process Expert, eClinical Quality Assurance, Novartis Pharmaceuticals Corporation; Silvana Cappi is Executive Director, Global Biometrics, Ferring Pharmaceuticals A/S; Imogene McCanless is Senior Vice President, Biometrics and Regulatory Affairs, TransTech Pharma; Yong Joong Kim is Executive Director, Target Health Inc.; Joonhyuk Choi is Senior Director, Software Development, Target Health Inc.; Timothy Cho is Associate Director, Application Development, Target Health Inc.; and Dean Gittleman is Senior Director, Operations, Target Health Inc. 

* To whom all correspondence should be addressed.




  1. Morrison B, Cochran C, Giangrande J, et al. 2011, “Monitoring the Quality of Conduct of Clinical Trials: A Survey of Current Practices,” Clinical Trials, 8:342–349.
  2. FDA, September 2013. Guidance for Industry: Electronic Source Data in Clinical Investigations.
  3. Sheetz N, Wilson B, Benedict J, Huffman E, 2014, “Evaluating Source Data Verification as a Quality Control Measure in Clinical Trials,” Therapeutic Innovation & Regulatory Science, 48:671-680.
  4. Mitchel J, Kim YJ, Choi JH, et al. 2011, “Evaluation of Data Entry Errors and Data Changes to an Electronic Data Capture (EDC) Clinical Trial Database,” Drug Information Journal, 45:421-430.
  5. Smith CT, Stocken DD, Dunn J, et al. 2012, “The Value of Source Data Verification in a Cancer Clinical Trial,” PLoS ONE, 7(12):e51623.
  6. FDA Webinar on a Final Guidance for Industry Titled “Electronic Source Data in Clinical Investigations.”
  7. EMA, 2010. Reflection Paper On Expectations for Electronic Source Data and Data Transcribed to Electronic Data Collection Tools in Clinical Trials (EMA/INS/GCP/454280/2010).
  8. CDISC, 2006. Leveraging the CDISC Standards to Facilitate the use of Electronic Source Data within Clinical Trials.
  9. Guidance for Industry, 1996. E6 Good Clinical Practice: Consolidated Guidance, ICH.


Minor Disagreement with concept of "eSource Storage System"

The article starts out great and I appreciate the fact that it points out philosophical differences between the FDA and EMA on saving .pdf versions of "source data".

I don't like that a subtle "requirement" is injected for the investigator to retain a .pdf version of the source data either in a file system they maintain (and therefore must validate to some extent - way beyond what normal controls might be in place at a typical investigator site) or contract with a trusted "eSource Storage" provider. We have enough extraneous systems, this concept just throws water on an oil fire (don't do it - the water turns to steam, carrying droplets of oil into the air - it is like an explosion).

Any chance that one or more of the authors represent a company that offers this unneeded resource?

By unneeded I mean - a .pdf copy of source data adds nothing to how the investigator controls the data they enter. Any record with even a minimal digital signature / hash / manifest and Machine Authentication will serve this purpose and if properly designed you will be able to easily identify attempts to hard delete record versions.

Separating the content authors from the database administrators is sufficient to prevent fraud and ensure the provenance of source data. Creating additional copies of the same data in a different format is just not needed and serves no purpose, unless you suspect the investigator is conspiring with the eSource or EDC DBA. In which case, creating .pdf versions of fake or altered data is the easiest aspect of their fraud.

Since the Sponsor is neither the DBA nor the content author the potential for sponsor fraud is eliminated by design. No need for .pdf copies of data the sponsor should not be able to edit without an audit trail. If the sponsor can make changes without an audit trail then the system is not 21CFR11 compliant.

It should be that simple. Unfortunately it is not.
