Electronic Diaries: Source Data Out in the Open


Applied Clinical Trials

Applied Clinical Trials Supplements, Supplements-02-02-2004

Starting points for meeting eSource data regulatory requirements

During 2002, the Clinical Data Interchange Standards Consortium (CDISC) and CenterWatch, the Clinical Trials Listing Service, undertook a survey on the adoption of, and attitudes to, electronic data capture (EDC) within the biopharmaceutical industry. The survey revealed that one of the biggest factors slowing the uptake of EDC was the concern over how to interpret the various regulations and guidelines.

One particular area of uncertainty is the use of electronic patient diaries (ePDs) and how such systems meet the regulations regarding source data and source documents. While ePDs are in their relative infancy compared with electronic case report forms (eCRFs), their use within clinical trials is on the increase, and there has been some concern expressed about their compliance with the regulations.

The aim of this article is to examine the issue of ePDs and source data by reviewing the regulations and proposing a practical solution. The solution proposed is not an end in itself; it is there for illustrative purposes and to show that there are practical ways of solving the problem. This solution is not proposed as the best, nor is it the only possible solution. It is presented to provide a starting point for constructive discussion and debate on an issue that, in the opinion of the author, is slowing the acceptance of a technology that can aid and speed the drug development process.

Regulatory background

The requirements relating to source data are found in two Food and Drug Administration (FDA) regulations, 21 CFR Parts 11 and 312,1,2 and in the International Conference on Harmonization's (ICH) Good Clinical Practice (GCP) guidelines.3 In addition to the regulations, the FDA has issued the Guidance for Industry, Computerized Systems Used in Clinical Trials (CSUCT).4

Two important definitions are found within the ICH GCP document for source data and source documents:

Source Data: All information in original records and certified copies of original records of clinical findings, observations, or other activities in a clinical trial necessary for the reconstruction and evaluation of the trial. Source data are contained in source documents (either original records or certified copies).

Source Documents: Original documents, data, and records (e.g., hospital records, clinical and office charts, laboratory notes, memoranda, subjects' diaries or evaluation checklists, pharmacy dispensing records, recorded data from automated instruments, copies or transcriptions certified after verification as being accurate copies, microfiches, photographic negatives, microfilm or magnetic media, x-rays, subject files, and records kept at the pharmacy, at the laboratories, and at medico-technical departments involved in the clinical trial).

Figure 1. Local server with a disconnected ePD. This option relies on the device being "docked" at intervals to allow connection to the investigator's local PC or server.

Figure 2. Central or trusted server with a disconnected ePD. This option, too, relies on the device being "docked" at some point to allow connection to the server.

When subjects use ePDs, they enter data directly into an electronic system; no paper is employed. It is here that we encounter eSource, which is defined in eClinical Trials: Planning & Implementation8 as:

"Source data (per ICH) captured initially into a permanent electronic record."

It is this initial storage in electronic form, rather than on paper, that sits at the center of the eSource debate.

Practical considerations

Three storage locations for data entered by an investigator into an eCRF have been noted previously.5 These repositories are equally applicable to subject-reported data:

  • the data is stored locally on the investigator's own PC or a server on his or her own network (the local model);

  • the data is stored on servers operated by the sponsor or an Application Service Provider (ASP) acting on behalf of the sponsor (the central model); or

  • the data is stored on servers held by a Trusted Third Party (TTP) (the trusted model).

When considering the three scenarios from a practical rather than a regulatory perspective, the central model and the trusted model are essentially the same in that the servers are remote from both the investigator and the subject. It is worth noting that the two models were distinguished over concerns regarding the attributability of the data and over the ability to modify the data without the knowledge of the investigator.5

The combination of these scenarios and the myriad of technologies offered by the ePD vendors can result in a complex picture. To aid the discussion within this article, the ePD technologies have been placed into three categories:

Connected, where the subject has to be connected to the server for the entire session while data is entered. Examples of connected systems are a Web site that captures subject data or a telephone using an interactive voice response system (IVRS).

Semiconnected, where the data is stored on a device before being sent to the server at frequent intervals (daily or overnight). An example is a personal digital assistant (PDA) equipped with mobile data communications capability or a dial-up modem connecting over a telephone line.

Disconnected, where data is stored on a device with the device having no communication ability of its own. Data is only passed from the device to a server when a connection is made, generally at infrequent intervals (for example, on a visit to the investigator or at the end of the study). An example is a PDA that can only communicate when it is docked in a cradle connected directly to a PC.

Figure 3. Central or trusted system with semiconnected or connected ePD. Connection is via fixed line, mobile or wireless communications links (possibly utilizing the Internet) to the server.

Figure 4. Logical data flow model.

Each of the three ePD technologies could in theory be combined and used in conjunction with one of the three server scenarios, resulting in nine possible deployment configurations.

It is desirable to remove some of the possible options so that the regulatory discussions can focus on systems that offer practical implementations. Experience with deploying ePD technologies suggests that two of the combinations, the local server model paired with either a connected or a semiconnected ePD system, are not practical. This opinion is based primarily on three concerns:

  • The need to provide a reliable data connection between the subject's device and the investigator's system

  • The need to install and maintain all of these systems

  • The need to validate all of the installations.

Because the central and trusted models are treated here as practically equivalent, four practical system types emerge: the local investigator system with a totally disconnected ePD, or a central or trusted server with a disconnected, semiconnected, or connected ePD technology. Figures 1-3 and Table 1 illustrate the four systems.
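The counting in this section can be checked with a short sketch. It is illustrative only: the names `SERVER_MODELS`, `EPD_CATEGORIES`, `is_practical`, and `system_type` are hypothetical labels introduced here, and the grouping of central and trusted servers follows the practical equivalence described above rather than any regulatory rule.

```python
from itertools import product

SERVER_MODELS = ("local", "central", "trusted")
EPD_CATEGORIES = ("connected", "semiconnected", "disconnected")


def is_practical(server, epd):
    """Exclude a local server paired with a connected or semiconnected ePD."""
    return not (server == "local" and epd in ("connected", "semiconnected"))


def system_type(server, epd):
    """Group central and trusted servers together, since both are remote."""
    server_class = "local" if server == "local" else "central or trusted"
    return (server_class, epd)


all_configurations = list(product(SERVER_MODELS, EPD_CATEGORIES))
practical = [pair for pair in all_configurations if is_practical(*pair)]
system_types = sorted({system_type(*pair) for pair in practical})

print(len(all_configurations), len(practical), len(system_types))  # 9 7 4
for server_class, epd in system_types:
    print(f"{server_class} server with a {epd} ePD")
```

Nine combinations reduce to seven once the two impractical local-server pairings are dropped, and to four system types once the central and trusted servers are counted as one remote class.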

Regulatory issues

The requirements expressed in 21 CFR Parts 11 and 312, the ICH GCP guidelines, and the CSUCT guidance document combine to provide a statement of what constitutes source data and the requirements for its handling. A detailed list of the applicable regulations appears in an article by Raymond and Meyer,7 while commentary on this topic appears in an article by Stokes and Paty.6

These requirements and guidelines can be distilled into a number of broad categories (see Table 2) that allow the regulatory compliance of the systems to be assessed:

  • Definition: the description of what constitutes source data

  • Storage: the requirements for storing the source data

  • Retention: the requirements for the retention of the source data by an investigator

  • Confidentiality: the requirements to ensure that the confidentiality of the subject's data is maintained

  • Inspection: the requirements for the inspection of the source data by the regulatory authorities

  • Data Quality: the requirements for data to be attributable, legible, contemporaneous, original, and accurate (the ALCOA requirement).


Definition

The definition of source data and source documents provides clear direction that subject diary data is source data and that, given CSUCT III General Principles (D), the electronic record within the computerized system is the source document.

The question that arises centers on the definition of the computerized system: What system is being referred to? Is the source data the final database record of the whole trial data collection system or the initial record within the device used by the subject? The statement found in CSUCT III General Principles (D),

"When original observations are entered directly into a computerized system, the electronic record is the source document," would imply that the source document is the initial electronic record within the device.

For connected systems where there is no local storage of the data (a good example is an IVRS program), the answer is the centralized system. But for devices where data is stored before being sent to the centralized system, perhaps for a period of several weeks, the answer is less clear.


Storage

The storage requirements state that data is to be adequate and accurate, and stored in a manner that allows for accurate reporting, interpretation, and verification. These requirements raise two issues. First, 21 CFR 312.62(b) places requirements on the investigator to "maintain" case histories containing all relevant data.

TABLE 1. Server models with ePD categories.

"Case histories. An investigator is required to prepare and maintain adequate and accurate case histories that record all observations and other data pertinent to the investigation on each individual administered the investigational drug or employed as a control in the investigation..."

Second, ICH GCP 4.9.4 places a responsibility on the investigator to take measures to "prevent accidental or premature destruction."

"The investigator/institution should maintain the trial documents as specified in Essential Documents for the Conduct of a Clinical Trial (see 8.) and as required by the applicable regulatory requirement(s). The investigator/institution should take measures to prevent accidental or premature destruction of these documents."

Previous articles have questioned the term "maintain" and whether remote storage of data allows an investigator to "maintain" it. This seems less problematic than the destruction requirement: if data is stored remotely from the investigator site, it could be argued that the investigator is not really in a position to meet this obligation.


Retention

The regulations also state that records need to be retained for two years from the date the applicable marketing application is approved, or two years after the investigation was discontinued. During this time the records are to be protected to enable accurate retrieval. While the local and central models both allow for this, it is the word "retain" in conjunction with remote storage that has been hotly debated.

21 CFR 312.62(c) calls for records to be retained, while more specific requirements for the retention of source data are in ICH GCP sections 4.9.4 and 8.3.13. To add slight confusion to the issue, the CSUCT document states that either originals or certified copies are to be retained. Taking a worst-case approach, these requirements lead to the conclusion that investigator sites would need to retain the source data for two years. Depending on what is accepted as the definition of source data, this could mean that devices would need to be stored, a situation that is probably best avoided.

TABLE 2. Source data requirements.


Confidentiality

As with any clinical activity, confidentiality is of great importance. The regulations require that records remain protected. None of the systems inherently prevent these requirements from being met.


Inspection

All of the systems allow for an inspector to examine the electronic document and associated audit trails. These will be available either from the investigator's local system or the central system.

Data quality

None of the system architectures present any barriers to meeting the ALCOA (attributable, legible, contemporaneous, original, accurate) requirements. Each subject will have a means of entering identification information that allows data to be attributable to that subject. There are slight differences when the contemporaneous attribute is considered. Semiconnected and disconnected ePDs will rely on a local time from the device, so measures need to be taken to ensure that this time is accurate and cannot be changed, whereas a connected ePD system will be able to use the time from the central server. None of the systems have any inherent features that prevent the legible, original or accurate requirements from being met.6
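The difference in time sources can be made concrete with a minimal sketch. The `DiaryEntry` record and `record_entry` function are hypothetical and are not taken from any ePD product; the sketch only assumes that a connected system stamps entries with server time while the other two categories fall back on the device clock.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DiaryEntry:
    subject_id: str        # supports the "attributable" attribute
    response: str
    recorded_at: datetime  # supports the "contemporaneous" attribute
    time_source: str       # "server" or "device"


def record_entry(subject_id, response, epd_category, device_time, server_time=None):
    """Stamp an entry with server time when connected, device time otherwise."""
    if epd_category == "connected":
        if server_time is None:
            raise ValueError("a connected ePD is expected to supply server time")
        return DiaryEntry(subject_id, response, server_time, "server")
    return DiaryEntry(subject_id, response, device_time, "device")


# Example values are made up for illustration.
device_now = datetime(2004, 2, 2, 8, 30, tzinfo=timezone.utc)
server_now = datetime(2004, 2, 2, 8, 31, tzinfo=timezone.utc)

connected = record_entry("S-001", "pain score 3", "connected", device_now, server_now)
disconnected = record_entry("S-001", "pain score 3", "disconnected", device_now)
print(connected.time_source, disconnected.time_source)  # server device
```

Recording the time source alongside the timestamp is one way to let a reviewer see which entries depend on the accuracy of a device clock.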


Sponsors wishing to deploy ePD technology are therefore faced with a number of issues dependent on the form of technology they choose to deploy:

  • Where precisely is the source data/document located?

  • By what mechanisms will an investigator maintain data during the trial?

  • How will that data be retained?

  • How will data deletion be prevented?

With a connected system, the source data is almost certainly that data stored on the central server. With semiconnected and disconnected systems, the device most likely holds the source data. It has been stated in the past that the location of source data is the first nonvolatile storage of the data, an argument that supports the case of the central database as the source. But the regulations do not make such statements; they are merely implied. Indeed, vendors are sensibly using nonvolatile storage on devices to overcome problems with battery life so as to ensure retention of data when devices are not kept charged. Employing such an argument would make the case for the device holding the true source data.

If the device is the location of the source data, then logically those devices would need to be stored for the two-year retention period by the investigator. This is obviously not an ideal solution and is something that all participants would wish to avoid. Not only would this be a logistical nightmare, but it would also increase trial costs. This situation can be alleviated by the use of nonvolatile memory cards, with the card itself retained rather than the whole device.

Further difficulties are encountered when we consider the investigator requirements for the maintenance, retention and nondestruction of data. If a local system is deployed at the investigator site with a disconnected diary system, then the requirements are closer to being met because the data is initially passed to the investigator. The investigator has control of the resulting records, but the source data resides on the device that, as detailed above, the investigator may be required to retain.

With semiconnected or connected diaries and central storage, the investigator never really maintains the data, nor can the investigator really prevent the data from being deleted. The data is obviously retained but, again, not really within the investigator's sphere of influence. It was this concern that led to the Trusted Third Party being proposed to act on behalf of the investigator. The TTP goes a long way toward solving the problems of maintaining the data and preventing destruction, but it does not resolve the source document issue.


So what, in practical terms, can be done to meet the regulations? One solution stems from the words contained within the CSUCT document, section III General Principles (F). The section states:

"Clinical investigators should retain either the original or a certified copy of all source documents sent to a sponsor or contract research organization, including query resolution correspondence."

It is the reference to the certified copy that is a potential practical solution. The crux of the ePD problem is that the investigator does not, unlike the situation with paper diaries, sit in the natural route of the electronic data entered by a subject as it flows from the subject's entry device to the target database.

With ePD solutions, data entered by the subject is either sent to the server immediately or retained upon the device to be sent later. In logical terms, what we ideally wish to achieve is to interject the investigator into the flow of the data. This would allow a copy of the data to be obtained prior to its arrival and storage on the central server-irrespective of whether the server is operated by the sponsor, CRO or trusted third party. The data held by the investigator would be data as entered by the subject, and any later changes could be detected (see Figure 4).

The CSUCT document defines certified copy as:

"... a copy of original information that has been verified, as indicated by dated signature, as an exact copy having all of the same attributes and information as the original."

The data that is copied would in effect be a certified copy. Consideration needs to be given to "as indicated by dated signature" since no individual would be involved-it would be technology undertaking the copying. However, this need could be met with audit trail entries rather than signatures. The technology would also need to ensure that the data was an exact copy.
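One way a system might show that a copy is exact, and record that fact in an audit trail, is sketched below. This is an assumption-laden illustration: the use of a SHA-256 digest, the function names, and the audit entry fields are all choices made here, not requirements of the CSUCT guidance, and whether an audit trail entry satisfies "dated signature" remains the open question raised above.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Return a digest that changes if any byte of the data changes."""
    return hashlib.sha256(data).hexdigest()


def make_certified_copy(original: bytes, system_name: str, copied_at: datetime) -> dict:
    """Copy the data, confirm the copy is exact, and build an audit trail entry."""
    copy = bytes(original)
    original_digest = sha256_hex(original)
    if sha256_hex(copy) != original_digest:
        raise ValueError("copy does not match the original")
    audit_entry = {
        "copied_by": system_name,            # the system, since no person is involved
        "copied_at": copied_at.isoformat(),  # the "dated" part of the record
        "sha256": original_digest,
    }
    return {"copy": copy, "audit_entry": audit_entry}


def matches_audit_entry(data: bytes, audit_entry: dict) -> bool:
    """Check whether data is still identical to what was originally copied."""
    return sha256_hex(data) == audit_entry["sha256"]


# Example payload and system name are made up for illustration.
original = b'{"subject": "S-001", "pain_score": 3}'
certified = make_certified_copy(
    original, "ePD export service", datetime(2004, 2, 2, tzinfo=timezone.utc)
)
print(matches_audit_entry(certified["copy"], certified["audit_entry"]))          # True
print(matches_audit_entry(original.replace(b"3", b"4"), certified["audit_entry"]))  # False
```

The same digest comparison would also allow a later change to the centrally held data to be detected against the investigator's copy.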

This then allows the investigator to "retain," "maintain," and "prevent destruction," permits the source data requirements to be met, and allows for the retention of the data for the required periods. It is this placing of the investigator in the flow of the data that is the key factor; the investigator is back in control of the data in line with the spirit of the regulations while the technology provides the benefits of high-quality data with high subject compliance.

However, mapping this logical model onto the real world is slightly more challenging. Where data is passed via the investigator system (disconnected ePDs), the aims can be easily achieved by incorporating "data export" functionality into the ePD product.

For those systems where data is directed to central servers (semiconnected and connected), then a "gateway" would need to be inserted onto the central server. This gateway would need to intercept the data sent from the subject and pass it to the investigator. It would be eminently desirable if this were shown to be prior to any parsing of the data into the database, thus providing the assurance that the data has not been manipulated.
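The ordering that matters here, forwarding the data before any parsing, can be sketched as follows. The `Gateway` class and its two callbacks are hypothetical, and the JSON payload format is assumed purely for illustration; a real ePD would use its own transport and format.

```python
import json


class Gateway:
    """Pass raw subject data to the investigator before it is parsed and stored."""

    def __init__(self, send_to_investigator, store_in_database):
        self._send_to_investigator = send_to_investigator
        self._store_in_database = store_in_database

    def receive(self, raw_payload: bytes) -> None:
        # Forward the untouched bytes first, before any parsing takes place.
        self._send_to_investigator(raw_payload)
        # Only afterwards is the payload parsed for the central database.
        record = json.loads(raw_payload.decode("utf-8"))
        self._store_in_database(record)


# Plain lists stand in for the investigator's system and the central database.
investigator_copies = []
database_rows = []
gateway = Gateway(investigator_copies.append, database_rows.append)
gateway.receive(b'{"subject": "S-001", "pain_score": 3}')

print(investigator_copies[0])
print(database_rows[0])
```

Because the investigator's copy is taken before parsing, it is unaffected by anything the parsing or database load does to the data, including a failure to parse it at all.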

The issue with central servers is the means by which data is passed to the investigator given the diversity of investigator systems and data communications links with them. Also, given that diary data would be "drip-fed" to the servers, some form of consolidation would be required prior to data being sent to the investigators.
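A possible form of the consolidation step is sketched below, under the assumption that each drip-fed entry carries a site identifier and a timestamp. The field names `site_id` and `recorded_at` and the function `consolidate_by_site` are hypothetical.

```python
from collections import defaultdict


def consolidate_by_site(entries):
    """Group drip-fed entries into one time-ordered batch per investigator site."""
    batches = defaultdict(list)
    for entry in entries:
        batches[entry["site_id"]].append(entry)
    # ISO 8601 timestamps in one fixed format sort correctly as plain strings.
    return {
        site_id: sorted(items, key=lambda e: e["recorded_at"])
        for site_id, items in batches.items()
    }


# Example entries are made up for illustration.
drip_fed = [
    {"site_id": "site-A", "subject_id": "S-001", "recorded_at": "2004-02-02T20:00:00Z"},
    {"site_id": "site-B", "subject_id": "S-007", "recorded_at": "2004-02-02T09:15:00Z"},
    {"site_id": "site-A", "subject_id": "S-002", "recorded_at": "2004-02-02T08:30:00Z"},
]

for site_id, batch in consolidate_by_site(drip_fed).items():
    print(site_id, [entry["subject_id"] for entry in batch])
```

Each batch could then be sent to the relevant investigator at an agreed interval, rather than one entry at a time.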

One possible option is that connected and semiconnected ePDs contain a "data export" facility that could be employed during visits by the subject to the investigator. The device could be docked and any data that had been collected and was held on the device streamed to the investigator's system. This would also be undertaken at the end of the subject's involvement in the trial. It should be noted that those systems that do not use a device, for example IVRS, would not be able to do this.


The solution presented within this article provides, at worst, a starting point for a discussion on meeting the regulatory requirements for source data using ePDs. At best, it provides a model for the development of ePDs that provides flexibility and confidence for sponsors and vendors, while delivering to investigators a simple means of receiving and storing subject-reported data.


References

1. Code of Federal Regulations, Title 21 CFR, Part 11: Electronic Records; Electronic Signatures; Final Rule, Federal Register, March 20, 1997.

2. Code of Federal Regulations, Title 21 CFR, Part 312: Investigational New Drug Application, Federal Register, April 1, 2002.

3. International Conference on Harmonization, Good Clinical Practice: Consolidated Guideline, Federal Register Vol. 62, No. 90, 25711, May 9, 1997.

4. Food and Drug Administration. Guidance for Industry: Computerized Systems Used in Clinical Trials. FDA April 1999.

5. P. Bleicher, "eSource Redux," Applied Clinical Trials, August 2002, 30–31.

6. T. Stokes and J. Paty, "Electronic Diaries, Part 1. What Is a Subject Diary, and How Do Regulations Apply," Applied Clinical Trials, September 2002, 38–43.

7. S.A. Raymond and G.F. Meyer, "Interpretation of Regulatory Requirements by Technology Providers: The Case for Electronic Source Data," Applied Clinical Trials, June 2002, 50–58.

8. R. Daniels Kush et al., eClinical Trials. Planning & Implementation. 1st Ed. (CenterWatch 2003), ISBN 1-930624-28-X.

David Iberson-Hurst is founder of Assero Limited, The Old Post Office, High Street, Buckland Dinham, Frome, Somerset, BA11 2QY, United Kingdom, +44 (0)7989 603793, fax +44 (0)1373 453823, email: dave.iberson-hurst@assero.co.uk.
