The Transformation of Clinical Trials from Writing on Papyrus to the World of Technology

Applied Clinical Trials, 02-01-2022
Volume 31
Issue 1/2

New requirements must be put in place to ensure data quality and integrity.

The recent evolution within the pharmaceutical industry toward the use of technologies for clinical trial data collection and storage is quite impressive. Even risk-averse clinical trial sites are now comfortable moving from recording clinical trial data on pieces of paper to the use of technology-driven data collection solutions. As a result, new requirements must be put in place to ensure data quality and integrity (DQI). At the same time, we must avoid adding new complexities to the clinical trial paradigm that attempt to solve problems that do not exist. We suggest referencing an article entitled “When Should the Audit Trail Begin?” co-authored by the eClinical Forum and published in 2021, as a particularly good start.1 The article addresses issues such as evaluating software that has the capability to track keystrokes prior to hitting “submit” in an electronic data capture (EDC) system. The authors concluded that if companies or regulators were to recommend this as a requirement, it would add workflow data for analysis without any proof of value to DQI or of any ability to address perceived malfeasance. What all this means is that we have to be cautious when instituting any new quality-associated technology so as not to add requirements/features just because we can. The pharmaceutical industry and regulators must also not use paper-based processes as gold standards as they critically review the root reasons behind any new rules. For any new rule, it must be clearly documented why it is indeed necessary and how it adds to DQI.

History of writing things down

The historical record of writing things down started around 3100 BCE when people started to write on clay tablets.2,3 Around 3000 BCE, the Egyptians began writing and painting on papyrus, a material prepared from the stem of the water plant Cyperus papyrus. Ordinary leather in the form of parchment began to be used for writing around 2500 BCE. During the second century BCE, a better form of parchment was introduced such that both sides could be written on, but it was not until 400 years later that parchment rivaled papyrus. According to Chinese tradition, paper as we know it today was invented in the year 105 CE by Cai Lun, a eunuch who resided at the imperial court. Fragments of this paper product, made from rags and the fibers of mulberry, laurel and Chinese grass, are currently available in museums. Nevertheless, parchment was the standard writing surface for European scribes up until the 15th century. Historically, for the Jewish Bible, a Torah or biblical scroll could only be written on parchment from the skin of a kosher animal, and the five pages of the U.S. Constitution as well as the Declaration of Independence, the Bill of Rights, and the Articles of Confederation were all written on parchment.

History of clinical trials

A detailed summary of clinical trials that were conducted between 562 BCE and 1537 was published in 2010.4 It should be noted that during these early times there were no regulators other than kings and curious researchers, and no papyrus or parchment records were being checked by clinical research associates (CRAs). According to the article, the world's first clinical trial is recorded in the Bible’s “Book of Daniel.” This experiment was conducted by King Nebuchadnezzar of Babylon who conquered Jerusalem in 586 BCE and destroyed the Jewish Temple. During his rule, there was a time when all subjects were ordered to eat only meat and drink only wine to keep them in sound physical condition. However, when several of the king’s relatives who preferred to eat vegetables objected, the king allowed them just to eat legumes and drink water for 10 days. When the experiment ended and those eating vegetables appeared better nourished than the “meat-eating controls,” the king graciously permitted the vegetarians to continue on their diet. According to the article, this may have been the first time in the evolution of humans that an open-label clinical trial guided a public health decision.

More recently, the first “controlled” clinical trial was performed by Dr. James Lind (1716-1794), who, while working as a ship’s surgeon, was appalled by the high incidence and mortality of scurvy amongst sailors. As a result, he planned and executed a comparative trial of the most promising cures for scurvy. His insightful and vivid description of the trial covers the essential elements of a controlled clinical trial. Dr. James Lind's “Treatise on Scurvy” was published in Edinburgh, Scotland in 1753.5

Early concepts of direct data capture

Early computers had a type of disk storage composed of a thin and flexible disk using magnetic technology. The original “floppies” stored 384K of data. Later, 720K (double-density) and 1.44MB (high-density) disks were developed. In the mid-1980s, Dr. Fred Wolfe’s rheumatology clinical research unit in Wichita, Kansas had electronic records stored on a computer with backups on floppy disks. Dr. Wolfe had a plan to ship the floppy disks directly to the data management department for processing at the pharmaceutical company where the senior author of this article was running the clinical trial. This could have been the beginning of direct data capture and the elimination of source data verification (SDV). Unfortunately, the idea did not gain acceptance and was abandoned.

Evolution of paper-based data collection in clinical trials

Until the onset of the use of EDC systems in the 1990s, clinical trial case report forms (CRFs) were composed of three-part NCR (no carbon required) paper, with the original copy remaining at the site and the other copies submitted to the sponsoring pharmaceutical company. The capture of clinical trial data directly onto the paper CRFs was not permitted. Instead, an independent original source record was to be maintained at the clinical research site, and the basic function of the CRAs at the “site visit” was to check the precision and accuracy of the data transcribed from the source record onto the paper CRF, a process defined as source data verification (SDV). In order to complete the data validation process, the paper CRFs were then sent to the pharmaceutical company for data entry into the study database by two separate individuals, a process known as double-key data entry. The two sets of entries were then compared and any inconsistencies resolved.
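The comparison step in double-key data entry can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not any sponsor's actual procedure: the function name `compare_double_entry`, the field names, and the sample values are all hypothetical.

```python
def compare_double_entry(first_pass, second_pass):
    """Return the fields where two independent data-entry passes disagree.

    Each argument maps a CRF field name to the value keyed by one operator.
    A field missing from one pass is reported with None for that pass.
    """
    discrepancies = {}
    for field in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(field)
        b = second_pass.get(field)
        if a != b:
            discrepancies[field] = (a, b)
    return discrepancies


# Hypothetical values keyed from the same paper CRF by two operators.
entry_1 = {"subject_id": "001", "systolic_bp": "128", "visit_date": "1994-03-02"}
entry_2 = {"subject_id": "001", "systolic_bp": "182", "visit_date": "1994-03-02"}

for field, (a, b) in compare_double_entry(entry_1, entry_2).items():
    print(f"{field}: first entry {a!r}, second entry {b!r}; resolve against the paper CRF")
```

Note that a check of this kind only detects disagreement between the two keyed copies. It cannot detect an error that was already present on the paper CRF, which is why SDV against the source record was a separate step.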

Once EDC systems were introduced, paper CRFs were eliminated, and data entry into the study database was shifted to the clinical research sites. This process did not change the SDV workflow, as it was still not certain that site staff always entered the data accurately into the EDC system. While the data within the databases of the EDC systems could have been used as source data, it was not initially accepted that data entered directly into EDC systems could replace the “source” and “transcribed copy procedures” adopted during the use of paper CRFs.

Figure 1: EDC data workflow when paper source records are used

The transition to direct data capture and eSource technologies

In 2011, a study evaluated the effects of SDV on the mean and standard deviation values of five variables when using an EDC system together with paper-based source records.6 The protocol was for a new treatment in men with lower urinary tract symptoms associated with benign prostatic hyperplasia. Data were collected from 566 subjects who signed informed consent and from 492 of these same subjects who were subsequently randomized. Results showed that post data validation, there were virtually no differences in mean values of the five key variables and a very minor decrease in the standard deviations.

In 2012, in lieu of initially capturing all source data using paper records, a US-based clinical trial evaluating the pharmacokinetics of a topically applied drug product allowed the clinical site to use direct data entry (DDE) into an EDC system at the time of the clinic visit.7 The clinical research team also implemented a risk-based monitoring (RBM) plan which defined 1) the scope of SDV if paper records were used, 2) the frequency and scope of online data review, 3) the role of centralized statistical monitoring (CSM) and 4) the criteria for when to perform in-person monitoring at the clinical research site. As a result of this novel approach to clinical research operations, there were 1) no protocol violations, as screening errors were identified prior to randomization, 2) minimal transcription errors, as very few paper source records were used, and 3) major reductions in onsite monitoring tasks compared to comparable studies that used paper as source records. As needed, EDC edit checks were modified early in the course of the clinical trial, and compliance issues were identified in real time and corrected. From the safety perspective, there was rapid transparency and detection of safety issues. The clinical research site estimated that, just in terms of data entry, it was able to save 70 hours of labor by eliminating paper as the original source records. The article concluded that when DDE and RBM are properly implemented, there could be major increases in productivity for sponsors, clinical sites, and CROs, as well as reduced times to database lock and the statistical analyses.8

Figure 2: Direct data entry/eSource workflow

In terms of the impact on query generation, an article published in 2014 reported the results of a study that evaluated the effect of using DDE at the time of the patient encounter on the query effectiveness ratio (QER), defined as the number of queries leading to a change divided by the total number of queries issued.9 The study looked at 38,207 forms in a phase 2 study for the treatment of migraine. Results showed that of the 1,397 forms (3.7%) that were queried, only 782 forms (2.0%) required changes to the database. Of these changes, 294 (37.6%) were in the concomitant medication forms, and none of the changes had any impact on the study results.
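The arithmetic behind these figures can be reproduced with a short Python sketch. The QER is defined over queries, but the text reports counts of forms, so the last ratio below uses the form-level counts only as a stand-in and should not be read as the QER reported by the study. The function and variable names are illustrative.

```python
def query_effectiveness_ratio(queries_leading_to_change, total_queries):
    """QER as defined in the text: queries leading to a change / queries issued."""
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    return queries_leading_to_change / total_queries


# Counts reported in the text (forms, not individual queries).
total_forms = 38_207
forms_queried = 1_397
forms_changed = 782
conmed_changes = 294

print(f"Forms queried:         {forms_queried / total_forms:.1%}")
print(f"Forms changed:         {forms_changed / total_forms:.1%}")
print(f"Changes in con meds:   {conmed_changes / forms_changed:.1%}")

# Assumption: form counts used as a proxy for query counts.
proxy = query_effectiveness_ratio(forms_changed, forms_queried)
print(f"Form-level proxy for QER: {proxy:.1%}")
```

On these counts, a little over half of the queried forms ended up being changed, while about 98% of all forms needed no change at all.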


Traditionally, the major roles of the CRAs were to train the clinical sites on the steps needed to execute the clinical trial and then to assess, via SDV, how accurately the clinical sites transcribed data from paper source records onto paper CRFs. Data management wrote programs to look for logical errors and inconsistencies within and between forms, and communicated these issues directly to the CRAs or via a query directly to the clinical research site. The number of data entry errors was often used as the main metric to see how well a site was executing the study. As a result, this approach focused more on process conformance than on protocol compliance. Unfortunately, because of the delay between data entry by the site and monitoring by the CRAs, by the time an error was identified by the CRAs or data management, the same error could have been repeated multiple times.

Currently with the application of DDE in various software solutions including EDC, Electronic Patient Reported Outcomes (ePRO), Electronic Clinical Outcome Assessment (eCOA), Electronic Informed Consent (eICF), Mobile Medical Devices, etc., there is no need to question the data as the system validation efforts at the time of data capture assure the data integrity. Real-time remote monitoring by CRAs and data managers can now focus on 1) protocol compliance, 2) illogical results/outliers, 3) high risk issues related to safety and efficacy endpoints, and 4) evaluation of potential protocol deviations and violations. As a result, DDE logic errors that are systemic and not user specific can be corrected rapidly and communicated to those performing data entry. It is also now possible to evaluate the performance of the CRAs in detecting issues.

However, if verifying “accuracy” is no longer as critical to the data validation process as is usually assumed, then what is? The recently published ICH E8(R1) emphasizes the need for “completeness (lack of missing information)” and “consistency (uniformity of ascertainment over time).”10 Thus, it is reasonable to expect that narrowly focused, accuracy-specific edit checks will play a lesser role over time, while the role of aggregate (“across subject”) and complex statistical checks will expand. Data management and clinical operations professionals should not only be ready for this paradigm shift, but lead the transition. This advancement will demand a shift in perception and processes and, equally importantly, the development of new skills. Let us now recognize that “Data Science” is no longer exclusively the domain of statisticians, but is now associated with all clinical research operations dealing with data collection, validation, and monitoring.
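As one hedged illustration of what an aggregate, across-subject check might look like, the Python sketch below flags sites whose mean value for a variable sits far from the other sites. It is not taken from ICH E8(R1) or from any cited study: the function name `flag_outlying_sites`, the median/MAD rule, the threshold of 3, and the sample data are all assumptions chosen for the example.

```python
from statistics import mean, median


def flag_outlying_sites(values_by_site, threshold=3.0):
    """Flag sites whose mean is far from the median of all site means.

    Distance is measured in units of the median absolute deviation (MAD)
    of the site means. Returns a sorted list of flagged site identifiers.
    """
    site_means = {site: mean(vals) for site, vals in values_by_site.items()}
    center = median(site_means.values())
    mad = median(abs(m - center) for m in site_means.values())
    if mad == 0:
        # No spread between sites, so nothing can be called an outlier.
        return []
    return sorted(
        site for site, m in site_means.items()
        if abs(m - center) / mad > threshold
    )


# Hypothetical systolic blood pressure readings grouped by site.
readings = {
    "site_01": [120, 125, 130],
    "site_02": [118, 127, 130],
    "site_03": [122, 124, 132],
    "site_04": [150, 155, 160],
    "site_05": [119, 126, 127],
}

print(flag_outlying_sites(readings))
```

A check like this does not say that any single value is wrong. It points reviewers to a site whose data pattern differs from its peers, which may reflect a calibration problem, a different population, or a protocol compliance issue worth investigating.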

It is now time to follow the new definition of Data Quality documented by the Clinical Trials Transformation Initiative (CTTI), and redesign the data validation processes from eliminating all errors to the elimination of errors that matter to decision making. It is also time to focus on potential errors which have a meaningful impact on the safety of trial participants, the credibility of the results and the potential impact on the future care of patients.11


Technology is providing major contributions to the execution of clinical trials and has assisted in the improvement of DQI. However, there are still major challenges ahead to increase efficiencies, avoid redundancies, and ensure compliance with all aspects associated with data capture, data transfer and final reporting. The creation of lean documented processes and procedures will enable technology and human interactions to expedite clinical development, while at the same time protect the data and the basic principles that govern quality as documented in existing regulations.

With the implementation of DDE, RBM and technology solutions, we can now identify potential risks to the patients and to the study itself early in a clinical trial. Real-time risk mitigation strategies must address those risks that are most critical for the evaluation of the safety and efficacy profile of the drug or device under study. Regulators can now shift their focus from how well a data point is transcribed from one medium to another, to issues that could put patients/study participants at risk and compromise the clinical trial outcome, such as lack of protocol compliance, incompetent or unscrupulous investigators and biased sponsors. It is now also time to avoid the profound effects on patients and sponsoring companies when studies fail due to lack of quality rather than lack of efficacy.

Jules T. Mitchel, MBA, PhD., is the President and CEO of THI Pharma Services Inc. Greg Gogates, BS, is the President and CEO of Fasor Technical Services, Inc. Alan Yeomans, BE, is the Regulatory Affairs Manager for Videoc Technologies. Tai Xie, PhD., is the CEO of CIMS Global. Luis A. Rojas, PhD., is the President and CEO of InCSD. Jonathan S. Helfgott, MS., is the Faculty / Program Coordinator / Senior Lecturer, MS in Regulatory Science & Food Safety Regulation Programs, at Johns Hopkins University. Vadim V. Tantsyura, DrPH, is the Sr. Director, Data Quality Science and Programming, at BioMarin Pharmaceuticals, Inc.


  1. Mitchel, J., Sugiura, Y., Arnera, V. et al. When Should the Audit Trail Begin? Applied Clinical Trials, 16 June 2021.
  2. Gascoigne, B. History World. History of Writing Materials. From 2001 Ongoing.
  3. Papyrus: A Brief History. Dartmouth Ancient Books Lab, 2016.
  4. Bhatt, A. Evolution of Clinical Research: A History Before and Beyond James Lind. Perspectives in Clinical Research. 1(1): 6–10. 2010
  5. Lind, James: A Treatise of the Scurvy. A. Millar, London, 1753.
  6. Mitchel, J., Kim, YJ, Choi, J., et al. Evaluation of Data Entry Errors and Data Changes to an Electronic Data Capture Clinical Trial Database. Drug Information Journal, Vol. 45, pp. 421–430, 2011.
  7. Mitchel, J., Schloss Markowitz, JM, Hua (Helen) Yin, et al. Lessons Learned from a Direct Data Entry Phase 2 Clinical Trial Under a US Investigational New Drug Application. Drug Information Journal 46(4) 464-471 2012.
  8. Mitchel J, Tantsyura V, Kim YJ et al. Query Effectiveness in Light of Risk-based Monitoring (RBM). Data Basics Winter, 2018.
  9. Mitchel JT, Cho T, Gittleman DA, et al. Time to Change the Clinical Trial Monitoring Paradigm. Applied Clinical Trials. 17 January 2014.
  11. CTTI’s QbD Project. Recommendations available at: 2015