With a unique business model, pharmaceutical companies and their subcontractors need to find every way possible to shorten drug development times consistent with patient safety. The challenges facing commercially oriented organizations developing prescription drugs are substantial.
Phase III clinical trials are often long, complex and costly compared with other steps in drug development. They are usually the final part of clinical testing to confirm the safety and efficacy of a potential drug in comparison to a placebo or commonly used standard of care treatment. Phase III studies enroll and treat large numbers of patients, from several hundred to several thousand depending on the study design. It is common for these studies to take place in multiple centers, often involving several countries. Accurately forecasting the time required to complete these studies is critical, for reasons ranging from the operational to the strategic.
Since 2007, the U.S. has required certain information about all clinical trials conducted under FDA auspices to be submitted to an existing database, ClinicalTrials.gov, regardless of whether they were funded by commercial or non-commercial organizations. The database over time has become an important source of information about the clinical trial landscape for both types of sponsors.
This paper uses data from the federally mandated fields of ClinicalTrials.gov to determine how accurately pharmaceutical companies are able to estimate the time required to complete Phase III clinical trial patient enrollment and treatment. In this research, we examine the time initially estimated by the organization conducting the study to complete patient enrollment and treatment, comparing that estimate to the actual amount of time the study took. Some trials are estimated more accurately than others. We also examine factors, in a multivariate model, which are related to better clinical trial completion date forecasting.
We first find that the pharmaceutical industry’s ability to estimate Phase III study patient enrollment and treatment completion time is substantially better than previous reports have indicated. Consequently, fewer studies are behind schedule than might otherwise have been thought to be the case. Our results differ to a large extent from other analyses because, as we contend, the ClinicalTrials.gov database provides a more comprehensive and testable source of relevant data than other work on this topic has used. Second, the results from a multivariate model highlight how important an organization’s clinical trial experience in the therapeutic area of the individual trial in question is to accurate forecasting. Up to around 30 studies in the relevant therapeutic area, the more studies an organization does, the more adroit it becomes in estimating Phase III clinical trial completion times.
Operational and Financial Implications
Correctly forecasting completion times is essential in such a complex undertaking as a drug development program. At an operational level, senior drug development executives must rely on the time estimates to allocate scarce management, staffing and other financial resources. These resource allocation decisions are particularly important in Phase III studies because of their operational complexity. The time lost in misestimated, and potentially mis-resourced, trials may have financial consequences that go far beyond the direct expenses involved. In addition to the direct costs associated with poor estimates, senior management must consider the opportunity costs of other trials that should, and could, have been conducted, had there been more accurately estimated completion times for their current studies.
Understanding how long it will take to complete a Phase III clinical trial has a number of significant strategic financial consequences as well. Major investment decisions by corporate management, and the investment community, are heavily influenced by the remaining patent life of a compound in late development. A major portion of a compound’s patent life is consumed during the drug development process. Potentially attractive compounds can, as a result, be abandoned during the R&D process because inadequate time remains in the product’s patent life for the compound to be commercially attractive. Understanding how long Phase III clinical trials will take to complete can ultimately have a major influence on the strategic decision to bring a new drug to market or not.
Challenges in Estimating Trial Completion Time
Enrolling and treating eligible patients is the most variable portion of most Phase III clinical trials.1 Estimating this portion of a Phase III study can pose daunting challenges. Schroen and others assert that the ability of even the most experienced statisticians and clinical trial managers to predict a study’s completion date is limited at best.2 The very nature of Phase III clinical trials can make estimation particularly problematical. Across all industries, managers are poorer at estimating completion times, and costs, when something is far away in distance or culture.2 Yet, many Phase III clinical trials involve clinical sites from around the globe, requiring that both the study design and execution be sensitive to the medical practices of sometimes vastly different cultural settings. Often the geographic distances involved can be immense.
Studies to date about the ability of managers to correctly estimate clinical trial completion dates have consistently emphasized how far these estimates have been from the actual completion times. In 2004, CenterWatch Monthly, in a widely cited study, indicated that 70% of all trials are delayed from one to six months, chiefly due to patient recruitment problems.3 More recently, CenterWatch Monthly has dropped that delayed percentage to around half of all trials. Other studies report longer delays, with significant variation across therapeutic areas.4,5 These reports, though, have usually not had access to actual pre-study estimating activities. The analyses have consistently drawn their data from ad hoc collections of information about completed studies or, more often, from the opinions reported in convenience surveys of industry participants. ClinicalTrials.gov provides estimated study completion times derived before studies actually began, along with the actual completion dates. Unlike convenience opinion surveys, ClinicalTrials.gov data are not reliant upon the opinion of individuals who may or may not be involved in the active management of the clinical trials about which they are reporting.
The ClinicalTrials.gov Database
ClinicalTrials.gov has become a widely cited research data source.6,7,8,9,10 Our research uses only the mandated data fields in ClinicalTrials.gov. Data completeness and accuracy are important considerations in the use of any dataset, even federally mandated ones with specifically required variables. The authors previously analyzed data completeness and quality for the variables reported in this paper.11 We concluded that the database for these mandatory variables constitutes a potentially valuable research resource. With the exception of site identification detail, the data are currently complete enough to undertake potentially useful research. Missing data never exceed 3%, with the exception of site level identification detail, e.g., the exact address of the clinical site. Even with the site level identification detail variables, missing data were less than 10%. These missing data were concentrated in several companies, which have in more recent years begun to enter much more complete data about site detail.
To review the ClinicalTrials.gov database, the research team built an XML parser that compiled an SQL database with all the current information contained within the ClinicalTrials.gov website. Inconsistent coding was a large obstacle in reviewing the data. The verbatim field “single patient duration” was automatically recoded into numeric days using a SQL program. Outcomes whose “single patient duration” contained vague language such as “throughout hospital stay,” and outcomes whose “single patient duration” was blank, were classified as “uncodable.” The verbatim field “conditions” was coded into MedDRA 16.1 therapeutic areas. Individual MedDRA therapeutic areas were coded into dichotomous variables for inclusion in the ordinary least squares regression analysis. The same was done for the 20 countries with the largest number of listed clinical sites. The research team then performed a systematic review of all Phase III studies led by industry sponsors that had been received as of Feb. 25, 2014, as defined by First Received Date.
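The recoding of a verbatim duration field into numeric days can be illustrated with a short sketch. The paper states that this step was done with a SQL program and does not publish its rules, so the Python below is only an assumed approximation: the function name, the accepted text pattern, and the unit conversion factors (30 days per month, 365 per year) are all hypothetical. Anything that does not match a plain "number plus unit" pattern, including blank entries, is treated as uncodable.

```python
import re

# Hypothetical conversion factors; the paper does not list the ones it used.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

# Accepts entries such as "12 weeks" or "2 Years"; everything else is uncodable.
_DURATION = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*(day|week|month|year)s?\s*$", re.IGNORECASE
)


def duration_to_days(text):
    """Return the duration in days, or None when the entry is uncodable."""
    if text is None or not text.strip():
        return None  # blank entries are uncodable
    match = _DURATION.match(text)
    if match is None:
        return None  # vague language such as "throughout hospital stay"
    value, unit = match.groups()
    return float(value) * DAYS_PER_UNIT[unit.lower()]


if __name__ == "__main__":
    for entry in ["12 weeks", "2 Years", "throughout hospital stay", ""]:
        print(repr(entry), "->", duration_to_days(entry))
```

Returning `None` rather than a sentinel number keeps uncodable entries from being mistaken for real durations in later arithmetic.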
We think that open text fields such as inclusion/exclusion criteria or endpoints may be the most problematical for completeness. These are verbatim fields that we do not code, but rather count. Organizations almost always put in fairly extensive information for the mandated variables. Nonetheless, we have no way of knowing if all the relevant data have been submitted. For example, companies, for whatever reason or combination of reasons, may simply stop entering inclusion/exclusion criteria after a certain number. However, we deemed this situation unlikely since the actual criteria counts follow a near perfect negative binomial distribution.
We began with a dataset of 2,093 pharmaceutically sponsored Phase III clinical trials that started during, or after, 2008 and completed by 2013. Study completion is defined as the date of the last patient’s last visit. The regulations require that a study enter the estimated patient enrollment and treatment completion date within 21 days of the first patient’s enrollment at the first site. Fifteen percent of the studies were excluded from our analysis because the study failed to meet this 21-day requirement, giving us a total number of 1,787 studies. Virtually every one of the excluded studies eventually entered the estimated completion date. However, we wanted to be certain that estimated completion dates did not include information that the study team could have learned during the course of the study itself. Hence, we rejected those studies that did not enter the estimated study completion dates within the required 21-day time period.
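The 21-day inclusion rule described above can be expressed as a simple date comparison. The following is a minimal sketch under stated assumptions, not the selection code used for this paper: the function and parameter names are hypothetical, and it assumes that an estimate entered before the first enrollment also satisfies the rule.

```python
from datetime import date

MAX_LAG_DAYS = 21  # window stated in the text


def meets_entry_window(first_enrollment, estimate_entered, max_lag_days=MAX_LAG_DAYS):
    """True when the estimate was entered no later than max_lag_days after first enrollment."""
    lag = (estimate_entered - first_enrollment).days
    return lag <= max_lag_days


if __name__ == "__main__":
    enrolled = date(2009, 3, 1)
    print(meets_entry_window(enrolled, date(2009, 3, 22)))  # 21 days after enrollment
    print(meets_entry_window(enrolled, date(2009, 3, 23)))  # 22 days after enrollment
```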
Organizations submitting data to ClinicalTrials.gov appear to monitor the anticipated date of the last patient’s last visit fairly closely; 78.4% of all Phase III studies within our ClinicalTrials.gov dataset show one or more changes in the estimated completion date during the course of the study. We assume that these changes are made in the estimated completion date to reflect information coming to the study team leadership on the status of active patient enrollment activities. It is important to note that this analysis uses the original estimated completion date. Although the originally estimated date disappears if the field is updated in ClinicalTrials.gov, the original date remains embedded in the database and can be retrieved.
The variables used for this paper can be found in Appendix A. The complete dataset rules used to select the studies used for the analyses can be found in Appendix B.
We implemented bidirectional stepwise AIC for model selection12, adding a second order term to the model where second order terms were significant. In all cases we tested for multi-collinearity. The maximum of the quadratic terms (-b/2a) occurred outside the range of the actual data, suggesting that the log transformation of those variables might improve the overall model fit. It did. Open source 64-bit R 3.1.1 was used for the analyses, along with the stepAIC function from the MASS package.
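For readers less familiar with the expression -b/2a, the short derivation below shows where it comes from. It is a generic illustration rather than the fitted model from this paper: the symbols a and b are taken from the expression above, while y, x and c are assumed names for the outcome, a single predictor, and the remaining terms of the model.

```latex
% Assumed generic form: y is the outcome, x one predictor, c the remaining terms.
\begin{align*}
  y &= a x^{2} + b x + c \\
  \frac{dy}{dx} &= 2 a x + b = 0
  \quad\Longrightarrow\quad
  x^{*} = -\frac{b}{2a}
\end{align*}
% x^{*} is a maximum when a < 0 and a minimum when a > 0.
```

When the turning point lies outside the observed range of x, the fitted curve is monotonic over the data, which is why a log transformation is a natural alternative to try.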
Pharmaceutical companies provide an actual date for the first patient’s first visit along with the estimated date of the last patient’s last visit. We labeled this variable “estimated time” in field. When the study completes, the same organization resubmits the actual completion date. We have labeled this time period “actual time” in field. The difference between the two is our dependent variable: estimating accuracy. The difference is reported in days; the smaller the number, the better the estimate.
Two thirds of Phase III studies complete within one month of their originally estimated completion date. This is a substantially higher percentage than those reported in other studies.
Table 1 data may give undue weight to shorter studies. Shorter studies will have a greater chance of completing more quickly in absolute terms, that is, actual days, simply because the studies are designed to be shorter. Consequently, we also report the accuracy of the original estimate as a percentage of the study’s designed length, thereby correcting for any inherent bias due to study design length. In Table 2, we see no appreciable difference from Table 1. The results in Table 1 are not a reporting artifact of study design length.
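The two accuracy measures described above, the difference in days and that difference as a percentage of the study’s designed length, can be sketched as follows. This is a minimal illustration, not the authors’ code. The function name, the sign convention (actual minus estimated, so a positive value means a late finish), and the use of the originally estimated time in field as the designed length are all assumptions.

```python
from datetime import date, timedelta


def estimating_accuracy(first_visit, estimated_completion, actual_completion):
    """Return (difference in days, difference as a percentage of estimated time in field).

    Assumed sign convention: positive values mean the study finished later than estimated.
    """
    estimated_days = (estimated_completion - first_visit).days
    actual_days = (actual_completion - first_visit).days
    if estimated_days <= 0:
        raise ValueError("estimated completion must fall after the first patient's first visit")
    diff_days = actual_days - estimated_days
    diff_pct = 100.0 * diff_days / estimated_days
    return diff_days, diff_pct


if __name__ == "__main__":
    start = date(2010, 1, 1)
    estimated = start + timedelta(days=400)  # original estimate of last patient's last visit
    actual = start + timedelta(days=440)     # reported actual completion
    print(estimating_accuracy(start, estimated, actual))
```

Expressing the difference as a percentage puts a 40-day slip on a 400-day study and a 4-day slip on a 40-day study on the same footing, which is the correction for study design length that the text describes.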
Two thirds of all Phase III studies complete within a 10% estimating window. This percentage climbs to nearly 80% when we use a 25% estimating window.
There is, however, variation in estimating accuracy. Our multivariate model examines the potential relationship between a fairly large set of independent variables and our dependent variable, estimating accuracy. In this model we measure estimating accuracy as a percentage of the trial’s designed length. Again, we do this to control for the length of the study design. Most of the original variables used from Appendix A do not appear in the final model.
Not all of the remaining variables in the model are statistically significant at the .05 level, but they remain in the model because this configuration of variables maximizes the explanatory power of the total model. Even the variables that are not significant at the .05 level are close enough to merit attention. A negative sign before the Beta weight means the difference between the estimated and actual times decreases, that is, estimating accuracy improves.
The variable, total company experience in the relevant therapeutic area, stands out. This is measured as the number of clinical trials in the dataset identified with the sponsor company in a particular study’s therapeutic area. We do not usually know if this study was done directly, or with a contract research organization (CRO). The therapeutic area experience variable should be viewed in conjunction with the square of the same term. The squared term indicates that the relationship between total company experience in the relevant therapeutic area and estimating accuracy is not linear. Estimating accuracy improves with each study until about thirty studies, at which point the accuracy does not significantly change.
Although not significant at the .05 level, total company study experience remains in the final model. The variable is measured by the total number of studies done by a specific company outside the therapeutic area for the study in question. It appears that overall experience adds some degree of estimating improvement, but not nearly the amount represented by experience in the study’s therapeutic area.
Having at least one site in the U.K. and/or Canada significantly reduces estimating accuracy. Site activity in other countries has no impact on estimating accuracy. The more inclusion/exclusion criteria a study has, the poorer the estimate. It appears that study teams may not always be able to anticipate the patient enrollment consequences of added inclusion/exclusion criteria. In contrast, the more study arms a study has, the better the estimate.
It is worth noting that one set of variables does not appear in the final model, the individual study’s therapeutic area. Most probably, there were important differences in the actual completion times of studies across the individual therapeutic areas. And, it may also very well be that the study teams in various therapeutic areas differed in their ability to estimate accurate completion dates. Our results indicate though that if any differences existed in estimating accuracy across the therapeutic areas, the differences were not significant. Within the bounds of statistical significance, it seems that the specific study teams are able to anticipate any differences in actual study completion times that may be associated with the different therapeutic areas.
Discussion and Conclusion
The estimating accuracy figures in Tables 1 and 2 are likely conservative. Most certainly a number of the studies in our analysis experienced some suspension period, or clinical hold. The structure of the database, though, did not allow us to establish the total time any study experienced this suspension. Organizations conducting a clinical trial can indicate in ClinicalTrials.gov the current status of that clinical trial. As a result, we know that at any given time, approximately 2% of the studies are on clinical hold. The actual cumulative percentage is no doubt higher. But in any event, the estimating accuracy figures in Tables 1 and 2 include these suspended time periods, and hence, are most likely on the conservative side.
We are aware that some degree of estimating conservatism may be present on the part of individuals submitting these estimates. In a practice colloquially known as sandbagging, staff entering estimation dates may be using very conservative estimates so that public analyses of the ClinicalTrials.gov data will show their organization’s operational effectiveness in the best possible light. While this may be true to some extent, we doubt it is a major reason for the differences between our research and other studies. Third-party observers can easily see if the ClinicalTrials.gov estimates systematically vary for a specific company. In addition, the actual study completion date for each clinical trial, as reported in ClinicalTrials.gov, is available by company.
It is also possible that the estimated and actual dates entered in the federally mandated, public ClinicalTrials.gov database are consistently different from the dates used internally by companies’ clinical operations management. Again, we doubt this, as it would require keeping two sets of dates, the value of which is unclear.
We think our results differ from much of the other published material and public presentations because of the relative strength of the ClinicalTrials.gov database as a research source. Missing data are not an important issue for mandated variables. Moreover, the growing research use of the database has not led to published concerns about the internal consistency of the data. While often very useful in framing the estimating and clinical trial delay questions, earlier studies did not have access to a ClinicalTrials.gov database that has, in recent years, grown substantially in size and quality.
Throughout, we focus our analysis on the degree to which studies take longer than originally anticipated, largely in response to the publications that emphasize the degree to which clinical trials are behind schedule. That studies finish earlier than the estimated completion date does not though change the overall findings of this research. Industry-sponsored clinical trials are not consistently behind schedule. Companies are better at estimating actual patient enrollment and treatment completion times than many may have thought to be the case from other studies.
The ability to develop realistic schedules is largely a consequence of more organizational experience conducting clinical trials, especially in a study’s therapeutic area. This may have important consequences when a pharmaceutical company decides to engage a CRO to conduct the sponsor company’s Phase III clinical trials. It may also be critical when a pharmaceutical company moves into a new therapeutic area and decides to conduct the study internally. Experience in the study’s therapeutic area is far more critical in correctly estimating patient enrollment and treatment study completion times than simple organizational experience running clinical trials in any therapeutic area.
With a unique business model, pharmaceutical companies and their subcontractors need to find every way possible to shorten drug development times consistent with patient safety. The challenges facing commercially oriented organizations developing prescription drugs are substantial. The ability to estimate clinical study patient enrollment and treatment completion times, and remain on schedule, is not a problem to the degree that many report to be the case.
* Harold E. Glass, PhD, is Research Professor, Department of Health Policy and Public Policy, University of the Sciences in Philadelphia, email: firstname.lastname@example.org; Lucas M. Glass is Vice President, Routine Recovery, LLC; Jesse Glass, PhD, is candidate, Temple University; Phuong Tran, PharmD, RPh, MBA, is Principal Product Manager, SunshineMD, LLC
* To whom all correspondence should be addressed