Data Based Predictions


Applied Clinical Trials, 03-01-2008

In the age of international trials, data drives the selection of golden sites and investigators to get it right.


Historically, the selection of countries and investigators for clinical trials has been based on a combination of unsubstantiated feasibility questionnaires and legacy familiarity with a core group of investigators. In this vein, the determination of the number of countries/sites and subject recruitment forecasting traditionally has been based on assumptions that all sites start at the same time and perform similarly. This leads to overpromising to deliver within timelines, as the variability in initiation and performance among the sites has not been accounted for. The result is familiar to everyone: Most trials are running behind schedule in terms of enrollment targets and other milestones.

This article introduces a more informative approach to site selection, based on the analysis of past performance, and shows how the review of epidemiological and demographic data can help identify the "golden site profile." The prestudy phase is a great opportunity to use current information and past experiences with similar studies to objectively pressure test site selection and performance assumptions. Furthermore, the article showcases how these data can be used to forecast subject recruitment timelines and measures, such as CRF page volume and CRA/monitoring and data management resource requirements, to predict and prevent overstaffing and waste as well as understaffing and delays. Indeed, it's not enough to be smart at the start. There is a clear need for continuous proactive mitigation of project-related risks across the entire lifespan of the trial.

Power in accuracy

Generally, disease indication and study design drive the degree of variability in observed recruitment rates. Less complex designs in more common therapeutic indications with fewer treatment options increase the investigator's familiarity with trial procedures and increase the pool of potential subjects.

The distribution of recruitment rates across sites is likely to be asymmetric. This may result in the minority of sites recruiting the majority of subjects, leading to deviations from the planned subject demographic profile in the final analysis population. If sites that recruit higher numbers of subjects per month can be identified (golden site profile), then potentially the recruitment rate of chosen sites can be made more symmetric and subject recruitment can be more evenly distributed across sites. High recruitment by each site is beneficial in terms of the number of sites required (and hence the budget) to meet recruitment timelines. However, it is important to ensure adequate site resources to cope with the increased volume of work and the need for early monitoring to capture subject evaluability issues early so that data quality remains high.
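One way to quantify the asymmetry described above is to compute the share of all subjects recruited by the highest-recruiting fraction of sites. The sketch below is illustrative only: the function name `top_share`, the 20% cut-off, and the site counts are assumptions, not figures from the studies discussed in this article.

```python
def top_share(subjects_per_site, top_fraction=0.2):
    """Fraction of all subjects recruited by the top `top_fraction` of sites."""
    if not subjects_per_site:
        raise ValueError("need at least one site")
    ranked = sorted(subjects_per_site, reverse=True)
    # Always include at least one site in the "top" group.
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Hypothetical subject counts for ten sites (not data from the article).
counts = [40, 22, 9, 6, 5, 4, 3, 3, 2, 1]
share = top_share(counts, top_fraction=0.2)
print(f"Top 20% of sites recruited {share:.0%} of subjects")
```

A share well above the chosen fraction (here, two of ten sites recruiting most of the subjects) signals the kind of concentration that can shift the demographic profile of the analysis population.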

The development of this golden site profile begins, where possible, with the evaluation of historic data, preferably from protocols with similar inclusion/exclusion criteria. The objective is to look at aspects of sites that would appear to predict higher recruitment rates. If historic data from similar trials are not available, then trials from within the same therapeutic area or similar diseases may provide indicators. Even when historic data come from very similar trials, it is important to recognize the limited applicability of the data and the need for a study-specific feasibility assessment using the initial assessment of the golden site profile as a sampling frame.

The primary aim of the feasibility assessment should concentrate on confirming/refining the golden site profile against which potential sites can be compared during the subsequent site identification process. This is achieved through provision of a detailed protocol synopsis (under a confidentiality agreement) together with a series of questions relating to:

  • site location and subject referral networks.

  • site makeup (e.g., site resources, necessary equipment).

  • anticipated subject recruitment and screen failure/drop-out rates.

  • competitive environment (e.g., the number of other studies being concurrently conducted in the therapeutic area).

  • reimbursement rates (it is sometimes useful to ask the investigator to specify the level of reimbursement that he or she would expect to recruit a subject into the trial. Although budgets should not be set based primarily on investigator feedback, the information is useful in assessing likely site interest moving forward).

For protocols with key inclusion and exclusion criteria, it is useful to assess potential recruitment through a series of questions, beginning with the general subject population of the investigator site and drilling down to the specific population for the proposed trial. This helps the investigator to more accurately assess likely recruitment rates, as it mirrors the investigator thought process during the trial. It is acknowledged that the accuracy of the data collected for recruitment and withdrawal rates will depend on the level of trial experience of the investigator and also on the level of dependency of the investigator upon advertising for subject recruitment. For investigators selected to potentially participate in the trial, these data will be scrutinized further at the prestudy visit.

The feasibility assessment process should be flexible to allow for refinement and checking of the golden site profile, especially for the more complex therapeutic areas and/or study designs. In these situations, it may be necessary for follow-up interviews to ensure investigator understanding and to allow for a reassessment of predicted recruitment.

The greater the variability in response to questions assessing recruitment rate, the greater the number of responses to the feasibility assessment required and the greater the number of questionnaires that need to be sent out. Once the golden site profile has been confirmed, the feasibility assessment can evolve seamlessly into site identification, where sites are contacted and their responses assessed in comparison to the profile. Those sites attaining the standard set by the golden site profile can have their prestudy visit scheduled (where the answers to the questionnaire are validated and additional information is sought by the attending CRA). Those sites that are just below the standard can be reserved until the end of the site identification process.

Underperformance in terms of actual recruitment compared to the response to the recruitment question in the feasibility assessment and/or prestudy visit is expected. The final assessment of site numbers and timeline projection should take this into consideration. Where data from the same sites and/or similar studies are available, the response to the feasibility assessment for this previous study and the actual recruitment can be used to provide an indication of the level of discounting. Another important factor is the time taken to obtain regulatory approval and site contracts. Once the countries and the number of sites within each country have been determined, these lag times and the discounted recruitment rates can be modeled to make any adjustments to site numbers required to give confidence in achieving recruitment timelines.
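The discounting and lag-time adjustment described above can be sketched as follows. This is a deliberately simplified illustration, not the model used by the authors: it assumes a single pooled discount factor (actual divided by estimated recruitment from a previous study), one common start-up lag for all sites, and a constant recruitment rate per site. All names and numbers are hypothetical.

```python
import math


def discount_factor(past_estimated, past_actual):
    """Pooled ratio of actual to estimated recruitment from a previous study."""
    estimated_total = sum(past_estimated)
    if estimated_total <= 0:
        raise ValueError("estimated recruitment must be positive")
    return sum(past_actual) / estimated_total


def sites_needed(target_subjects, months_available, lag_months, rate_estimate, factor):
    """Sites required when each site starts after `lag_months` and recruits
    at the investigator's estimated monthly rate multiplied by `factor`."""
    active_months = months_available - lag_months
    if active_months <= 0:
        raise ValueError("start-up lag consumes the whole recruitment window")
    subjects_per_site = rate_estimate * factor * active_months
    return math.ceil(target_subjects / subjects_per_site)


# Hypothetical monthly rates: what three sites estimated vs. what they delivered.
factor = discount_factor([2.0, 1.5, 1.0], [1.0, 0.75, 0.5])

# Hypothetical plan: 300 subjects in 12 months, 3 months of approvals and
# contracting, and investigators estimating 1 subject per site per month.
sites = sites_needed(300, 12, 3, 1.0, factor)
print(f"discount factor = {factor:.2f}, sites needed = {sites}")
```

Without the discount and the lag, the same plan would call for 25 sites (300 / 12), which illustrates how unadjusted assumptions lead to overpromising on timelines.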

Continual remodeling during trial recruitment is also critical to identify and act upon deviations from the planned recruitment timelines and the impact of this on workload through changes in CRF volume and/or drug supply, as well as changes in the demographic profile of the final analysis population.

Sites that drive enrollment

So, how have these concepts been applied? Consider the following example. A stroke study required sites to be identified. The review of the historical data revealed three recent, similarly sized, large Phase III studies. Sites were distributed across six regions: Asia Pacific (AP), Central and Eastern Europe (CEEU), Western Europe (EU), Latin America (LA), North America (NA), and the Rest of World (RoW).

Figure 1. Recruitment rates from a recent stroke study of the top 10% of investigators.

The majority of investigators who participated in the three studies were in Europe and North America (258 out of 390 unique investigator names and addresses). The largest number of investigators were located in the United States. The median and interquartile ranges of the recruitment rates (subjects per site per month) within each region showed that Central and Eastern Europe had the highest recruitment rates followed by North America and Western Europe (see Figure 2). However, a graph plotting the recruitment rates (subjects per month) of each site sorted in descending order separates the top 10% of investigators (see Figure 1). These 40 investigators recruit at least one subject per month.
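The regional summary and the top 10% cut described above could be computed along the following lines. The sketch uses only the Python standard library; the site records are invented for illustration and are not the data behind Figures 1 and 2, and the "inclusive" quartile method is an assumption, since the article does not say how the interquartile range was calculated.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_by_region(sites):
    """Return {region: (median, Q1, Q3)} of recruitment rates.

    `sites` is a list of (region, subjects_per_site_per_month) pairs;
    each region needs at least two sites for quartiles to be defined.
    """
    by_region = defaultdict(list)
    for region, rate in sites:
        by_region[region].append(rate)
    summary = {}
    for region, rates in by_region.items():
        q1, _, q3 = quantiles(rates, n=4, method="inclusive")
        summary[region] = (median(rates), q1, q3)
    return summary


def top_decile(sites):
    """Sites in the top 10% by recruitment rate, highest first."""
    ranked = sorted(sites, key=lambda site: site[1], reverse=True)
    n_top = max(1, (len(ranked) + 9) // 10)  # ceiling of 10% of the sites
    return ranked[:n_top]


# Hypothetical (region, rate) records, not taken from the three stroke studies.
sites = [
    ("CEEU", 0.2), ("CEEU", 0.4), ("CEEU", 0.6), ("CEEU", 0.8), ("CEEU", 1.0),
    ("NA", 0.1), ("NA", 0.3), ("NA", 0.5),
    ("EU", 0.1), ("EU", 0.2),
]
summary = summarize_by_region(sites)
best = top_decile(sites)
print(summary["CEEU"], best)
```

The characteristics of the sites returned by `top_decile` (location, specialty, institution type) are then what one would inspect to draft a golden site profile.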

A review of these top 40 investigators revealed similarities. The most actively recruiting investigators are:

  • located in more densely populated cities/countries

  • specializing in neurology with an emphasis on geriatrics

  • practicing within a large or general hospital

  • located in countries/cities with an older population.

Figure 2. Use of historical data allows for a summary (Median and Interquartile Range) of recruitment rates for stroke studies by region.

These findings allow us to conclude that the successful recruitment of subjects in a similar stroke study rests on the ability to identify investigators who fulfill the above criteria.

Leveraging enrollment forecasts

The next case study illustrates how the recruitment information in a Phase III oncology study is modeled to forecast site numbers, recruitment timelines, CRF volume, resourcing, and endpoint occurrences. Feasibility surveys were distributed by locally trained and experienced CRA staff to a number of potential sites identified jointly by Covance and a client from past oncology studies. Investigators were provided with a confidentiality agreement followed by a protocol summary and targeted feasibility survey designed to facilitate discussions between the investigators and their local CRAs. Sites were subsequently ranked into three categories based on:

  • responses to the feasibility survey

  • past performances in this oncology indication

  • input from the affiliate offices.

The final list of selected sites was then prepared by assessing the impact of the inclusion of each site/country on timelines.

A projection of cumulative subject recruitment was derived from the estimated site activation times at each site within each country and the recruitment estimates provided by each investigator at each site so that the final timelines and number of sites required could be agreed upon by project team members.

The first critical milestone for the project was the randomization of the first 80 subjects into the study so that the DSMB could review the safety data. Two projections were prepared based on the time for each site to be activated and each investigator's estimate of recruitment—one projecting the completion of the randomization of 650 subjects and the other the randomization of the first 80 subjects. As the study progressed, the actual recruitment rates at each site were used to project their own future recruitment potential. These revised recruitment rates were then used to reproject the randomization of the 650 subjects as well as the first 80 subjects; CRF flow; monitoring and data management workloads; and drug demand on a site, country, depot, and study level.
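A minimal sketch of this kind of projection and reprojection is shown below. It assumes each site recruits at a constant monthly rate from its activation month onward, which is a simplification; the article does not describe the actual model. Site values, the target of 10 subjects, and the function names are hypothetical.

```python
def cumulative_projection(sites, horizon_months):
    """Expected cumulative subjects at the end of months 1..horizon_months.

    `sites` is a list of (activation_month, subjects_per_month) pairs,
    with activation_month counted from study start (month 0).
    """
    totals = []
    for month in range(1, horizon_months + 1):
        total = 0.0
        for activation_month, rate in sites:
            months_active = max(0, month - activation_month)
            total += rate * months_active
        totals.append(total)
    return totals


def month_target_reached(totals, target):
    """First month (1-based) in which the cumulative total meets the target."""
    for month, total in enumerate(totals, start=1):
        if total >= target:
            return month
    return None


def reproject(sites, observed_rates):
    """Swap in observed rates (keyed by site index) where actual data exist."""
    return [
        (activation, observed_rates.get(index, rate))
        for index, (activation, rate) in enumerate(sites)
    ]


# Hypothetical investigator estimates: (activation month, subjects per month).
estimated = [(0, 2.0), (1, 1.0), (3, 1.0)]
planned_month = month_target_reached(cumulative_projection(estimated, 6), 10)

# Hypothetical actual rates observed so far at the first two sites.
revised = reproject(estimated, {0: 4.0, 1: 2.0})
revised_month = month_target_reached(cumulative_projection(revised, 6), 10)
print(planned_month, revised_month)
```

Running the same calculation once for an interim milestone and once for the full sample gives the two projections described above; rerunning it as actual rates accumulate gives the reprojection.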

The first subject was randomized at the end of June 2004, and by the end of August 2004 it was apparent that the 80th subject would be randomized by October 22, 2004—five weeks ahead of the original schedule. The ongoing reprojection allowed the project team to ensure that all forms were retrieved and processed in a timely manner so as to avoid or mitigate any potential delays to the DSMB review.

Though arguably a luxury problem, the faster than originally anticipated recruitment of subjects presented a new set of challenges, specifically an increase in the demand for drug supply coupled with the volume of work now being compressed over a shorter period of time. The ongoing and regular reprojection of subject recruitment on an individual site level has allowed and continues to allow the project team to ensure that there is sufficient drug to meet the recruitment needs by shipping it to the depot and then from the depot to the site. In an extreme situation, sites can temporarily pause recruitment to ensure that all subjects randomized into the study have access to the drug.

High-recruiting sites were identified and their recruitment trends observed to ensure that data quality was maintained and that there were sufficient monitoring resources to retrieve the data. As a result, three sites, which had originally estimated randomizing no more than 12 subjects per year but actually randomized 16 or more subjects within six weeks of activation, were identified and managed by capping recruitment on a site level; all three sites were also audited. It became apparent in October 2004 that the randomization of all 650 subjects could potentially be completed by the time the sites in the country with the slowest regulatory approval were expected to be activated. This prompted a decision in January 2005 to cancel the last Investigator Meeting, as there appeared to be little benefit in activating these sites.
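A simple screen for sites recruiting far above their own estimates might look like the sketch below. The annualization, the threshold ratio of 2.0, and the site names are assumptions made for illustration; the article does not state what rule the project team used to select sites for capping.

```python
def flag_overperformers(sites, ratio_threshold=2.0):
    """Names of sites whose annualized recruitment is at least
    `ratio_threshold` times their original yearly estimate.

    `sites` is a list of
    (name, estimated_per_year, randomized_so_far, weeks_active) tuples.
    """
    flagged = []
    for name, estimated_per_year, randomized, weeks_active in sites:
        if weeks_active <= 0 or estimated_per_year <= 0:
            continue  # no basis for comparison yet
        annualized = randomized * 52 / weeks_active
        if annualized / estimated_per_year >= ratio_threshold:
            flagged.append(name)
    return flagged


# Hypothetical sites; the first mirrors the pattern described in the text
# (estimate of 12 per year, 16 randomized within six weeks).
sites = [
    ("Site A", 12, 16, 6),
    ("Site B", 12, 1, 6),
    ("Site C", 6, 0, 0),
]
flagged = flag_overperformers(sites)
print(flagged)
```

Flagged sites are candidates for closer monitoring, a recruitment cap, or an audit, depending on the project team's judgment.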

A projection of the time until all endpoints have occurred (and hence clinical cut-off is declared) was prepared, based on the actual and projected subject recruitment (see Figure 3) and the sample size calculations in the protocol. The actual number of endpoints is compared against the projected number to ensure that the timeline for clinical cut-off can be more accurately predicted and that the study is progressing according to the expectations described in the protocol. Fewer than expected endpoints occurring may result in the time to clinical cut-off being extended and more data being monitored and processed. Conversely, more endpoints than expected may result in a shorter time to clinical cut-off and less data to monitor and process.

Figure 3. Overall projected recruitment with confidence intervals (CI), a range that shows the likely range of outcomes, including capping of sites/countries (date of update: December 10, 2004).
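The article does not say how the confidence intervals in Figure 3 were derived. As one possible illustration, the sketch below treats total recruitment as approximately Poisson distributed, so that the variance equals the expected total, and uses a normal approximation for a 95% interval. Site caps are applied to the expected value only, which makes the interval rough for heavily capped sites. All rates, caps, and names are hypothetical.

```python
import math


def expected_total(sites, months):
    """Expected subjects after `months`, honoring optional per-site caps.

    `sites` is a list of (subjects_per_month, cap) pairs; cap may be None.
    """
    total = 0.0
    for rate, cap in sites:
        expected = rate * months
        if cap is not None:
            expected = min(expected, cap)
        total += expected
    return total


def recruitment_interval(expected, z=1.96):
    """Approximate interval assuming Poisson-like variability
    (variance equal to the mean) and a normal approximation."""
    half_width = z * math.sqrt(expected)
    return max(0.0, expected - half_width), expected + half_width


# Hypothetical sites: one capped at 10 subjects, two uncapped.
sites = [(2.0, 10), (1.0, None), (0.5, None)]
expected = expected_total(sites, months=8)
low, high = recruitment_interval(expected)
print(f"expected {expected:.0f} subjects, approx. 95% CI {low:.1f} to {high:.1f}")
```

A simulation that draws site-level counts and applies caps to each draw would give a more faithful interval at the cost of more code.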

The CRF flow and the monitoring and data management workloads are then derived from the subject recruitment and endpoint forecasts. Subject recruitment and endpoint forecasts also allow the derivation of laboratory kit volume, investigator budget payments, and drug demand.
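To make the derivation concrete, the sketch below turns a monthly recruitment forecast into expected CRF page volume and a rough monitoring workload. The visit schedule, pages per visit, and pages a CRA can review per day are hypothetical parameters, and the sketch ignores withdrawals and endpoint-driven changes in follow-up.

```python
import math


def crf_pages_per_month(new_subjects_per_month, pages_per_visit, visit_offsets):
    """Expected CRF pages generated in each study month.

    `visit_offsets` lists the months after randomization at which each
    subject has a visit (0 = the month of randomization).
    """
    horizon = len(new_subjects_per_month) + max(visit_offsets)
    pages = [0] * horizon
    for start_month, subjects in enumerate(new_subjects_per_month):
        for offset in visit_offsets:
            pages[start_month + offset] += subjects * pages_per_visit
    return pages


def cra_days_per_month(pages, pages_per_cra_day):
    """Monitoring days needed each month, rounded up to whole days."""
    return [math.ceil(p / pages_per_cra_day) for p in pages]


# Hypothetical forecast: 2 subjects randomized in month 0, 3 in month 1;
# visits at randomization, month 1, and month 3; 10 CRF pages per visit.
pages = crf_pages_per_month([2, 3], pages_per_visit=10, visit_offsets=[0, 1, 3])
cra_days = cra_days_per_month(pages, pages_per_cra_day=40)
print(pages, cra_days)
```

The same pattern, a forecast multiplied through a per-subject schedule, extends to laboratory kits, investigator payments, and drug demand.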


The need for accurate, proactive trial planning and trial delivery of high-quality data—on time and without surprises—has never been greater. The competitive environment for subjects and site resources, together with the need to bring medicines to market sooner, is driving accelerated recruitment timelines.

While previously it might have been appropriate and relatively easy to alter countries and/or site numbers during the trial if original plans failed to meet expectations, today such a midstream change is far less viable, if it is viable at all, because of its impact on the overall timelines. In other words, simply reacting to enrollment challenges midtrial isn't enough to ensure success. Instead, a proactive, predictive approach to enrollment is required, based on quantitative forecasts of trial performance.

Forecasting models are only as good as the assumptions made. It is necessary to be as realistic as possible when making initial assumptions and to constantly re-evaluate original forecasts as the trial unfolds in order to track both negative and positive deviations and proactively adjust not only enrollment plans but also downstream resource management plans accordingly.

Michelle Jones, MSc, CStat, is senior director, information modeling, operational strategy & planning group, late stage development services, Covance, 210 Carnegie Center, Princeton, NJ 08540-6623. Stephen Jones,* MSc, CStat, is global director, statistical sciences, late stage development services, Covance, email:

*To whom all correspondence should be addressed.
