This study is grounded on the idea that in order to improve study startup performance, you must first be able to identify those key drivers of performance.
Study startup is the most challenging and important stage of any clinical trial. At the same time, study startup has the lowest performance scores and the greatest variation in performance of any stage of a clinical trial.1 For clinical study managers, the key to high-performing studies is appropriate governance.2,3 That is, they should be able to track performance as the study progresses in order to manage it appropriately.4 But of the hundreds of activities involved in starting a clinical trial, which key indicators should a clinical study manager use to know whether study startup is going well? This issue is important because clinical study managers worry about their “blind spots,” the seemingly innocuous issues that come back to undermine the trial. When a trial is outsourced, governance issues are magnified as clinical trial managers try to assess performance across operational boundaries.
CROs focus on study startup as well. They want to deliver high levels of performance on their contract, but may be unsure of what to emphasize in the flurry of startup activity. This decision is also made more complex by a myriad of clinical, financial, logistical, and commercial concerns. CROs need objective evidence of the key drivers of study startup in order to meet sponsor expectations and make rational service quality investments.
The central thesis of this paper is that, of all the variables involved in a study startup, we can identify the key drivers that have a disproportionate impact on study startup performance. The goal is to enhance the effectiveness and efficiency of trial monitoring and governance and to give clinical study managers a tool with which they can manage clinical trial performance.
How we did this research
In order to identify the key drivers of study startup, we took a two-stage approach to the research. We first conducted a series of interviews with experienced clinical trials managers to identify all of the drivers of study startup performance and then statistically examined the relationship of each driver to study startup performance in a survey. This research is part of the Clinical Trials Outsourcing Performance (C-TOP) study, an ongoing collaboration between Drexel University and CRO Analytics. The research was conducted under Drexel University IRB approval.
For the first stage, we interviewed three dozen pharmaceutical and CRO executives with substantial experience in managing clinical trials. Each respondent was asked to identify the key drivers of clinical trial performance, why they thought each driver was important, and what actions by the clinical study team typically lead to success on that driver. The first key indicator mentioned by most of our respondents was the ability to adhere to operational timelines, such as those for the project and operations plans, the study specific convention, forms and documents, regulatory submission, IT system setup, and the feasibility assessment.
Adhering to timelines was also mentioned in the context of managing investigator recruitment. Even though this factor could be categorized under the general ‘timeliness’ theme above, most of our respondents felt that it belonged with the other investigator-related factors, such as the ability to identify qualified investigators, adherence to a timeline for recruiting investigators, and investigator meetings.
Many of our respondents noted that timelines are very context-dependent. Just measuring the amount of time that it takes to complete an action provides inadequate information. Consider a trial in which it takes 21 days to complete the study specific convention. Is this good performance? The answer, of course, is that it depends on the study. Respondents felt that each study is unique enough that performance assessment items should be constructed in a way that accounts for the expectations associated with that trial.
Finally, the respondents in our interviews identified two key roles that drove study startup performance – the project manager and the clinical research associates (CRAs). For project managers, it is important that they (1) be knowledgeable about how to conduct clinical trials, the specific details of the client’s trial, and good clinical practice and regulatory issues, and (2) have good interpersonal skills, in the sense that they provide timely and effective communication, collaborate well, are proactive problem solvers, and can recommend effective solutions. The respondents did not delve into the specifics of CRA performance, so we created a global assessment for that role.
We continued to conduct interviews until we felt we were not gaining any new insights – a total of 36 interviews between October of 2011 and July of 2012. We illustrate the relationships as described in our interviews as they appear in the final validated model in Figure 1.
At this point, we organized our insights into a series of questions to assess how the clinical study team performed on each of the performance factors. We were careful to construct the items in a way that was consistent with our respondents’ emphasis and depth of detail. The questions were organized into a survey instrument and edited for clarity. We then asked ten executives to review the instrument for coverage of important drivers, the level of detail of the questions, and clarity. We then posted the survey online and again reviewed the items for clarity and ease of online presentation.
In order to statistically validate the instrument, we solicited subjects via email in a purposive sample using the online tool. Table 1 shows how we measured each of the constructs. We received 65 completed instruments from respondents with an average of 16.5 years of experience in the pharmaceutical industry and 9.5 years of experience overseeing CRO contracts. The trials that the subjects evaluated were from a variety of phases (Phase 2-22%; Phase 3-35%; Phase 4-43%), specialties (Neuro, Resp, Onc, CV, DM, ID) and regions (N. America, Europe, Asia, Russia, S. America). The overall average of study startup performance was 5.6 (on a scale of 1 – 10) and was lower compared to the sales and contracting (6.2), study conduct (6.0), and study closeout (6.6) stages.
The model in Figure 1 was estimated using Partial Least Squares (PLS) path analysis in SmartPLS. We subsequently describe this analysis as the predictive model. As covariates, we included exogenous paths between project manager performance and operational timelines, adhering to timelines for recruiting investigators, investigator meetings, and CRA performance (not illustrated in Figure 1). PLS provides regression coefficients to estimate the structural relationships between the performance factors and loadings for the measurement relationships. All of the predictors were on a 1 to 10 scale and centered prior to estimation, so each coefficient describes the effect of the predictor on startup performance at the average of all the other factors. The coefficient estimate, for example, for adhering to operational timelines is .45. This is interpreted as ‘a 1-unit (or 10%) increase in adhering to operational timelines gives a .45-unit (or about 4.5%) increase in study startup performance at average levels of project manager performance, investigator recruiting, etc.’
The model performed well; a substantial number of the paths were significant and the model explained high levels of variance in the endogenous variables (project manager R2 = 81%; operational timelines R2 = 71%; study startup performance R2 = 73%). The two reflective variables (project manager knowledge and interpersonal skills) demonstrated discriminant and convergent validity. We originally thought that the investigator-related variables (ability to identify qualified investigators, timeline for recruiting investigators, and investigator meetings) would constitute a single ‘investigator’ latent variable. This measurement structure exhibited poor psychometric properties and so the variables were broken out as individual contributors to study startup performance.
What We Found
The means and standard deviations for the variables used in the statistical analysis are shown in Table 2. Three findings are important in this analysis. First, the performance scores are fairly low. The average performance across all of the measures was 6.1 – lower than what we expected given all of the attention and effort given to assessing clinical trial performance in recent years. Second, there was wide variation in performance. The average standard deviation was 2.28, meaning that about 2/3 of the performance scores fell between 3.8 and 8.4 and the other 1/3 of the scores fell outside these bounds. The third finding was that all of the variables had a positive correlation with study startup performance. While this is not surprising in that all of our expert respondents in the interviews described these as being important to performance, these positive correlations take on more significance in light of the results of our predictive model, which we discuss next.
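The ±1 standard deviation bounds quoted above follow from a short calculation. As a sketch (the mean and standard deviation are the article's reported values; the "about 2/3" share assumes roughly normally distributed scores):

```python
# Reproduce the +/-1 SD bounds for the performance scores (1-10 scale).
# Values are the article's reported averages; the "about two-thirds
# within one SD" interpretation assumes approximately normal scores.
mean_score = 6.1   # average performance across all measures
sd_score = 2.28    # average standard deviation

lower = mean_score - sd_score
upper = mean_score + sd_score
print(f"~2/3 of scores expected between {lower:.1f} and {upper:.1f}")
# prints "~2/3 of scores expected between 3.8 and 8.4"
```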
Contrary to the results of the interviews and correlations, the predictive model did not show that all of the drivers have a positive and significant impact on startup performance. Of the seven key drivers of study startup performance, two were substantial and significant (operational timelines and project manager performance), two were significant and positive (personnel aligned with proposal and adhering to timelines for recruiting investigators), two were significant and negative (ability to identify qualified investigators and investigator meetings), and one was insignificant (CRA performance). These drivers are illustrated in Figure 2.
The explanation for the differing results is that the two analytical techniques answer different questions. The correlations only consider the individual drivers in isolation. That is, they only consider the isolated effect of each variable on performance without considering the presence of the other variables. The predictive model estimates the relationship between each of the predictors and study startup performance at the average levels of all the other variables. Adhering to operational timelines, for example, has an estimated coefficient of .45. This means that a 1-unit (10%) increase in meeting operational timelines yields a .45-unit (about 4.5%) increase in study startup performance at the average of project manager performance, recruiting investigators, etc. As a result, the predictive model better describes the environment in which managers must make decisions about clinical trials.
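Because the predictors are centered, this interpretation reduces to simple arithmetic. The sketch below is illustrative only (the coefficient is the article's reported estimate; the helper function is ours, not part of the SmartPLS analysis):

```python
# Illustrative translation of a centered PLS path coefficient into a
# predicted change in study startup performance on the 1-10 scale.
def predicted_change(coefficient: float, predictor_increase: float) -> float:
    """Predicted change in startup performance for a given increase in a
    predictor, holding all other (centered) predictors at their averages."""
    return coefficient * predictor_increase

# A 1-unit (10% of the scale) gain in adhering to operational timelines:
delta = predicted_change(coefficient=0.45, predictor_increase=1.0)
print(f"{delta:.2f} units, i.e. about {delta * 10:.1f}% of the 10-point scale")
# prints "0.45 units, i.e. about 4.5% of the 10-point scale"
```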
Adhering to operational timelines (coeff= .45, sig= 3.09, p< .001) had the greatest impact on study startup performance. The operational timelines that drove this impact on performance were the project & ops plans (coeff= .16, sig= 7.43, p< .001), study specific convention (coeff= .15, sig= 4.33, p< .001), forms and documents (coeff= .12, sig= 2.94, p= .003), and the regulatory submissions (coeff= .05, sig= 3.25, p= .001). Adhering to the timelines for the feasibility assessment (coeff= .03, sig= 1.39, p= .16) and the IT system setup (coeff= -.01, sig= .18, p= .86) did not have a significant impact on performance.
The project manager’s performance (coeff= .33, sig= 3.32, p= .005) was another substantial and significant predictor of study startup performance. Both the knowledge of the project manager and their interpersonal skills drove this positive impact on performance. Project manager knowledge was assessed by asking about their general knowledge of conducting clinical trials (loading= 10.9, sig= 2.07, p= .04), specific knowledge about the details of this trial (loading= 15.1, sig= 2.14, p= .04), and their knowledge of GCP and regulations (loading= 10.8, sig= 1.96, p= .05). Project manager interpersonal skills were assessed by evaluating their ability to provide timely and effective communication (loading= 16.9, sig= 2.53, p= .02), collaboration skills (loading= 19.0, sig= 2.69, p= .01), proactive problem solving (loading= 21.3, sig= 2.77, p= .01), and ability to recommend effective solutions for this trial (loading= 20.5, sig= 2.66, p= .01).
Aligning personnel to what was promised in the proposal (coeff= .24, sig= 2.05, p= .04) and adhering to the timeline for recruiting investigators (coeff= .22, sig= 2.59, p= .01) both were significant and positive predictors of study startup performance. We originally modeled all of the investigator variables (identifying, recruiting, and meetings) as a single (formative) latent variable. Instead, we found that each of these variables exerted independent effects on startup performance that were best modeled independently. We did not include the timeline for recruiting investigators with the other operational timelines based on the feedback from our respondents in phase 1 of the study, who felt that the timeline for recruiting investigators was qualitatively different from the other operational timelines.
Negative Effects from the Ability to Identify Investigators and from Investigator Meetings
To our surprise (and presumably to the experienced managers who suggested these variables), the other investigator variables had negative influences on study startup performance. The ability to identify qualified investigators (coeff= -.18, sig= 2.35, p= .02) had a negative and significant effect on study startup performance. This means that for every 1-unit (or 10%) increase in the CRO’s ability to identify qualified investigators, study startup performance actually declined by about 2%. Our model does not identify a reason for this unexpected finding, but we speculate that an overdeveloped capacity for identifying investigators might inhibit a more collaborative search utilizing the resources of both sponsor and vendor.
The negative effect from investigator meetings (coeff= -.16, sig= 2.13, p= .04) is perhaps more obvious. For clinically active investigators, the meeting is likely to be perceived as intrusive and as an additional administrative burden. Negative perceptions increase as more time and effort are devoted to these investigator meetings. Finally, CRA performance (coeff= -.04, sig= .53, p= .59) had an insignificant effect on startup performance.
While everyone wants a positive start to a clinical trial, study startup has the worst performance of any stage of a trial. In this study, we have found that it is possible to isolate the startup drivers (personnel aligned with proposal, operational timelines, adhering to timelines for recruiting investigators, and project manager performance) that are positively associated with study startup performance from those that have no effect (CRA performance) or negative effects on performance (ability to identify qualified investigators and investigator meetings).
We are not arguing that clinical study teams should not try to identify qualified investigators or hold investigator meetings. These activities are necessary, and the experience of our manager interviews and the positive correlations confirm that they are worthwhile. When it comes to improving startup performance, however, it is the four positive drivers that will change startup performance. This finding is important because executives in the real world must make decisions at the margin. In a world with limited time and resources available to monitor clinical trials, they must decide which variables to track in order to know how their clinical study team is performing or how a vendor is performing on a contract.
Is there really a need for additional metrics, like those described in this study, when dozens of operational metrics are already in common use? We believe that the metrics we describe in this paper are a necessary complement to operational metrics, which often lack validity. Imagine that a clinical study manager finds that it took 42 days to recruit investigators. Does that mean the clinical study team is performing well or poorly? The answer is that ‘it depends’ – on the particular type of study, the panel of available investigators, the presence of other large trials in that area, etc. In isolation, it is hard to know what a number like ‘42’ means. Even if you have benchmarks, they will only tell you about average performance in similar trials, not whether those were high-performing trials. In summary, it is critical to have performance metrics like those described in this study in order to understand the meaning of operational metrics.
CROs, for their part, must decide where to spend the next dollar to improve performance. The results of this study allow them to prioritize their investments where they will have the maximal effect (aligning personnel to proposal, operational timelines, project manager performance, and adhering to recruiting timelines) and to avoid investments that will have insignificant (CRA performance) or negative effects (ability to identify qualified investigators and investigator meetings). Based on the results of this study, these performance investments can even be fine-tuned to maximize the return. For example, improving operational timelines (coeff= .45) by 10% will have twice the effect on performance compared to improving the timelines for recruiting investigators (coeff= .22).
In the future, regulators will want to know that sponsors have some rational monitoring function in place.5 The results of this research provide a scientific and validated approach to monitoring study startup.
Michael J Howley PA-C, PhD, Associate Clinical Professor, LeBow College of Business, Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, [email protected]
Peter Malamis MBA, CEO, CRO Analytics, LLC, 6139 Stoney Hill Road, New Hope, PA 18938, [email protected]