Study startup underperformance and inefficiency have been a problem for decades for executives who manage clinical research trials. To address this issue, we performed a two-phase research project to identify the key drivers of study startup performance. In the first phase, we interviewed clinical trial managers and asked them to identify all of the drivers of study startup performance. The data were synthesized into a survey instrument that was tested in the second phase of the study. Statistical analysis demonstrated that the drivers did not all contribute equally to study startup performance. Four of the drivers were significantly and positively associated with study startup performance (adhering to operational timelines, adhering to recruiting timelines, aligning the study personnel with the proposal, and project manager performance); one of the drivers had no effect on performance (how well the clinical research associates performed); and two drivers were negatively associated with performance (the ability to identify qualified investigators and managing investigator meetings). The results are important for two groups. Managers responsible for study startup oversight can use these results to assess clinical study team performance and better manage ongoing trials. For organizations that conduct clinical research, these results can guide investments in study startup performance.
Study startup is the most challenging and important stage of any clinical trial. At the same time, startup has the lowest performance scores and the greatest variation in performance of any stage of a clinical trial.1 For clinical study managers, the key to high-performing studies is appropriate governance.2,3 That is, they should be able to track performance as the study progresses in order to manage it appropriately.4 But of the hundreds of activities involved in starting a clinical trial, which are the key indicators that they should watch to know whether the study startup is going well? This issue is important because busy clinical study managers worry about their “blind spots,” the seemingly innocuous issues that come back to undermine the trial. When a trial is outsourced, governance issues are magnified as clinical trial managers try to assess performance across organizational boundaries.
CROs also focus on study startup. They want to deliver high levels of performance on their contracts, but may be unsure of what to emphasize in the flurry of startup activity. This decision is made more complex by myriad clinical, financial, logistical, and commercial concerns. CROs need objective evidence of the key drivers of study startup in order to meet sponsor expectations and make rational service quality investments.
The central thesis of this paper is that, of all the variables involved in a study startup, it is possible to identify the key drivers that have a disproportionate impact on study startup performance. The goal is to enhance the effectiveness and efficiency of trial monitoring and governance, and to give clinical study managers a tool with which they can assess and manage clinical trial performance.
Overview of the research
In order to identify the key drivers of study startup, we took a two-phase research approach. In the first phase, we sought to identify all of the important drivers of study startup performance by asking a broad variety of experienced clinical trial managers to identify the performance drivers. This approach reduces subjectivity bias because the identified drivers will not be our opinion, but that of a broad group of experts. In the second phase, we compared all of these drivers in a statistical model to see which of the drivers from phase one had the most substantial and significant impact on study startup performance. This research is part of the Clinical Trials Outsourcing Performance (C-TOP) study, an ongoing collaboration between Drexel University and CRO Analytics examining all of the phases of clinical trial performance. The research was conducted under Drexel University IRB approval.
Phase One Qualitative Interviews
For the first research phase, we interviewed three dozen pharmaceutical and CRO executives between October 2011 and July 2012 who had substantial experience in managing clinical trials. The goal of phase one was to identify a broad perspective on all of the potentially important drivers of study startup. We wanted a variety of views in order to reduce potential bias from tapping into a single perspective of clinical trials measurement. We continued to conduct interviews until we felt we were not gaining any new insights. Both authors were present for each interview and each took notes to record the responses. Each respondent was first asked to describe what they considered to be the important drivers of study startup performance. We encouraged respondents to list as many drivers as they thought affected performance to obtain the most comprehensive list possible. In order to associate each factor with performance, we asked them why each factor was important and what actions by the clinical study team were associated with high performance in that factor.
The raw data that resulted from the interviews was a two-column list for each interview: the drivers on the left and the actions that led to successful performance of that driver on the right. In the interviews, we found that respondents began to spontaneously create associations (either a correlation or a time dependency) between the drivers. For data management, we continued this process of synthesizing and re-ordering the data by associating drivers with each other and then linking them to their performance activities. The ultimate result of this process was a list of performance drivers along with the activities associated with each driver.
Phase One Results
Adhering to timelines was the performance driver that most of our respondents mentioned early in the interview. The timelines fell into two distinct areas: recruiting timelines and operational timelines. Adhering to recruiting timelines refers to the ability to bring investigators into the clinical trial in the planned amount of time. (Note: There was a mix of opinions, but most of the respondents felt that recruiting investigators was a function of study startup and that recruiting patients belonged with the study conduct stage of clinical trials. We accept that division and explore the relationship of investigator and patient recruitment to performance separately.) The operational timelines assess the ability of the clinical study team to meet the planned milestones in a timely fashion. Examples of operational timelines in study startup include the creation of the project and operations plans, study-specific conventions (i.e., the definitions for terms in the study, such as what ‘high cholesterol’ means in that trial), and forms and documents (e.g., case report forms, informed consent forms, or patient enrollment forms). Study startup operational timelines also include activities such as IT system setup and submitting the appropriate documents to regulatory agencies (IRB forms, FDA forms, or clinicaltrials.gov registration). We recognize that the recruiting timelines are a subset of the operational timelines, but our subjects thought that the recruiting timelines were important enough to justify being assessed separately.
The respondents felt that the ability to identify qualified investigators was related to recruiting timelines, but was important enough to be considered separately. Since so much of the success of a clinical trial depends on high-performing investigators, our respondents thought it was important to be able to recruit and then engage investigators who would not only be clinically qualified, but also have a stable of eligible patients and the organizational skills to conduct a clinical trial.
Many of our respondents noted that assessing performance on timelines is very contextual. It is important, in their view, to capture expectations for that trial within the assessment. Simply measuring the number of days, for example, that it takes to complete an activity does not capture the context. Consider a trial in which it takes 75 days to recruit subjects. Is this good performance? The answer, of course, is that it depends on the study. Respondents felt that performance assessment metrics should be constructed in a way that accounts for the expectations associated with that particular trial.
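The point about context can be illustrated with a small sketch. The 75-day figure comes from the example above; the planned durations, the `timeline_ratio` helper, and the idea of using a simple actual-to-planned ratio are hypothetical and are not the measurement approach used in this study, which relied on survey ratings.

```python
def timeline_ratio(actual_days: float, planned_days: float) -> float:
    """Actual duration relative to the plan for this trial.

    Values above 1.0 mean the activity ran longer than planned.
    """
    if planned_days <= 0:
        raise ValueError("planned_days must be positive")
    return actual_days / planned_days


actual = 75  # days, the figure used in the text

# Hypothetical plans for two different trials (assumed values).
for planned in (60, 90):
    ratio = timeline_ratio(actual, planned)
    status = "behind plan" if ratio > 1 else "on or ahead of plan"
    print(f"planned {planned} days, actual {actual}: ratio {ratio:.2f} ({status})")
```

The same 75 days reads as a delay against a 60-day plan and as an early finish against a 90-day plan, which is why a raw day count alone says little about performance.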
Maintaining consistent study team personnel was an important factor for our respondents. Within the context of study startup, this meant that the clinical study team members who began the trial were the same as those proposed in the planning stages of the trial. Having different people show up at the start of the trial often meant that there would be skills gaps and instability in the team.
Our respondents felt that investigator meetings were an important milestone in study startup. These meetings were an opportunity to properly orient the clinicians and their staff to the trial and were a key factor in making sure the trial was conducted according to the specifications.
Finally, the respondents in our interviews identified two key roles that drove study startup performance: the project managers and the clinical research associates (CRAs). For project managers, it is important that they (1) be knowledgeable about how to conduct clinical trials, about the specific details of the client’s trial, and about good clinical practice and regulatory issues and (2) have good interpersonal skills, in the sense that they provide timely and effective communication, collaborate well, are proactive problem solvers, and are able to recommend effective solutions.
CRAs work closely with the clinical sites to ensure that they adhere to the study protocol, to help administrative functions flow smoothly, and to assure the validity of the trial, although aspects of CRA performance vary by the type of study. In addition, there is a paradoxical aspect to CRA performance. High-functioning CRAs are adept at integrating themselves into the clinical environment and avoiding disruption. As a result, we used a global assessment of CRA performance.
Figure 1 graphically illustrates the relationships in the drivers of study startup performance that resulted from phase one.
At this point, we organized our insights into a series of questions to assess how the clinical study team performed. Where possible, we tried to organize the questions in the order in which the activities were performed over the course of study startup. We were careful to construct the items in a way that was consistent with our respondents’ emphasis and depth of detail. The questions were organized into a survey instrument and edited for clarity. We then asked 10 executives to review the instrument for coverage of important drivers, the level of detail of the questions, and clarity. We then posted the survey online and again reviewed the items for clarity and ease of online presentation.
Phase Two: Statistical Analysis
The purpose of the second phase was to compare all of the drivers in a statistical model to identify which had the most significant and substantial effects on study startup performance. We solicited subjects via email in a purposive sampling using an online survey. Table 1 shows how we measured each of the constructs. We received 65 completed instruments from respondents with an average of 16.5 years of experience in the pharmaceutical industry and 9.5 years of experience overseeing CRO contracts. The trials that the subjects evaluated were from a variety of phases (Phase II, 22%; Phase III, 35%; Phase IV, 43%), specialties (Neuro, Resp, Onc, CV, DM, ID), and regions (North America, 80%; Europe, 57%; Asia, 26%; Russia, 28%; and South/Central America, 30%; the regional percentages add up to more than 100% because many of the trials covered multiple regions). The overall average of study startup performance was 5.6 (on a scale of 1 to 10), which was lower than the averages for the sales and contracting (6.2), study conduct (6.0), and study closeout (6.6) stages.
The statistical approach that we used to test the model in Figure 1 is partial least squares (PLS) path analysis. PLS provides regression coefficients to estimate the relationships between each variable and study startup performance at the average of all the other factors. The coefficient estimate for adhering to operational timelines, for example, is .45. This is interpreted as ‘a 1-unit (or 10%) increase in adhering to operational timelines gives a .45-unit (or about 4½%) increase in study startup performance at average levels of project manager performance, investigator recruiting, etc.’
Several statistical indicators suggest that the model performed well. A substantial number of the paths were significant and the model explained high levels of variance in the important variables (project manager R² = 81%; operational timelines R² = 71%; study startup performance R² = 73%). The two project manager variables (i.e., knowledge and interpersonal skills) requiring detailed indicators demonstrated discriminant and convergent validity. We originally thought that the investigator-related variables (ability to identify qualified investigators, timeline for recruiting investigators, and investigator meetings) would constitute a single ‘investigator’ latent variable. This measurement structure did not work well, so the variables were broken out as individual contributors to study startup performance.
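The interpretation of the coefficient can be written out as a short calculation. This is only a sketch of the arithmetic: the .45 coefficient and the convention that 1 unit on the rating scale is about 10% come from the text, while the function names are illustrative.

```python
# Coefficient reported in the article for adhering to operational timelines.
OPERATIONAL_TIMELINES_COEFF = 0.45
SCALE_POINTS = 10  # ratings were given on a 1-10 scale


def expected_change(coeff: float, driver_change_units: float) -> float:
    """Expected change in study startup performance (in scale units),
    with the other drivers held at their average levels."""
    return coeff * driver_change_units


def as_percent_of_scale(units: float, scale_points: int = SCALE_POINTS) -> float:
    """Express a change in scale units as a percentage, following the
    article's convention that 1 unit is about 10%."""
    return units / scale_points * 100


delta = expected_change(OPERATIONAL_TIMELINES_COEFF, 1.0)
print(f"{delta:.2f} units, about {as_percent_of_scale(delta):.1f}% of the scale")
```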
What We Found
The means, standard deviations, and correlations with study startup performance are shown in Table 2. Three findings in this table are important. First, the performance indicators are fairly low. The average performance across all of the measures was 6.1, lower than we expected given all of the attention and effort devoted to assessing clinical trial performance in recent years. Second, there was wide variation in performance. The average standard deviation was 2.28, meaning that about ⅔ of the performance scores fell between 3.8 and 8.4 and the other ⅓ fell outside these bounds. Third, all of the drivers had a positive correlation with study startup performance. While this is not surprising, in that all of our expert respondents in the interviews described these drivers as important to performance, these positive correlations take on more significance in light of the results of our statistical model, which we discuss next.
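The bounds quoted above follow from taking one standard deviation on either side of the mean. The sketch below reproduces that arithmetic; the "about ⅔" share additionally assumes that the scores are roughly normally distributed, which is an assumption made here for illustration and is not stated in the article.

```python
from statistics import NormalDist

mean_score = 6.1  # average performance across all measures (from the text)
avg_sd = 2.28     # average standard deviation (from the text)

# One standard deviation below and above the mean.
lower = mean_score - avg_sd
upper = mean_score + avg_sd

# Share of a normal distribution lying within one SD of its mean
# (assumption: approximately normal scores).
standard_normal = NormalDist()
share_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"bounds: {lower:.1f} to {upper:.1f}")
print(f"share within one SD under normality: {share_within_one_sd:.2f}")
```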
Contrary to the results of the interviews and correlations, not all of the drivers had a positive and significant impact on startup performance. Of the seven drivers of study startup performance we tested, two were substantial and significant (operational timelines and project manager performance), two were significant but less substantive (personnel aligned with proposal and adhering to timelines for recruiting investigators), two were significant and negative (ability to identify qualified investigators and investigator meetings), and one was insignificant (CRA performance). These drivers are illustrated in Figure 2.
The explanation for the differences in the results of the correlations and statistical model is that they are different analytical techniques. The correlations only consider the individual drivers in isolation. That is, they only consider the isolated effect of that variable on performance without considering the presence of the other variables. The statistical model estimates the relationship between each of the predictors and study startup performance at the average levels of all the other variables. Adhering to operational timelines, for example, has an estimated coefficient of .45. This means that a 1-unit (10%) increase in meeting operational timelines yields a 4½% increase in study startup at the average of project manager performance, recruiting investigators, etc. As a result, this statistical model better describes the environment in which managers must make decisions about clinical trials.
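A small constructed example shows how a driver can correlate positively with performance on its own and still carry a negative coefficient once another driver is in the model. The numbers below are hypothetical and are not data from this study, and the sketch uses ordinary least squares with two predictors as a simple stand-in for the PLS path model.

```python
from math import sqrt

# Hypothetical ratings on a 1-10 scale (NOT data from the study).
driver_a = [2, 4, 5, 6, 8, 9]  # a strong positive driver
driver_b = [3, 4, 6, 6, 8, 8]  # a second driver that tends to move with driver_a
# Performance is constructed as 1.0 * driver_a - 0.3 * driver_b, so the
# partial effect of driver_b is negative by design.
performance = [a - 0.3 * b for a, b in zip(driver_a, driver_b)]


def centered_cross(xs, ys):
    """Sum of cross-products of deviations from the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def pearson(xs, ys):
    """Pearson correlation: one driver and performance, in isolation."""
    return centered_cross(xs, ys) / sqrt(
        centered_cross(xs, xs) * centered_cross(ys, ys)
    )


def two_predictor_ols(x1, x2, y):
    """Slopes of y on x1 and x2 estimated jointly via the normal equations."""
    s11, s22 = centered_cross(x1, x1), centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


corr_b = pearson(driver_b, performance)
slope_a, slope_b = two_predictor_ols(driver_a, driver_b, performance)
print(f"correlation of driver_b with performance: {corr_b:+.2f}")
print(f"joint-model slopes: driver_a {slope_a:+.2f}, driver_b {slope_b:+.2f}")
```

Because `driver_b` rises and falls with `driver_a`, its simple correlation with performance is positive, while its slope in the joint model is negative. This is the same pattern reported here for two of the investigator variables.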
Adhering to operational timelines had the greatest impact on study startup performance. The operational timelines that drove this impact on performance were those for the project and operations plans, the study-specific conventions, the forms and documents, and the regulatory submissions. Adhering to the timelines for the feasibility assessment and the IT system setup did not have a significant impact on performance. The factors that are important in adhering to operational timelines are illustrated in Figure 3.
Project manager performance was another substantial and significant predictor of study startup performance. Both the knowledge of the project manager and their interpersonal skills drove this positive impact on performance. Project manager knowledge was assessed by asking respondents their perceptions about the project manager’s general knowledge of conducting clinical trials (coeff= 10.9, sig= 2.07, p= .04), specific knowledge about the details of this trial (coeff= 15.1, sig= 2.14, p= .04), and their knowledge of GCP and regulations (coeff= 10.8, sig= 1.96, p= .05). Project manager interpersonal skills were assessed by their ability to provide timely and effective communication (coeff= 16.9, sig= 2.53, p= .02), collaboration skills (coeff= 19.0, sig= 2.69, p= .01), proactive problem solving (coeff= 21.3, sig= 2.77, p= .01), and ability to recommend effective solutions for this trial (coeff= 20.5, sig= 2.66, p= .01).
Aligning personnel to what was promised in the proposal (coeff= .24, sig= 2.05, p= .04) and adhering to the timeline for recruiting investigators (coeff= .22, sig= 2.59, p= .01) were both significant and positive predictors of study startup performance. We originally modeled all of the investigator variables (identifying, recruiting, and meetings) as a single (formative) latent variable. Instead, we found that each of these variables exerted independent effects on startup performance that were best modeled independently.
Negative Effects on Study Startup Performance
To our surprise (and presumably that of the experienced managers who suggested these variables), the other investigator variables had negative influences on study startup performance. The ability to identify qualified investigators (coeff= -.18, sig= 2.35, p= .02) had a negative and significant effect on study startup performance. This means that for every 1-unit (or 10%) increase in the CRO’s ability to identify qualified investigators, study startup performance actually declined by about 2%. Our model does not identify a reason for this unexpected finding, but we speculate that an overdeveloped capacity for identifying investigators might inhibit a more collaborative search that uses the resources of both sponsor and vendor.
The negative effect of investigator meetings (coeff= -.16, sig= 2.13, p= .04) is perhaps easier to explain. For clinically active investigators, the meeting is likely to be perceived as intrusive and comes with additional administrative burdens. Negative perceptions increase as more time and effort are devoted to these investigator meetings. Finally, CRA performance had no significant effect on startup performance.
While everyone wants a positive start to a clinical trial, study startup has the worst performance of any stage of a trial. In this study, we have found that it is possible to separate the startup drivers that are positively associated with study startup performance (personnel aligned with proposal, operational timelines, adhering to timelines for recruiting investigators, and project manager performance) from those that have no effect (CRA performance) or negative effects on performance (ability to identify qualified investigators and investigator meetings).
We are not arguing that clinical study teams should not try to identify qualified investigators or hold investigator meetings. These activities have to be done, and both our manager interviews and the positive correlations confirm that they are good things. When it comes to improving startup performance, however, it is the four positive variables listed above that will move startup performance. This finding is important because executives in the real world must make decisions at the margin. In a world with limited time and resources available to monitor clinical trials, they must decide which variables they should track in order to know how their clinical study team is performing or how a vendor is performing on a contract.
Is there really a need for additional metrics, like those described in this study, when dozens of operational metrics are already in common use? We believe that the metrics we describe in this paper are a necessary complement to operational metrics, which often lack validity. Imagine that a clinical study manager finds that it took 42 days to recruit investigators. Does that mean that the clinical study team is performing well or performing poorly? The answer is that ‘it depends’ on the particular type of study, the panel of available investigators, the presence of other large trials in that area, etc. In isolation, it is hard to know what a number like ‘42’ means. Even if you have benchmarks, they will only tell you about average performance in similar trials and not whether those were high-performing trials. In summary, it is critical to have performance metrics like those described in this study in order to understand the meaning of operational metrics.
For CROs, who must decide where to spend the next dollar to improve performance, the results of this study allow them to prioritize their investments where they will have the maximal effects (aligning personnel to proposal, operational timelines, project manager performance, and adhering to recruiting timelines) and avoid investments that will have insignificant (CRA performance) or negative effects (ability to identify qualified investigators and investigator meetings). Based on the results of this study, these performance investments can even be fine-tuned to maximize the return. For example, improving operational timelines (coeff= .45) by 10% will have about twice the effect on performance of improving the timelines for recruiting investigators (coeff= .22) by the same amount.
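The comparison between drivers can be reproduced with a few lines of arithmetic. The coefficients below are the path coefficients quoted in this article; drivers for which no overall coefficient is quoted (project manager performance and CRA performance) are left out, and the dictionary and variable names are illustrative.

```python
# Path coefficients quoted in the article (effect on study startup
# performance of a 1-unit change in each driver).
coefficients = {
    "operational timelines": 0.45,
    "personnel aligned with proposal": 0.24,
    "recruiting timelines": 0.22,
    "investigator meetings": -0.16,
    "identifying qualified investigators": -0.18,
}

# Rank drivers from largest positive to most negative effect.
ranked = sorted(coefficients.items(), key=lambda item: item[1], reverse=True)
for name, coeff in ranked:
    print(f"{name:38s} {coeff:+.2f}")

# Relative effect of the same-sized improvement in two drivers.
ratio = coefficients["operational timelines"] / coefficients["recruiting timelines"]
print(f"operational vs. recruiting timelines: {ratio:.2f}x")
```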
In the future, regulators will want to know that sponsors have some rational monitoring function in place.5 The results of this research provide a scientific and validated approach to monitoring study startup.
Michael J Howley PA-C, PhD, Associate Clinical Professor, LeBow College of Business, Drexel University, 3141 Chestnut St., Philadelphia, PA 19104
Peter Malamis MBA, CEO, CRO Analytics, LLC, 6139 Stoney Hill Road, New Hope, PA 18938