A model to identify the services and resources sites need to conduct high-quality clinical trials.
Site perspectives of clinical trial quality are rarely heard, even though they are the foundation of the clinical research enterprise. But sites are not alone in conducting trials. They depend on sponsors and contract research organizations (CROs) to provide services and resources that allow them to execute a trial. The critical question from the site’s perspective then becomes: What are the essential resources and services that sites need from sponsors and CROs in order to conduct high-quality trials? The purpose of this paper is to identify such services and resources that are associated with high-quality clinical trials.
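This study focuses on three research questions (RQs):
RQ1: What are the coproduction activities that sponsors and CROs provide to sites that lead to high quality trials?
RQ2: How do sites evaluate sponsors and CROs on these key drivers of quality?
RQ3: How does each of these coproduction activities drive clinical trial quality?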
To answer these questions, we conducted focus groups to identify all the activities that are important to sites (RQ1). To evaluate RQ2 and RQ3, we loaded all of these activities, as well as a measure of clinical trial quality, into an online survey. We then solicited Association of Clinical Research Professionals (ACRP) members to evaluate a recent trial as to the quality of the study and then how CROs and sponsors were performing on these key drivers of clinical trial quality. Finally, we created a statistical model to give the appropriate weight to each of these quality drivers.
Trials in which the site had a direct relationship with the sponsor were perceived as being of higher quality than trials in which the site relied on a CRO. The most critical driver of trial quality from the sites’ perspective was communication: being available for questions, responding in a timely manner, and being helpful in resolving problems. While sponsors and CROs are doing reasonably well in this area, improvements in communication will yield the greatest gains in clinical trial quality. Other significant drivers of quality from the sites’ perspective included the quality of the protocol, budgeting processes, technology, and monitor performance.
The results of this study illustrate a pathway by which sites, CROs, and sponsors can improve the quality of clinical studies. These findings can also provide a starting point for achieving an ROI on quality investments. Perhaps, most importantly, the results of this study give clinical trial sites a greater opportunity to provide insights into quality improvement methods for the clinical research enterprise.
The mission: Improve site support
Clinical trial sites are the underappreciated foundation of the medical research enterprise.1 Despite this importance, sites struggle to execute studies effectively and efficiently.2 Sites depend on sponsors and CROs to support them by providing services and resources that allow them to execute the study plan. When that support is lacking, the result can be trial delays, increased costs, and considerable site turnover.3 As a result, investigative sites are becoming a scarce resource that limits the ability to conduct clinical research.1
The purpose of this paper is to address this situation by identifying the things that sponsors and CROs do that allow sites to execute high-quality clinical trials. Clinical studies depend on multiple stakeholders, often broken into separate organizations, working together effectively to coproduce a high-quality research program. Coproduction refers to the phenomenon by which multiple actors must apply their knowledge, skills, services, and resources to cocreate a complex service.4 While the site may interact with the patient, delivering a high-quality clinical trial depends on sponsors, CROs, and sites working collaboratively.
But there are potentially thousands of coproduction activities that could impact trial quality. What are the specific drivers that lead to higher quality clinical studies? We distill all of these issues into the three RQs mentioned.
RQ1: What are the coproduction activities that sponsors and CROs provide to sites that lead to high quality trials?
To examine RQ1, we conducted two focus groups of 12 to 15 study coordinators each at the 2015 ACRP national conference in Salt Lake City, Utah. We had no difficulty recruiting potential subjects; clinical research professionals were eager to participate. Several of the participants told us that they were delighted they could contribute “so our voices could finally be heard.”
Our approach to these discussions was inductive. The participants all had about 10 years or more of trial experience, so their observations were naturally granular. Each of the sessions was opened by asking, “What are the things (or activities) that sponsors/CROs do that help you execute a high quality study?” This led to a long list of performance activities. The process was iterative as participants clarified and built on each other’s observations. Once we had collated this list of performance activities, we worked with the focus groups to organize all of them into eight distinct groupings: communication, protocol, budgeting, reimbursement, technology, monitors, site initiation, and closeout.
We tested these groupings with focus group members by challenging, for example, whether budgeting and reimbursement were really different groups.5 Once we confirmed that these were distinct groups and aligned the items within each one, we arranged the participants into subgroups of about three people each to organize the items. At the end of the session, each group presented their refinements and the larger group commented on their work.
Representative items for each category, framed in survey format, are shown in Table 1. We also included a set of questions to assess the quality of the clinical trial from the site’s perspective; these served as the dependent variable, are shown at the bottom of Table 1, and were derived from the SERVQUAL measure.6,7 All of these items were edited for clarity, combined with similar items when appropriate, and loaded into the CRO Analytics’ Performer platform in preparation for a survey.
RQ2: How do sites evaluate sponsors and CROs on these key drivers of quality?
We solicited responses from the ACRP membership, excluding those members who participated in the focus groups. Respondents logged on to the Performer tool. They were instructed to think about a Phase II or III trial that completed recently and then to evaluate the items with that study in mind. We also asked them about characteristics of the trial and demographic items.
We received a total of 278 responses from experienced research professionals. The respondents had an average of 10.7 years’ experience. About 94% of respondents characterized themselves as clinical trial coordinators or clinical research managers, with 3% describing themselves as principal investigators and 3% using a variety of other job descriptions (e.g., consultant). The trials involved various therapeutic areas. Cardiology only made up 8% and oncology 3% of the trials. Most of the studies were Phase III (75%); the rest (25%) were Phase II. The average trial consisted of 70 subjects (sd= 495), with a long right tail. Eighty percent of the studies in this sample met their enrollment goals, higher than typically reported.8
How did sites evaluate the overall quality and performance in the coproduction activities? The average overall perceived quality of the clinical trials was 7.1 (sd= 2.17) on a scale of 1 to 10 (1 = low quality; 10 = high quality). This average is high compared to similar studies we have conducted on clinical trial quality. Figure 1 below illustrates the overall quality ratings with the ratings for each of the performance areas. This graph is constructed so that quality is the first column on the left and then performance areas are ordered from least to greatest (left to right). Each of the performance areas demonstrated discriminant validity from the others, meaning that we demonstrated statistically that budgeting was distinct from reimbursement and all the other groups.
It is not surprising, given our focus group discussions, that budgeting, reimbursement, and monitors were the lower-ranking performance areas and that communication (µ= 7.2) and protocols (µ= 7.7) were rated more highly.

Given these averages, we next sought to identify the amount of variation around each of them. Are these ratings consistent, or is there wide variation around the average? Greater variation means that there is less consistency or agreement about the quality and performance ratings. We screened for variation by looking at the standard deviations. A high standard deviation indicates substantial variation in the responses around the average. Figure 2 below illustrates the findings on variation and standard deviations. Items with the highest standard deviations have the most variation (or least consensus) around the average (i.e., they are likely to have subgroups of very low or very high scores). Items with lower standard deviations have less variation (greater consensus). Communication (µ= 7.2) has a relatively high rating, but there was also a lot of underlying variation (sd= 2.53) around that rating. This suggests that while many sites are rating communication very highly, there is a lack of consensus about this performance metric.
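The screening step described above can be sketched in a few lines of Python. The ratings below are invented for illustration and are not the study's data; the area names are hypothetical as well. The sketch only shows that two performance areas can share the same mean while differing sharply in spread, which is the pattern a high standard deviation is meant to flag.

```python
from statistics import mean, stdev

# Hypothetical 1-to-10 ratings (NOT the study's data), chosen so that both
# areas have the same mean but very different spread.
ratings = {
    "area_with_consensus": [7, 7, 8, 7, 7, 8, 7, 7],
    "area_without_consensus": [10, 3, 10, 4, 10, 3, 10, 8],
}


def summarize(scores):
    """Return (mean, sample standard deviation) for a list of ratings."""
    return mean(scores), stdev(scores)


summary = {area: summarize(scores) for area, scores in ratings.items()}

for area, (avg, sd) in summary.items():
    print(f"{area}: mean={avg:.2f}, sd={sd:.2f}")
```

Both hypothetical areas average 7.25, but the second has a much larger standard deviation because its ratings cluster at the extremes rather than near the mean.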
The findings regarding the protocols were more complex. In examining the means and variation of the ratings of the protocol sub-drivers, we found that some of the important ones (complexity of the protocols and the frequency of amendments) had low ratings with high variation, or a lack of consensus. This is a paradoxical finding: even though the sites rated the overall quality of the protocols highly, they thought the protocols were exceedingly complex, with excessive amendments. We also found that the correlations between overall protocol quality and complexity (r= .05) and between overall protocol quality and amendments (r= .26) were very low.
There is a disconnect between how sites evaluate the quality of a protocol and its complexity and amendments. We interpret these findings as an example of lowered expectations. It may be that sites have simply accepted the complexity of protocols and frequent amendments as a norm, so it does not impact their ratings of the overall quality of the protocol (see Figure 3).
Finally, we found that sponsor companies outperform CROs in overall quality and many performance areas, as shown in Figure 4 below. Labels that have a single asterisk (*) were found to be statistically significant at p= .05 in a multivariate ANOVA (analysis of variance) and the labels with a double asterisk (**) were significant at p< .001. We see two possible explanations here based on our focus group discussions. It may be that CROs are conducting a wide variety of studies with many different sites, so that an individual site does not feel it is getting support or attention from the CRO. It may also be that the CRO is perceived as a “middle-man” that blocks or delays the site from getting the information or support it needs to execute the study. Further research is needed on this question.
RQ3: How does each of these coproduction activities drive clinical trial quality?
To this point, we have assumed that each of the performance drivers has an equal impact on quality. To drill down further and weigh the relationships between overall quality and the eight performance areas, we created a multivariate linear regression model. The sites’ perception of the quality of the trial was the dependent variable and the performance areas were the independent variables. We also included in the model a set of covariates, that is, variables that might influence the relationship between trial quality and performance. The covariates were the size of the trial (number of subjects); whether it met its enrollment goals and endpoints; the complexity of the protocol; the appropriateness of protocol amendments; whether the site’s contract was with a sponsor or CRO; and the respondent’s length of experience in clinical trials.
The regression model explained a statistically significant (F(15,244)= 48.5, p< .001) amount of variance (R²= .75) in clinical trial quality. The model met all of the regression assumptions, and the variables were centered to aid interpretation. The coefficients with their statistical significance tests are shown in Table 2.
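As a rough consistency check on the reported fit, the overall F statistic of a linear regression can be recovered from R² and the two degrees of freedom. The sketch below uses only the figures quoted in the text; the function name is ours, and because R² is reported to two decimals the result is approximate.

```python
def f_from_r_squared(r_squared, n_predictors, residual_df):
    """Overall F statistic implied by R^2 and the model's degrees of freedom."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / residual_df
    return explained / unexplained


# Values quoted in the text: R^2 = .75 with F(15, 244).
f_value = f_from_r_squared(0.75, 15, 244)
print(round(f_value, 1))
```

The implied value is about 48.8, close to the reported 48.5. A gap of that size is what one would expect if the unrounded R² were slightly below .75.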
We compare the magnitude of the coefficients graphically in Figure 5 below. In interpreting the coefficients, each of the variables was mean-centered before running the model. Each coefficient gives an estimate of the unique impact of that variable on trial quality at the mean of all the other variables. Also, the variables for total subjects and years’ experience were skewed to the right and, thus, were log-transformed.
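The two preprocessing steps mentioned above, log-transforming a right-skewed variable and mean-centering it, can be illustrated with a deliberately simplified one-predictor sketch. The data are invented and are not the study's, the helper names are ours, and the study's actual model was multivariate. The point is only to show why centering helps interpretation: once the predictor is centered, the intercept equals the average quality rating.

```python
import math
from statistics import mean


def center(values):
    """Subtract the mean so the variable has mean zero."""
    avg = mean(values)
    return [v - avg for v in values]


def ols_line(x, y):
    """Slope and intercept of a one-predictor least-squares fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data (NOT the study's): enrolled subjects and quality ratings.
subjects = [12, 20, 35, 60, 150, 900]
quality = [6.0, 7.0, 6.5, 7.5, 8.0, 8.5]

log_subjects = [math.log(s) for s in subjects]  # compress the long right tail
centered = center(log_subjects)                 # then mean-center

slope, intercept = ols_line(centered, quality)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

With the predictor centered, the fitted intercept is the predicted quality for a trial at the mean of the (log) number of subjects, which here coincides with the mean of the quality ratings.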
The impact of the performance drivers on clinical trial quality falls into three distinct tiers. In the top, most impactful tier, communication (b= .29, p< .001) and protocol quality (b= .24, p< .001) had the greatest effects on quality in this study. Communication in common usage is usually a very broad term, but the results from our focus groups offer drill-down insights. Sites mean very specific things when they speak of communication. When a study team is communicating well, it provides information in a timely fashion; is available for questions as they arise; and is available and helpful in resolving problems. The impact of protocol quality is particularly impressive because we included protocol complexity and amendments as covariates in the model. Protocol quality has an impact on trial quality above and beyond complexity and amendments.
In the middle tier of performance drivers, budgeting (b= .16, p< .001) and technology (b= .15, p< .001) had statistically significant and positive, although more moderate, impacts on trial quality. Budgeting refers to the process of establishing the resources for the site’s services. Within this category, fairness was the dominant theme. The recurrent theme that we heard in the focus groups was that site coordinators simply wanted to be paid for the requested work. They become very frustrated when the site must absorb the cost of adjusting to protocol amendments, training new monitors, handling serious adverse events (SAEs), or just budgeting extra time for the coordinator to do the work demanded by the protocol. Notice that this budgeting driver is distinct from the actual reimbursement, which was not a significant driver.
Within the technology area, sites often struggle with multiple clunky software systems that are not integrated. On average, attendees at the focus groups had to log in separately to six or seven technologies for each trial. Sites want seamless, integrated trial technologies, and the model suggests that improving technology will have a significant, though moderate, impact on perceived clinical trial quality.
The performance of the monitors (b= .10, p= .02) had a slightly less substantial, but still significant, impact on clinical trial quality. Sites said in the focus groups that they want monitors who can serve as a resource to help them execute trials better and more efficiently. Inexperienced monitors who don’t understand the protocol, who must be trained by the site, and who are disruptive to site operations are an ongoing frustration for sites.
Site initiation (b= .06, p= ns), closeout (b= .05, p= ns), and reimbursement (b= -.05, p= ns) were not significant performance drivers of quality in this study. Even though reimbursement has a negative coefficient, it is not statistically different from zero. In understanding these results, remember that this model estimates the unique effects of each performance driver, exclusive of all the other performance drivers. So if it seems like initiation (or closeout or reimbursement) should have been significant, remember that we are estimating the isolated effects of initiation on quality, excluding the effects of communication, the protocol, budgeting, technology, and monitors.
Sites are the foundation of the clinical research enterprise, but they have surprisingly little input in the development or planning of trials. The purpose of this research was to provide sites the ability to give their perspective of how we can improve the quality of clinical trials. The recurrent theme that we find in both the focus groups and statistical analysis is that sites are looking for partners who can help them serve their patients and conduct excellent science. Their common experience is that they often struggle to be treated fairly.
The results of this research may not surprise those who work with sites. This data has been anecdotally available for some time. The real contribution of this study is not only in the systematic collection of all of the performance drivers, but, more importantly, in giving weights to each of these performance drivers. In doing so, we are able to think about an ROI on site relations.
Suppose you are a sponsor or a CRO that wants (or needs) to improve site relations. The results of this study provide a way for sponsors and CROs to assess where they stand compared to the rest of the industry. Companies can send out a survey and compare their results to the data illustrated in this report. Based on the results of their survey, companies can think about the ROI on investing to improve their scores using the results of the regression model.
Suppose a company’s survey shows that it rated a 6.5 on a 1-to-10 scale for communication and a 7.0 on overall quality. That is not a bad score, but our sample rated an average of 7.2 for communication, so the organization has room to improve in this area. Is it worth investing to improve communication? To assess this question, examine the results of the regression. Communication had a coefficient of .29. That means that if a sponsor or CRO invests in communication and is able to raise its communication rating by one point, from 6.5 to 7.5, the model predicts that its quality score would improve from 7.0 to 7.29, holding the other drivers constant. What specifically does one look at to improve communication? Examples of the sub-drivers are found in Table 1.
Now imagine that a company’s survey shows a 5.5 for closeout and it is compared to our findings of 7.4. The rating gap is fairly significant. Should a company, therefore, invest to improve its closeout ratings? The results of our regression model would suggest no. The closeout driver had no significant effect on quality, so the results of this study would suggest an organization’s investment would be wasted. The firm certainly might make some non-monetary investments using guidance from Table 1, but it would not improve quality by investing heavily in this area.
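The two scenarios above can be written as a small calculation. This is a minimal sketch, not a tool from the study. The coefficients and significance flags are the ones reported in the text; the function name, the choice to treat non-significant coefficients as zero, and the 7.0 starting quality used for the closeout case are assumptions made for illustration.

```python
# Coefficients and significance as reported in the text for RQ3.
DRIVERS = {
    "communication": (0.29, True),
    "protocol": (0.24, True),
    "budgeting": (0.16, True),
    "technology": (0.15, True),
    "monitors": (0.10, True),
    "initiation": (0.06, False),
    "closeout": (0.05, False),
    "reimbursement": (-0.05, False),
}


def projected_quality(current_quality, driver, current_rating, target_rating):
    """Project overall quality after moving one driver's rating.

    Assumptions of this sketch: non-significant coefficients are treated as
    zero, and all other drivers are held constant.
    """
    coefficient, significant = DRIVERS[driver]
    if not significant:
        return current_quality
    return current_quality + coefficient * (target_rating - current_rating)


# Communication: raise the rating from 6.5 to 7.5, starting at quality 7.0.
communication_case = projected_quality(7.0, "communication", 6.5, 7.5)

# Closeout: close the gap from 5.5 to 7.4; the 7.0 starting quality is assumed.
closeout_case = projected_quality(7.0, "closeout", 5.5, 7.4)

print(round(communication_case, 2), round(closeout_case, 2))
```

Under these assumptions, the one-point communication improvement moves projected quality from 7.0 to 7.29, while closing the much larger closeout gap leaves projected quality unchanged.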
Several limitations should be kept in mind in evaluating the results of this study. First, some might argue that the ACRP membership is not a representative sample. Someone who belongs to a professional organization like ACRP, attends the national conference, and even volunteers for a focus group is likely to be more engaged in their professional work than the average site coordinator. While this is possible, we would point out that these respondents are also likely to be key opinion leaders within the profession, the very group that sponsors and CROs need to reach in order to improve clinical trial quality.
We would also note that the model estimates are general across a variety of therapeutic areas and phases of clinical trials. Although we would argue that these estimates would generalize across clinical studies, estimates may vary around our means within specialty areas.
Michael J Howley, PA-C, PhD, is Clinical Professor, LeBow College of Business, Drexel University, email: firstname.lastname@example.org; Peter Malamis is CEO, CRO Analytics, LLC, email: email@example.com; Jim Kremidas is Executive Director, Association of Clinical Research Professionals (ACRP), email: JKremidas@acrpnet.org
* Acknowledgement: The authors wish to thank Jenna Rouse, Director of Business Development, ACRP, for her assistance in conducting this study.
Endnotes and References
1. Califf, Robert M. “Clinical Research Sites - the Underappreciated Component of the Clinical Research System.” Journal of the American Medical Association, (2009), 302(18), 2025-2027.
2. Rhymaun, Farouk. “Sponsor-Investigator Relationships: A Crisis of Trust.” Applied Clinical Trials; published September 23, 2014. Available online at http://www.appliedclinicaltrialsonline.com/sponsor-investigator-relationships-crisis-trust-life-sciences.
3. Getz, Kenneth A. “Lifting Up a Fragmented Study Conduct Landscape.” Applied Clinical Trials, (2013), 22(7/8), 22-24.
4. Scholars recognize coproduction as being pervasive across all service industries. For a fuller explanation, see Zeithaml, Valerie, Mary Jo Bitner, and Dwayne Gremler, Services Marketing, 7th Ed. (2018) McGraw-Hill, p. 351.
5. We later statistically confirmed the discriminant validity of these groupings.
6. Parasuraman, Anantharanthan, Valarie A. Zeithaml, and Leonard L. Berry. “SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions.” Journal of Retailing, (1988), 64(1), 12-17.
7. Parasuraman, Anantharanthan, Leonard L. Berry, and Valarie A. Zeithaml. “Refinement and Reassessment of the SERVQUAL Scale.” Journal of Retailing, (1991), 67(4), 420-426.
8. Getz, Kenneth A. “Enrollment Performance: Weighing the ‘Facts.’” Applied Clinical Trials, (2012), 21(5). Available at http://www.appliedclinicaltrialsonline.com/enrollment-performance-weighing-facts.