Quality Metrics for Clinical Trials

July 8, 2015

A case study by Pfizer & The Avoca Group on the development of an overarching strategy for Pfizer’s clinical trial quality metrics.

Finding ways to effectively and efficiently manage quality is one of the central issues faced by clinical development teams today. A great deal of effort is devoted to associated processes, systems, and metric collection and analysis. However, the industry still struggles when it comes to measuring quality, making it difficult not only to ascertain whether adequate quality is being achieved, but also to verify that processes designed to promote quality are doing so. This article presents a case study in which Pfizer and The Avoca Group collaborated on an in-depth review of Pfizer’s quality metrics and related procedures, with the goal of implementing a framework for the oversight and analysis of clinical trial quality.

At the outset of the project, Pfizer’s strengths in this area included: a comprehensive set of existing metrics; an infrastructure for metric aggregation; defined responsibilities for process-oriented metrics; strong senior management support for the idea of a more efficient approach; and a high level of motivation throughout the organization. Issues to be dealt with as part of the consulting project included the following:

  • The existing catalog of “quality metrics” focused on process/“compliance” metrics (“was a task performed?”), with relatively little representation of “outcome” metrics (“did we achieve adequate quality?”)

  • There was no clear definition of “quality metrics” that differentiated these from “operational metrics.”  

  • The metrics were not organized into a conceptual framework that provided an understanding of the purpose of each metric, the expected relationships between metrics, and the level(s) at which each metric should be reviewed and actioned, nor did the process for introduction of a new metric involve a systematic evaluation of any of these factors. 

  • Numerous, and often unsuitable, metrics were being reviewed by too many parties, resulting in a burdensome and inefficient review process. For example, since most metrics were presented only at the study level, reviewers beyond the study management level (e.g., program and resource management) found it challenging to access metric data in a manner that directly spoke to their accountabilities. 

  • Methods for continuous data analysis that would identify meaningful trends in metrics, as well as relationships between metrics, were underutilized. For example, there was no systematic approach to examining either the fit-for-purpose of the metrics themselves (do they predict what they should?) or the rationale for action thresholds.

  • Given Pfizer’s large quantity of metric data, the underdeveloped analytic strategy represented substantial untapped potential for predictive modeling and ultimately, for process/resource optimization strategies.

To address these issues, the Avoca-Pfizer team developed an overarching strategy for Pfizer’s quality metrics, the principal components of which are discussed in this article. An implementation plan and technology assessment related to that strategy were also developed but are not discussed here.

Framework for organization of quality metrics

To meet Pfizer’s needs, The Avoca Group sought to develop a framework for organizing Pfizer’s quality metrics that was grounded in the definition of quality itself. This framework was intended to define a core set of quality outcomes that required management, and then to classify Pfizer’s metrics such that each bore a defined relationship to one or more of these outcomes. Classification of metrics into this framework was intended to:

  • Ensure that Pfizer’s suite of quality metrics contained the most important things that needed to be measured

  • Expose excess or poorly designed metrics (i.e., those that weren’t truly quality metrics but rather operational metrics or that weren’t designed in a fit-for-purpose manner)

  • Promote the development of more targeted and rational review procedures, with clear drill-up (to “upstream” metrics) or drill-down (to “downstream” metrics) cascades for analysis of status/issues and for proactive (e.g., risk) management.

An added benefit was that the framework would also provide a structure for organizing other (non-metric) quality-related findings (e.g., textual information about SOP deviations, audit findings, etc.), thus allowing various sources of information to be logically assembled so as to promote comprehensive evaluation of quality management findings.

The first task in developing this framework was to define quality metrics in a way that distinguished these from other types of metrics (e.g., operational), and thus set bounds around the framework and the catalog. Within the industry, there exists no standard in this respect; quality metrics may be variously defined as those that measure: performance to a quality standard (outcomes); activities that influence or predict quality (generally operational activities); quality assurance operational activities; and/or compliance with quality-related regulations. To ensure a clear distinction between quality and operational metrics, Pfizer elected to define the term quality metrics relatively restrictively, to include only those that: (1) measured quality outcomes, or (2) had been shown statistically to influence quality outcomes. 

Figure 1. Top-Level Quality Outcome Categories

To develop the framework, The Avoca Group began with the CTTI definition of quality: “The ability to effectively and efficiently answer the intended question about the benefits and risks of a medical product or procedure while assuring patient safety and protection of human subjects” (definition from October 2008 presentation on CTTI by Dr. Rachel Behrman, CTTI co-chair and then associate commissioner for clinical programs, FDA). From this definition, seven fundamental categories of quality outcomes were derived, as depicted in Figure 1 (above). The top four categories in this figure represent the key components of any quality scientific experiment, and the two just below these represent ethical requirements associated with performing experiments on human subjects. Finally, when experimental results will be submitted in support of marketing applications, regulatory authorities must be satisfied that the activities leading to quality in these six outcome areas have been effectively managed, and thus require that certain types of documentation be maintained and/or submitted. The final category therefore reflects the need for compliance with regulatory requirements in this respect.

Table 1. Outcome Components for Each Major Quality Outcome Category

Each of these major categories was then broken down into a set of specific, actionable outcome components, as described in Table 1 (above). The seven major outcome categories and their associated components thus defined the complete top level of the framework, in the sense that all of Pfizer’s quality metrics must be in some way related to the achievement of adequate standards for one or more of these outcomes.

With respect to their purposes in this area, quality metrics were then defined as belonging to three different types: 

  • Outcome metrics directly answer the question of “Did we achieve the level of quality required by our standards?” 

  • Predictor metrics answer questions of “How are we progressing in our attempt to meet our quality standards?” Often, predictor metrics will be “fractional values” of an outcome metric (e.g., percent of subjects with protocol violations in the study “so far”). Otherwise, for a metric to qualify as a predictor, a statistical relationship must be established between its value measured before or during a study and a quality outcome at the end of the study.

  • Contributor metrics measure conditions and activities that are logically expected to contribute to quality outcomes, but that do not have a strong enough statistical relationship to qualify as predictors. These metrics are most commonly process-related, addressing questions of, “Are we doing what we believe we need to do to achieve adequate quality?” Note that contributor metrics are not considered quality metrics according to Pfizer’s definition, but rather operational metrics with defined links to quality metrics. We nevertheless mention them here because contributor metrics will often be reviewed when individuals involved in quality management are attempting to identify the root causes of quality issues, or when they are managing quality risk. When predictor or outcome metrics are not meeting expectations, such individuals will drill down to the logically related set of contributor metrics in an attempt to identify specific inadequately performing process and/or resource areas that may be contributing to failure at the outcome/predictor level.
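
The three definitions above amount to a simple decision rule, which can be sketched in Python as follows. This is only an illustration of the logic as described in this article, not Pfizer’s actual tooling; the class names, field names, and example metric names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MetricType(Enum):
    OUTCOME = "outcome"
    PREDICTOR = "predictor"
    CONTRIBUTOR = "contributor"


@dataclass(frozen=True)
class MetricProfile:
    """Hypothetical description of a metric, for illustration only."""
    name: str
    measures_outcome: bool          # directly answers "did we achieve the required quality?"
    interim_value_of_outcome: bool  # a "so far" (fractional) value of an outcome metric
    statistically_predictive: bool  # relationship to an end-of-study outcome has been shown


def classify(metric: MetricProfile) -> MetricType:
    """Apply the three definitions in order: outcome, predictor, contributor."""
    if metric.measures_outcome:
        return MetricType.OUTCOME
    if metric.interim_value_of_outcome or metric.statistically_predictive:
        return MetricType.PREDICTOR
    return MetricType.CONTRIBUTOR


def is_quality_metric(metric: MetricProfile) -> bool:
    """Under the restrictive definition, contributors are operational metrics."""
    return classify(metric) is not MetricType.CONTRIBUTOR


# Hypothetical example metrics (names invented for this sketch).
final_rate = MetricProfile("final_protocol_violation_rate", True, False, False)
rate_to_date = MetricProfile("protocol_violation_rate_to_date", False, True, False)
visits_on_schedule = MetricProfile("monitoring_visits_on_schedule", False, False, False)

for m in (final_rate, rate_to_date, visits_on_schedule):
    print(m.name, classify(m).value, is_quality_metric(m))
```

The ordering of the checks matters: a metric that directly measures an outcome is classified as an outcome metric even if it also happens to predict something else.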

Therefore, for each of the major outcome areas there are three discrete levels of metrics, allowing each metric to be assigned to a cell in a network such as that depicted in Table 2 (below).

Table 2. Structure of Quality Metric Taxonomy
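
One way to picture this structure is as a lookup keyed by outcome category and metric level, with drill-up and drill-down defined as moves between adjacent levels within a category. The Python sketch below is a minimal illustration under that assumption; the category and metric names are placeholders, since the actual entries of Figure 1 and Table 2 are not reproduced in the text, and the class and method names are hypothetical.

```python
from collections import defaultdict

# Levels ordered from "upstream" (outcome) to "downstream" (contributor).
LEVELS = ("outcome", "predictor", "contributor")


class Taxonomy:
    """Grid of cells: (outcome category, level) -> list of metric names."""

    def __init__(self):
        self._cells = defaultdict(list)

    def assign(self, category, level, metric):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self._cells[(category, level)].append(metric)

    def cell(self, category, level):
        return list(self._cells.get((category, level), []))

    def drill_down(self, category, level):
        """Metrics one level downstream in the same category."""
        i = LEVELS.index(level)
        return self.cell(category, LEVELS[i + 1]) if i + 1 < len(LEVELS) else []

    def drill_up(self, category, level):
        """Metrics one level upstream in the same category."""
        i = LEVELS.index(level)
        return self.cell(category, LEVELS[i - 1]) if i > 0 else []

    def empty_cells(self, categories):
        """Cells with no metric assigned, i.e. candidate gaps in the catalog."""
        return [(c, lvl) for c in categories for lvl in LEVELS
                if not self.cell(c, lvl)]


# Placeholder names only; not taken from Pfizer's catalog.
taxonomy = Taxonomy()
taxonomy.assign("category_a", "outcome", "outcome_metric_1")
taxonomy.assign("category_a", "predictor", "predictor_metric_1")
taxonomy.assign("category_a", "contributor", "contributor_metric_1")
taxonomy.assign("category_a", "contributor", "contributor_metric_2")
taxonomy.assign("category_b", "outcome", "outcome_metric_2")

print(taxonomy.drill_down("category_a", "predictor"))
print(taxonomy.empty_cells(["category_a", "category_b"]))
```

Listing the empty cells is one simple way to spot categories that are underserved by the existing catalog.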

To maintain organizational discipline in adhering to this framework, it was recommended that a metric lifecycle management function be established to oversee organization of Pfizer’s quality metrics into the framework, as well as to oversee the performance of each metric. Specifically, this function would have the following remit:

  • To own the quality metric framework, identifying and approving any proposed changes to the framework itself

  • To approve any proposed new metrics, and to classify these metrics within the taxonomy

  • To approve associated thresholds (e.g., stoplight), review plans, and metric validation plans, in consultation with relevant experts

  • To assess the need for change in or retirement of any metrics

  • To serve as a central body for the coordination of metric-related activities across different parts of the organization, for example, those involved in defining and collecting operational metrics.

Review and analysis of quality metric data

As is the case with any information, the value of quality metric data, no matter how well organized, is limited by the effectiveness and efficiency with which it’s used. Thus, one of Pfizer’s objectives for this project was to improve the effectiveness and efficiency of its metric review process, by focusing each stakeholder on routine review of only a concise, targeted set of quality metrics that would answer the specific quality-related questions pertinent to that person’s accountability. A rationally pre-designed expanded review of additional metrics was then recommended to investigate root causes and/or scopes of impact if emerging issues were identified from the routine review, and/or based on known, a priori risks for a particular study. 

For the purpose of defining the quality-related questions to be answered and, hence, both the routine review metrics and the expanded review metrics, Pfizer’s stakeholders were divided into four basic categories, as depicted in Table 3. This table describes how each of these role categories is defined, as well as the questions that individuals in each role are responsible for answering via quality metrics. Some individuals may have multiple responsibilities and may, therefore, need to consider the questions associated with more than one role in their metric reviews.

Table 3. Categories of Stakeholders, for the Purpose of Metric Review

As can be seen from Table 3, a study director’s review of quality metric data must focus on ensuring, during its course, that his/her study will meet the expectations set by quality standards. Thus, the study-level reviewer will focus his/her routine metric review on predictor metrics for quality outcomes, potentially along with a set of rationally selected contributor metrics that may be dictated by the risk profile for the study (e.g., protocol complexity, region, or outsource partner). If quality problems emerge, a study-level reviewer will then “drill down” to the lower-level (i.e., contributor) metrics that are associated (in the framework) with the problematic predictors, and will examine cuts of the metrics (e.g., by resource), in order to define the scope and isolate the root causes of the issues.

In contrast, higher-level business unit and resource leads will focus on understanding the performance, from a quality perspective, of the domains for which they have accountability. Their routine reviews will thus focus primarily on outcome and predictor metrics for their domains (e.g., % of studies in their domains achieving quality standards), again with access to lower-level (contributor) metrics and to “cuts” of the metrics, in order to evaluate the scopes and root causes of any quality problems noted. Resource-level reviewers will also be interested in contributor (process-level/operational) metrics, in order to evaluate compliance of each resource with quality-promoting process commitments.

Finally, process leads will review quality metrics with an eye to developing quality-promoting processes that can be complied with, and that when complied with, will reliably lead to the achievement of quality outcome standards. Thus, their reviews will focus primarily on understanding the relationships between contributor (process-level) metrics and associated outcome metrics, as well as on assessment of process compliance.
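
The role-based focus described in the preceding paragraphs can be sketched as a small lookup that filters a metric catalog down to each stakeholder’s routine review set. This is one reading of the prose above, not a reproduction of Table 3; the role keys, function name, and catalog entries are assumptions made for illustration.

```python
# Metric levels each role reviews routinely, as read from the text above.
# Role keys are hypothetical labels for the four stakeholder categories.
ROUTINE_LEVELS = {
    "study": {"predictor"},
    "business_unit": {"outcome", "predictor"},
    "resource": {"outcome", "predictor", "contributor"},
    "process": {"outcome", "contributor"},
}


def routine_review(role, catalog):
    """Return the metric names a given role would see in routine review.

    `catalog` is a list of (metric_name, level) pairs.
    """
    levels = ROUTINE_LEVELS[role]
    return [name for name, level in catalog if level in levels]


# Placeholder catalog; names are invented for this sketch.
catalog = [
    ("outcome_metric_1", "outcome"),
    ("predictor_metric_1", "predictor"),
    ("contributor_metric_1", "contributor"),
]

for role in ROUTINE_LEVELS:
    print(role, routine_review(role, catalog))
```

Study-level reviewers would add selected contributor metrics to this set according to the study’s risk profile, and any role can still reach the remaining levels through an expanded review.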

While routine and expanded reviews by stakeholders will serve to identify and direct action on immediate issues, a higher level of analysis of quality metric data may be applied in order to achieve broader and more ambitious corporate goals, particularly when significant quantities of metric data are available. One of the team’s objectives for this project was to define additional ways that Pfizer could use its metric data to these ends. Specific analytical techniques were recommended to serve four major purposes:

  • Multivariate analyses to identify complex patterns across resource groups and/or business units (i.e., beyond what could be identified through simple cuts of the data), to support root cause analysis of ongoing problems

  • Regression analyses to assess the efficacy of specific remediating actions in improving quality, and reciprocally, to investigate potential negative collateral effects of operational or organizational changes. For example, such analyses could be used to look broadly across studies/circumstances to determine whether a given type of remediating action is generally effective in improving quality, and even to get a sense for how long improvement generally takes to be realized.

  • Regression and other relational analyses to verify that each metric achieves its intended purpose (i.e., that a predictor metric actually has downstream predictive power), and to establish rational thresholds for action (i.e., red/yellow/green lights), by understanding the nature and shape of the relationships between predictor and outcome metrics

  • Multivariate analyses to characterize the relationships between baseline characteristics of a study (e.g., protocol complexity, region, outsourcing model), predictor metrics, other (e.g., operational, cost) metrics, and quality outcomes, in order to permit predictive modeling and optimization of oversight activities, resource allocation, processes, etc., for each circumstance.
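
As a rough illustration of the third purpose above (verifying that a predictor predicts, and deriving a rational action threshold), the Python sketch below fits a simple least-squares line between a predictor and its outcome across past studies, reports the correlation, and inverts the fitted line to find the predictor value at which the expected outcome reaches a quality limit. It is a minimal sketch only: the article does not specify the models Pfizer used, the numbers are made up, and the yellow band (80% of the red threshold) is an arbitrary assumption.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus Pearson r."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in one of the samples")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def action_threshold(slope, intercept, outcome_limit):
    """Predictor value at which the fitted outcome reaches outcome_limit."""
    if slope == 0:
        raise ValueError("predictor has no fitted effect on the outcome")
    return (outcome_limit - intercept) / slope


def stoplight(value, red_at, yellow_fraction=0.8):
    """Assumes higher predictor values are worse; yellow band is arbitrary."""
    if value >= red_at:
        return "red"
    if value >= yellow_fraction * red_at:
        return "yellow"
    return "green"


# Made-up numbers for illustration only: one pair per completed study.
interim_values = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor measured mid-study
final_values = [2.0, 4.0, 5.0, 4.0, 5.0]     # outcome at end of study

slope, intercept, r = fit_line(interim_values, final_values)
red_at = action_threshold(slope, intercept, outcome_limit=5.0)
print(round(slope, 3), round(intercept, 3), round(r, 3), round(red_at, 3))
print(stoplight(4.0, red_at))
```

A weak correlation here would argue for reclassifying the metric as a contributor rather than a predictor; a real analysis would also need to account for sample size, confounding study characteristics, and nonlinear relationships.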

Implementation at Pfizer

Implementation of the strategy began in Q1 2015. The following work streams were established:

  • Taxonomy – Aligned all metrics and non-metric quality data to the quality taxonomy and proposed their classification as outcomes, predictors and contributors.

  • Review – Developed stakeholder profiles regarding what quality information had to be reviewed, and established the drill downs and the cadence (review process); guided the development of reports and reporting tools.

  • Analysis – Created the quality analysis plan, established thresholds for quality metrics, created standard analytic reports, developed predictive analytics and risk models, and performed statistical analyses on contributor/predictor/outcome relationships.

  • Communications – Identified stakeholders for communication and formulated key messages to disseminate; controlled change management.

  • Technology – Identified and implemented appropriate solutions to ensure consistent high quality data and reporting.

  • Governance – Chartered to manage the metric lifecycle for the quality metrics as well as to independently ensure that each metric as designed achieves its intended purpose.

Classification of existing metrics into the taxonomy allowed for identification of areas that were overserved or underserved with metrics, creating immediate opportunities to recommend adjustments to the catalogue. The taxonomy also clarified the intended relationships between contributor, predictor, and outcome metrics, and Pfizer’s collection of clinical trial data offered an adequate sample to statistically test widely held beliefs about these relationships. Through this process, a considerable number of existing metrics were found not to bear strong relationships to quality outcomes. As a result, Pfizer will be able to retire or redesign those metrics.

Initial analytic models based upon the taxonomy confirmed that many of the contributor metrics reflected good performance, which masked the more important performance information found in the predictor and outcome metrics. Since prior to the taxonomy all metrics were considered equivalent in importance, the more numerous contributor metrics influenced the overall enterprise performance picture positively, even though the positive performance on contributor-level metrics did not always lead to positive quality outcomes. This was a key finding of the exercise.
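
The masking effect described above is simple arithmetic, which the following Python sketch illustrates with invented numbers: when every metric counts equally, a large block of well-performing contributor metrics can dominate the overall picture even if the few predictor and outcome metrics are failing. The function names, the stoplight statuses, and the portfolio are hypothetical and are not Pfizer data.

```python
from collections import defaultdict


def share_green(statuses):
    """Fraction of metrics whose status is 'green'."""
    statuses = list(statuses)
    return sum(s == "green" for s in statuses) / len(statuses)


def share_green_by_level(metrics):
    """Same fraction, computed separately for each taxonomy level."""
    by_level = defaultdict(list)
    for level, status in metrics:
        by_level[level].append(status)
    return {level: share_green(s) for level, s in by_level.items()}


# Invented portfolio: eight contributor metrics on target,
# one predictor and one outcome metric off target.
portfolio = [("contributor", "green")] * 8 + [
    ("predictor", "red"),
    ("outcome", "red"),
]

overall = share_green(status for _, status in portfolio)
by_level = share_green_by_level(portfolio)
print(overall)
print(by_level)
```

In this example the equal-weight view shows most metrics on target, while the breakdown by level shows that none of the predictor or outcome metrics are.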

The expected timeline for implementation of the taxonomy is six months. This includes delivery of an integrated technology platform intended to give Pfizer a clearer view of its clinical trial quality.

Denise Calaprice-Whitty is Senior Consultant, The Avoca Group; Jonathan Rowe is Head of Clinical Quality Performance Management, Pfizer.