Today’s rich oncology pipeline, which accounts for nearly 25% of agents in clinical development, promises much-needed advances in cancer therapy.1 That promise dims in the face of other discouraging statistics: only 7% of oncology agents entering Phase I clinical trials gain marketing approval,2 and only 34% of Phase III oncology trials achieve statistical significance in primary endpoints.3
The cost, time, and numbers of patients required to conduct conventional oncology clinical trials continue to escalate. The complex demands of evaluating new targeted therapies add to this burden. Novel methodologies are available that make trials more efficient and informative so that precious resources of patients, time, and money are invested in studies with the greatest chances of success.
Adaptive trial design offers opportunities for improvement by shortening the time needed to answer key research questions, reducing the number of patients needed for evaluation, and improving the quality of decision-making to increase overall success rates. The use of adaptive designs also raised scientific and regulatory questions that slowed adoption by the biopharmaceutical industry. A growing body of experience culminated in the U.S. Food and Drug Administration’s (FDA) 2010 draft guidance, Adaptive Design Clinical Trials for Drugs and Biologics, which details adaptive approaches and encourages their use.4
FDA defines an adaptive study as one that “includes a prospectively planned opportunity for modification of one or more specified aspects of the study design and hypotheses based on analysis of data (usually interim data) from subjects in the study.” Five adaptive designs, including blinded sample size re-estimation and halting early for lack of utility, are cited as “well-understood.” FDA encourages drug developers to use these approaches for all studies. Seven “less well-understood” designs, including unblinded applications that use interim estimates of treatment effect for endpoint selection and sample size re-estimation, should be reserved for exploratory studies while more experience is gained.
This regulatory underpinning supports wide application of adaptive design in oncology drug development. Its positive impact can be seen in the groundbreaking I-SPY 2 breast cancer trial, which uses adaptive design to streamline identification of active drugs and predictive biomarkers.5 I-SPY 2 (“Investigation of Serial Studies to Predict Your Therapeutic Response with Imaging and Molecular Analysis”) suggests a model for new, adaptive design-based approaches to advance the oncology drug development process.
Traditional designs contribute to high failure rates and escalating costs because answers to pivotal research questions are obtained only at the end of the trial. Trials using fixed designs rely on assumptions that may be found to be incorrect at the end of the study. Faulty assumptions used in underpowered Phase I and Phase II trials yield poor information on which to base decisions about Phase III designs where the impact of failure is greatest due to the large number of patients and time involved. The cumulative effects of the traditional approach are low overall success rates and high costs (Table 1).
Advancing oncology drug evaluation depends on: 1) selecting the best drug candidates; 2) identifying and eliminating failures as early as possible; and 3) designing trials to identify the right dose, for the right disease, in the right patients as early as possible. With thousands of potential drugs awaiting development, and with relatively few of these likely to demonstrate efficacy, earlier information and better-focused evaluation are critical to improving success rates. Adaptive trial designs are especially well suited to this purpose.
Adaptive designs leverage accumulating data to modify trials as they progress, making better decisions at each sequential step. Adaptive approaches use early findings to improve the next phase in a flexible process that can accelerate timelines, reduce costs, and generate the most knowledge from the smallest number of patients.
Traditional designs use a probabilistic statistical approach. Decisions regarding dosage, randomization, and sample size are made in advance and usually do not change throughout the trial. Instead of making pivotal decisions with limited information before a trial, adaptive designs use accruing information to obtain relevant data that inform and improve critical decisions. Data are analyzed continuously or at designated interim points, and results are used to shape future design parameters such as doses, disease indications, or populations being studied. Using this flexible approach, the trial becomes a learning tool that applies evolving knowledge to drive subsequent decisions.
Adaptive designs can incorporate more than one adaptation in a trial and may address a number of research questions simultaneously. A single trial can be designed to evaluate multiple dose regimens, indications, drug combinations, and even multiple drugs.
For example, a seamless Phase II-III breast cancer trial might include adaptive approaches to stop early for futility, assess dose response, drop or add arms, change the proportion of patients randomized to each arm, and enrich the patient population with subjects most likely to respond. Table 2 lists eight adaptive settings commonly used in drug development and particularly relevant for oncology trials.6
Bayesian statistics in adaptive design. Adaptive designs often use Bayesian statistical methodology to model complex scenarios. In Bayesian approaches, statistical models require the formulation of a set of prior distributions for any unknown parameters, in addition to the parts of the model based on the traditional probability distribution of observations. Multiple sources of information are combined to make inferences, allowing researchers to test assumptions based on both direct observations and additional information on neighboring doses, different populations, similar compounds, preclinical modeling, genetic targeting, and historical data. Repeated analyses can be conducted within a study, and even across studies, using sequential analysis techniques. Results can be used to inform the design of the current trial.
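As a minimal sketch of how a prior and accruing data are combined, the Python fragment below assumes a binary response endpoint and a conjugate Beta prior. Neither assumption comes from the article, and the function names and all numbers are hypothetical, chosen only for illustration.

```python
import random

def beta_posterior(prior_a, prior_b, responders, n):
    """Conjugate update: Beta(prior_a, prior_b) prior plus binomial data gives a Beta posterior."""
    return prior_a + responders, prior_b + (n - responders)

def prob_rate_exceeds(a, b, threshold, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(response rate > threshold) under Beta(a, b)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(n_draws))
    return hits / n_draws

# Hypothetical prior worth about 10 historical patients with a 20% response rate,
# updated with hypothetical interim data: 12 responders among 30 patients.
a, b = beta_posterior(2, 8, responders=12, n=30)
posterior_mean = a / (a + b)               # (2 + 12) / (10 + 30) = 0.35
p_active = prob_rate_exceeds(a, b, 0.20)   # posterior probability that the rate beats 20%
```

The same update can be repeated at each interim look, with the posterior from one look serving as the prior for the next.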
Simulation informs optimal design. While fixed designs depend on theoretical justification of trial behavior, adaptive designs are more complex and depend heavily on simulations to understand trial behavior, efficiencies, and risks as inputs to inform and optimize trial design. Depending on the phase and design, regulators may require submission of simulation results to justify the scientific credibility of an adaptive trial,4 particularly if the data are intended to support a regulatory approval. Specialized simulation software, such as FACTS, is available to assess key performance characteristics including power, Type 1 error, bias, and average sample size.7
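To illustrate what such a simulation estimates, here is a small Python sketch (not FACTS, and not any design described in the article). It assumes a hypothetical two-arm, two-stage trial with a normally distributed endpoint of unit variance, equal stage sizes, and a single non-binding futility look; every parameter value is an assumption.

```python
import random
from statistics import NormalDist

def simulate_design(effect, n_per_stage=100, futility_z=0.0,
                    alpha=0.025, n_sims=2000, seed=1):
    """Estimate the rejection rate and average total sample size by simulation.

    effect      : true standardized treatment difference (0 means the null is true)
    n_per_stage : patients per arm enrolled in each of the two stages
    futility_z  : stop after stage 1 if the interim z-statistic falls below this
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    drift = effect * (n_per_stage / 2) ** 0.5   # mean of each stagewise z-statistic
    rejections = 0
    total_patients = 0
    for _ in range(n_sims):
        z1 = rng.gauss(drift, 1)
        total_patients += 2 * n_per_stage
        if z1 < futility_z:
            continue                            # trial halted for futility
        z2 = rng.gauss(drift, 1)
        total_patients += 2 * n_per_stage
        z_final = (z1 + z2) / 2 ** 0.5          # pooled statistic for equal stage sizes
        if z_final > z_crit:
            rejections += 1
    return rejections / n_sims, total_patients / n_sims

type1_error, avg_n_null = simulate_design(effect=0.0)   # behavior when the drug does nothing
power, avg_n_alt = simulate_design(effect=0.3)          # behavior under a hypothetical true effect
```

Running the same function over a grid of futility thresholds and stage sizes is one simple way to compare candidate designs before committing to one.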
Biomarkers provide early information. Biomarkers are important in adaptive designs to provide early measures of efficacy. Since early data may be used to modify a trial as it progresses, the traditional long-term oncology endpoints of survival and progression-free survival are of less benefit. To satisfy this purpose, biomarkers do not need to be validated surrogates. Berry notes that early findings based on “auxiliary markers (that) might be correlated with, and predictive for, the primary end point … may be incorporated into the trial design to help guide the adaptive aspect of the design.”7 Useful markers might include early clinical outcomes (such as imaging, response, and progression), serum markers, or molecular markers from tumors via biopsies. In a provocative article, Verweij suggests that functional target pharmacology studies followed by proof-of-concept studies could replace traditional Phase I, II, and III trials, given that early tumor shrinkage, as measured by Response Evaluation Criteria in Solid Tumors (RECIST), still appears to be the most reliable biomarker.8
The primary goal in Phase I is to determine maximum tolerated dose (MTD) for the experimental agent. Over- and under-estimation of the true MTD is a common problem in oncology trials, most of which identify MTD using the “3+3” method. An emerging adaptive approach, called the Continual Reassessment Method (CRM), yields more precise MTD determination and increases the likelihood that the true MTD is used in Phase II.
Traditional 3+3 method. In the 3+3 method, dose escalation steps are defined prior to the trial. A cohort of three subjects receives the drug at a starting dose based on preclinical data. If no dose-limiting toxicity is observed, another cohort of three subjects is added at the next dose level. If one of the three subjects experiences dose-limiting toxicity, another three-patient cohort is added at the same dose; if no further toxicity is observed in that cohort, dose escalation continues. If any additional toxicity is observed, the next lower dose is declared to be the MTD.
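The escalation rule can be sketched in a few lines of Python. This is one common textbook variant of the 3+3, written under stated assumptions: two or more dose-limiting toxicities (DLTs) at a level stop escalation, and the true DLT probabilities passed in are hypothetical. Protocols differ in details such as whether six patients must be treated at the declared MTD.

```python
import random

def three_plus_three(true_dlt_probs, rng):
    """Simulate one 3+3 trial and return the index of the declared MTD.

    Returns -1 if even the lowest dose level is judged too toxic.
    """
    level = 0
    while True:
        # First cohort of three at the current dose level
        dlts = sum(rng.random() < true_dlt_probs[level] for _ in range(3))
        if dlts == 1:
            # One DLT in three: expand to six patients at the same dose
            dlts += sum(rng.random() < true_dlt_probs[level] for _ in range(3))
        if dlts >= 2:
            return level - 1          # too toxic: the next lower dose is the MTD
        if level == len(true_dlt_probs) - 1:
            return level              # highest planned dose was tolerated
        level += 1                    # 0 of 3 or 1 of 6 DLTs: escalate

# Hypothetical true DLT rates for five predefined dose levels
rng = random.Random(42)
true_probs = [0.05, 0.10, 0.25, 0.45, 0.60]
picks = [three_plus_three(true_probs, rng) for _ in range(5000)]
share_level_2 = picks.count(2) / len(picks)   # how often the 25% level is declared the MTD
```

Tallying `picks` over many simulated trials shows how widely the declared MTD scatters around the intended level, which is the kind of evidence behind the accuracy figures cited for the 3+3.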
A 1999 analysis reported that when using the 3+3 method, “the probability of recommending the (correct) MTD at the end of the trial … never exceeds 44% and is most often closer to 30%.”9 Poor MTD identification is attributable to the tendency to select larger incremental “jumps” in order to observe toxicity more quickly in fewer steps. The true MTD often resides in a smaller incremental dose and is not observed.
Adaptive CRM design. The Continual Reassessment Method pinpoints the true MTD more precisely by efficiently evaluating more dose levels. CRM models the probability of dose-limiting toxicity as a function of dose and continuously refines that model. The 3+3 method bases the next dose allocation (and, therefore, the level that will eventually be declared the MTD) on the last cohort of subjects, while ignoring the data from the previous cohorts. CRM uses all the data to update the estimation of the MTD and to allocate the next patients, either in cohorts or continuously. The model is frequently updated and improves with accruing data.
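A minimal Python sketch of one CRM update follows. It assumes a one-parameter power model, a normal prior on that parameter, a target toxicity rate of 25%, and a grid approximation to the posterior; none of these choices is specified in the article, and the skeleton and patient counts are hypothetical.

```python
import math

def crm_next_dose(skeleton, n_treated, n_dlt, target=0.25, prior_sd=1.16):
    """One CRM update using the power model p_i(a) = skeleton_i ** exp(a).

    skeleton  : prior guesses of the DLT probability at each dose (strictly between 0 and 1)
    n_treated : patients treated so far at each dose
    n_dlt     : DLTs observed so far at each dose
    Returns (index of the recommended next dose, posterior mean DLT rates).
    """
    grid = [-4 + 8 * k / 400 for k in range(401)]   # grid over the model parameter a
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2          # log of the normal prior, up to a constant
        for p0, n, y in zip(skeleton, n_treated, n_dlt):
            p = p0 ** math.exp(a)
            log_w += y * math.log(p) + (n - y) * math.log(1 - p)   # binomial log-likelihood
        weights.append(math.exp(log_w))
    total = sum(weights)
    post_tox = [
        sum(w * p0 ** math.exp(a) for w, a in zip(weights, grid)) / total
        for p0 in skeleton
    ]
    # Recommend the dose whose posterior mean DLT rate is closest to the target
    best = min(range(len(skeleton)), key=lambda i: abs(post_tox[i] - target))
    return best, post_tox

# Hypothetical skeleton and data: three cohorts of three, one DLT at the third level
skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]
next_dose, post_tox = crm_next_dose(skeleton, [3, 3, 3, 0, 0], [0, 0, 1, 0, 0])
```

Unlike the 3+3 rule, every patient treated so far enters the likelihood, so the recommendation reflects all cohorts. Practical CRM designs usually add safety constraints, such as not skipping untried dose levels, which this sketch omits.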
In the majority of cases, CRM yields better estimation of the MTD and can allow for more rapid progression through early dosing levels depending on the operating characteristics and rules that are established in the design. Although the CRM approach is more complex and requires high levels of modeling and simulation, experience has proved its value in identifying true MTD with a higher level of confidence. As shown in Figure 1, adapted from Parke, CRM is better than 3+3 at identifying the correct dose level in nine of the 10 scenarios presented. In Scenarios 1, 3, 4, and 6, CRM was substantially better, providing a 10% higher probability of identifying the MTD than the 3+3 method. In Scenario 2, the CRM and 3+3 approaches yielded very similar results.10
Additional CRM benefits. Parke cites additional advantages of CRM: “Unlike the 3+3, its operating characteristics can be easily optimized in light of the current circumstances, different levels of toxicity can be targeted, different cohort sizes used and different levels of accuracy required before stopping, offering better determination of the MTD at the cost of greater sample size.”10 Seamless Phase I-II trials can be designed to allocate subjects based on continuing information on both tolerability and efficacy, an approach that shortens timelines. Another benefit is that patients involved in dose determination may continue to participate in activity evaluation, an important advantage from an ethical point of view.
Slow adoption of CRM. Despite current literature demonstrating the superiority of CRM in determining the MTD, most Phase I and Phase I-II oncology trials continue to use the 3+3 method, likely based on sponsor and investigator level of familiarity. Our search using the key words “adaptive,” “Bayesian,” “CRM,” “3+3” and “escalation” found a total of 12 Phase I and Phase I-II dose escalation trials published in The Oncologist (four trials) and the Journal of Clinical Oncology (eight trials) from August 2012 through August 2013. All 12 trials used the 3+3 design, confirming the 2013 review by Ji and coworkers, which reported “... more than 95% of Phase I studies have been based on the 3+3 design.”10
Improving dose-response evaluation. Adaptive designs can be used to efficiently evaluate several active doses in Phase II without necessarily increasing the sample size. Evaluation of more active doses provides a better understanding of the dose-response relationship, reducing the likelihood of failures due to suboptimal dose selection in Phase III. Ineffective or unsafe dose levels can be discontinued early, and the majority of patients can be allocated to the dose levels most likely to be active.
Improving identification of target populations. Increasing genomic knowledge of cancer subtypes is driving the need for efficient drug evaluation in targeted patient populations. The milestone genetics study of breast tumors published in 2012, for example, identified four distinct subtypes of breast cancer, suggesting targets for new drugs and better uses of existing drugs.11 As noted by Esserman and Woodcock, “The inability (or lack of explicit effort) to identify and incorporate specific disease subtypes into trial design inhibits the development of more cost-effective drugs that target specific populations,” a dilemma that demands new clinical trial designs that can address disease heterogeneity and complexity.5
Adaptive Phase II designs can be instrumental in identifying the appropriate patient population for Phase III evaluation. Identification of the right subpopulation can have a dramatic impact on the number of patients required in Phase III trials to demonstrate efficacy. For example, suppose one half of subjects with non-Hodgkin lymphoma respond well to a drug, as measured by a 60% hazard ratio; the other half benefit by only 10%. To show superiority in a Phase III trial with all patients enrolled at 90% power, 530 events would be required. But in a trial with the subpopulation of more positive responders, only 210 events would be needed.
Halting for futility. Preplanned futility analysis based on interim data can be used to stop a study that is unlikely to meet its primary endpoint. Interim futility analysis also can allow developers to continue a study with greater confidence of success in Phase III. For example, a simple preplanned futility analysis was conducted in a Phase III multicenter study comparing a new therapy to standard of care in patients with progressive and/or recurrent non-resectable glioblastoma multiforme. The target sample size was 323 randomized patients. Recruitment was difficult; after three years, only 137 patients were randomized. An unblinded interim futility analysis indicated that the therapy was unlikely to demonstrate efficacy. Based on the analysis, the independent data monitoring committee recommended halting the trial. Early termination avoided unnecessary exposure for approximately 180 subjects.
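One common way to formalize such a futility look is conditional power: the probability of reaching significance at the final analysis given the interim result. The Python sketch below uses the standard Brownian-motion approximation and assumes the current trend continues. The 10% stopping threshold and the interim values are hypothetical, and this is not the analysis used in the glioblastoma study described above.

```python
from statistics import NormalDist

def conditional_power(z_interim, info_fraction, alpha=0.025):
    """Conditional power under the current trend.

    z_interim     : z-statistic observed at the interim analysis
    info_fraction : fraction of total statistical information accrued (between 0 and 1)
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha)
    b_value = z_interim * info_fraction ** 0.5   # B-value at the interim
    drift = b_value / info_fraction              # drift estimated from the observed trend
    remaining = 1 - info_fraction
    shortfall = z_crit - b_value - drift * remaining
    return 1 - norm.cdf(shortfall / remaining ** 0.5)

def stop_for_futility(z_interim, info_fraction, threshold=0.10):
    """Recommend stopping when conditional power drops below a preplanned threshold."""
    return conditional_power(z_interim, info_fraction) < threshold

# Hypothetical interim: no observed benefit with half of the information in hand
cp_flat = conditional_power(z_interim=0.0, info_fraction=0.5)
halt = stop_for_futility(z_interim=0.0, info_fraction=0.5)
```

In a time-to-event trial the information fraction is driven by the number of events observed rather than the number of patients randomized, so the two should not be used interchangeably.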
Halting early avoids Phase III failures that contribute significantly to the low productivity and exorbitant cost of drug development, widely estimated at $1.8 billion per approved drug. A 2013 Forbes analysis suggests that for large biopharma companies (those that earn approval for eight to 10 new drugs over a decade), the greater number of failures experienced in managing a large pipeline results in an average cost of $5 billion per approval.12
Re-estimating sample size. Sample size is fixed in traditional designs, with size based on initial assumptions about primary efficacy measures and the rate and timing of patient withdrawal from the study. This approach often results in underpowering or overpowering. In the first case, the study fails to show definitive results. In the second, the trial requires more subjects and time than necessary. Adaptive designs use interim data to re-estimate sample size as the trial proceeds, so sample size can be increased to ensure adequate powering.
The 2010 FDA draft guidance makes a distinction between blinded and unblinded adaptations to maintain study power. Blinded approaches, which FDA characterizes as generally well-understood, compare interim findings to assumptions used in the planning of the study. For example, in studies that use an event outcome such as response rate for the endpoint, a blinded examination of the overall event rate can be compared to assumptions used in study planning. If the comparison shows that actual event rate is well below the assumption, sample size can be increased. Such blinded approaches also can be used in studies using time-to-event analysis and continuous outcome measures. Since blinded approaches do not introduce statistical bias or require statistical adjustments, they maintain Type 1 error control. FDA recommends that they “should generally be considered for most studies.”4
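The blinded comparison described above can be sketched as follows. The Python fragment assumes a two-arm trial with 1:1 randomization, a binary endpoint, a planning assumption expressed as a fixed ratio of response rates, and a rule that the sample size is never reduced below the original plan. All of these are illustrative assumptions, as are the rates used.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm for comparing two proportions."""
    norm = NormalDist()
    z_sum = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil(z_sum ** 2 * variance / (p_treatment - p_control) ** 2)

def blinded_reestimate(pooled_rate, rate_ratio, planned_n):
    """Re-estimate the per-arm sample size from the blinded pooled event rate.

    Only the pooled rate is used, so no treatment-arm information is unblinded.
    With 1:1 randomization, pooled = (p_c + p_t) / 2 and p_t = rate_ratio * p_c.
    """
    p_control = 2 * pooled_rate / (1 + rate_ratio)
    p_treatment = rate_ratio * p_control
    return max(planned_n, n_per_arm(p_control, p_treatment))

# Planning assumptions (hypothetical): 20% vs. 30% response, so a pooled rate of 25%
planned = n_per_arm(0.20, 0.30)
# Blinded interim look shows a pooled response rate of only 20%
revised = blinded_reestimate(pooled_rate=0.20, rate_ratio=1.5, planned_n=planned)
```

Because the assumed relative effect is held fixed, a lower-than-planned pooled rate implies a smaller absolute difference between arms, and the re-estimated sample size rises accordingly.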
Unblinded approaches use interim analyses to estimate treatment effects. Unblinded approaches allow initial sample size to be increased if the size of the treatment effect is seen to be smaller than anticipated, but is still clinically relevant. In some cases, adaptations that address other elements of study design-such as dose, population or study endpoint-could alter the study power and require re-estimation of sample size. Changes in sample size based on unblinded data analysis may cause an increase in the Type 1 error rate, and a statistical adjustment is necessary for the final study analysis.
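The article does not name a specific adjustment. One widely used option is a weighted combination of stagewise test statistics, in which the weights are fixed by the originally planned stage sizes rather than the re-estimated ones. The Python sketch below illustrates that idea with hypothetical stage sizes and z-values.

```python
from statistics import NormalDist

def weighted_final_z(z_stage1, z_stage2, planned_n1, planned_n2):
    """Combine stagewise z-statistics using weights fixed before the trial starts.

    z_stage2 must be computed only from patients enrolled after the interim look,
    however many that turns out to be after sample size re-estimation.
    """
    total = planned_n1 + planned_n2
    w1 = (planned_n1 / total) ** 0.5
    w2 = (planned_n2 / total) ** 0.5
    return w1 * z_stage1 + w2 * z_stage2

def reject_null(z_stage1, z_stage2, planned_n1, planned_n2, alpha=0.025):
    """One-sided test of the combined statistic at level alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return weighted_final_z(z_stage1, z_stage2, planned_n1, planned_n2) > z_crit

# Hypothetical trial planned with 100 patients per stage; stage 2 was later enlarged,
# but the weights still reflect the original plan.
z_combined = weighted_final_z(1.2, 2.1, planned_n1=100, planned_n2=100)
significant = reject_null(1.2, 2.1, planned_n1=100, planned_n2=100)
```

Under the null hypothesis the two stagewise statistics are independent standard normal variables, and the squared weights sum to one, so the combined statistic remains standard normal whatever stage 2 sample size the interim data suggested.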
FDA considers unblinded approaches to be less well-understood and cautions researchers to be conservative when making changes based on early estimates of treatment effect, which can be misleadingly large or small. Due to concerns about Type 1 error and operational bias, FDA suggests that unblinded approaches be used primarily for studies in which the primary objectives cannot be achieved using blinded designs. Drug developers exploring these designs must show adequate control of Type 1 error.
Seamless designs use adaptations and interim data to combine phases into a single study, reducing timelines and the number of patients required. These designs are especially useful in oncology studies because adaptations can address a wide variety of questions in the early (Phase II) stage to improve the later confirmatory stage. Seamless designs allow the long-term clinical endpoints from subjects enrolled in an early phase to be included in overall trial results.
Seamless Phase I-II designs. Seamless designs can answer Phase I toxicity questions and early Phase II efficacy questions in the same study. A simulated Phase I-II oncology study designed by Huang and coworkers demonstrates the efficiencies that can be gained using seamless approaches.13
The authors designed a parallel Phase I-II study that combined dose determination with efficacy assessment for two oncology agents when administered in combination, and when administered concurrently versus sequentially. The trial begins with an initial period of dose escalation. Then patients are randomly assigned to admissible dose levels that are compared with each other. Bayesian probabilities are used to adaptively assign more patients to doses with higher activity levels. Combination doses with intolerable toxicity are eliminated, while those with lower efficacy are temporarily closed. The trial would be halted if the posterior probability of safety, efficacy, or futility crosses a pre-specified boundary.
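The idea of assigning more patients to the more active doses can be sketched in Python as follows. This is not the allocation rule of the study described above. It assumes a binary response, independent flat Beta priors for each arm, and allocation probabilities equal to each arm's posterior probability of having the highest response rate; the response counts are hypothetical.

```python
import random

def allocation_probs(responders, treated, n_draws=5000, seed=0):
    """Randomization probabilities proportional to each arm's chance of being best.

    responders : responders observed so far in each arm
    treated    : patients treated so far in each arm
    """
    rng = random.Random(seed)
    wins = [0] * len(treated)
    for _ in range(n_draws):
        # One joint draw of response rates from the Beta(1 + r, 1 + n - r) posteriors
        draws = [rng.betavariate(1 + r, 1 + n - r)
                 for r, n in zip(responders, treated)]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]

# Hypothetical interim data for three admissible dose combinations
probs = allocation_probs(responders=[2, 9, 15], treated=[20, 20, 20])
next_arm = random.Random(7).choices(range(3), weights=probs)[0]   # arm for the next patient
```

Real designs typically temper these probabilities or enforce a minimum allocation per arm so that no arm is abandoned on the basis of a few early patients.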
Applying this design to a combination chemotherapy trial for leukemia, the authors used simulations to compare the seamless Phase I-II approach to a conventional design with separate Phase I and Phase II trials. Results showed that the Phase I-II design reduced sample size, was better powered, and was more efficient in assigning more patients to doses with higher efficacy levels.14
Seamless Phase II-III designs. Larger Phase II studies can increase the probability of success in Phase III but also increase research timelines and costs. In many cases, Phase III success rates can be improved and overall timelines reduced using a seamless Phase II-III design that combines the learning-and-confirming phases into a single study. The first stage generates information to guide the confirmatory stage regarding decisions such as: whether to stop for futility; what dose, regimen, endpoint, and responding subpopulation to study; and whether to evaluate the experimental drug alone or in combination with another therapy.
Figure 2 shows a seamless Phase II-III design for a trial to evaluate two experimental drugs, alone and in combination, as adapted by Berry from “A National Cancer Clinical Trials System for the 21st Century.”7 In this example, the single agent, Drug B, is selected in Phase II and continues into Phase III. The number of patients and randomization in Phase II are chosen adaptively. Phase II results determine sample size in Phase III. Phase III may use interim analyses to halt early for either futility or expected success. Berry notes that the Drug B-versus-control element during Phase II may be counted in the Phase III comparison (i.e., inferentially seamless), or it may not be counted (i.e., operationally seamless). The entire trial must be simulated to control the Type 1 error rate.
Like the use of CRM in dose determination, the adoption of seamless designs in oncology studies is slow. When we broadened our key word search of The Oncologist and the Journal of Clinical Oncology to include all trials at any phase of development, we found only three published studies (all in Journal of Clinical Oncology) that used adaptive designs between August 2012 and August 2013: two used adaptive randomization strategies, while one was a seamless Phase II-III trial.14,15,16
A 2012 survey conducted by the DIA Adaptive Design Scientific Working Group17 suggests a considerable increase in the use of adaptive design, particularly compared to a previous survey conducted in 2008 (i.e., before the publication of the draft FDA guidance). The survey of 16 biopharma companies and CROs showed more enthusiasm overall for adaptive design within industry and academia, and in particular an increase in the number of trials using designs described as less well understood in the draft FDA guidance (i.e., typically more complex adaptive designs). Based on a roundtable discussion held in 2013 with 40 senior executives,18 the Tufts Center for the Study of Drug Development also reported that simple adaptive designs (such as early stopping due to futility and sample size re-estimation) are used in approximately 20% of clinical trials across the industry and that the adoption of adaptive design in the exploratory drug development phase is expected to increase significantly over the next several years.
The potential of adaptive design to advance oncology drug development is evident in the groundbreaking I-SPY 2 screening trial, a collaborative Phase II research platform sponsored by the FDA and used by multiple industry and academic researchers. I-SPY 2 is designed to identify active experimental drugs for breast cancer, together with predictive biomarkers.5,19
I-SPY 2 uses an adaptive design to simultaneously screen Phase II anticancer agents in women with stage 2 or 3 breast cancer at risk for recurrence. Drugs are evaluated by class, using standard and emerging biomarkers to measure their impact on pathologic complete response (pCR), a predictor of disease-free survival. Drugs considered successful in the screening trial are predicted to have an 85 percent likelihood of success in a confirmatory, randomized trial of 300 patients with tumors that have the drug’s identified biomarker signature. The ultimate goal is to evolve a new model to streamline clinical evaluation and accelerate regulatory approval pathways.
The first two “graduates” from the I-SPY 2 trial are veliparib in combination with carboplatin and standard neoadjuvant chemotherapy in the triple-negative breast cancer subset, and neratinib in combination with standard neoadjuvant chemotherapy in HER2+/HR- breast cancer. Details of the clinical results and predictive probability of success are shown in Tables 3 and 4.
Table 3. Bayesian predictive probability of success for veliparib
The graduating arm is the triple-negative (HER2-/HR-) subset with a 93% Bayesian probability of success in a 300-patient Phase III trial.
Table 4. Bayesian predictive probability of success for neratinib
The graduating arm is the HER2+/HR- subset with a 78% Bayesian probability of success in a 300-patient Phase III trial. (Reprinted by permission from the American Association for Cancer Research.20)
Each drug’s Bayesian predictive probability of success is calculated for each unique patient subset until the threshold of 85 percent is met within any given subset. When 85 percent probability of success is reached, the accrual is stopped within this subpopulation and the drug graduates to a separate Phase III trial within the defined subpopulation. While the published probability of Phase III success is greater than 85 percent for veliparib in the triple-negative breast cancer subset, neratinib’s predictive probability of success was 78 percent at the time of publication.
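A heavily simplified Python sketch of a predictive probability calculation is shown below. It is not the I-SPY 2 model. It assumes a binary pCR endpoint, flat Beta priors, a 300-patient confirmatory trial randomized 1:1, and a one-sided test of proportions at the 2.5% level; the interim counts are hypothetical. Only the 85 percent graduation threshold and the 300-patient trial size come from the text.

```python
import random
from statistics import NormalDist

def predictive_prob_success(resp_t, n_t, resp_c, n_c, n_phase3=300,
                            alpha=0.025, n_sims=2000, seed=0):
    """Monte Carlo predictive probability that a future two-arm trial succeeds."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    n_arm = n_phase3 // 2
    successes = 0
    for _ in range(n_sims):
        # Draw plausible true pCR rates from the current posteriors
        p_t = rng.betavariate(1 + resp_t, 1 + n_t - resp_t)
        p_c = rng.betavariate(1 + resp_c, 1 + n_c - resp_c)
        # Simulate the future confirmatory trial at those rates
        x_t = sum(rng.random() < p_t for _ in range(n_arm))
        x_c = sum(rng.random() < p_c for _ in range(n_arm))
        pooled = (x_t + x_c) / (2 * n_arm)
        se = (2 * pooled * (1 - pooled) / n_arm) ** 0.5
        if se > 0 and (x_t - x_c) / n_arm / se > z_crit:
            successes += 1
    return successes / n_sims

# Hypothetical subset data: 30 of 50 pCR on the experimental arm vs. 10 of 50 on control
ppos = predictive_prob_success(30, 50, 10, 50)
graduates = ppos >= 0.85   # graduation threshold described in the text
```

Repeating the calculation for each biomarker-defined subset as data accrue gives the kind of running tally on which a graduation rule can be based.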
The benefits of the I-SPY 2 trial are illustrated with the graduation of both neratinib and veliparib. Development has been accelerated and focused on the patient population with the greatest probable benefit from treatment with the selected drugs, which leads to the greatest likelihood of success in a pivotal Phase III trial. Interestingly, without participating in this collaborative trial, these agents might have been in competition following traditional drug development pathways, with a lower probability of success for each compound in a broader population. Having graduated in unique patient subsets, the compounds are no longer competing for the same patient population. This property of the I-SPY 2 trial enhances the development of multiple novel agents in breast cancer, which is increasingly recognized as consisting of many distinct subtypes of disease.
Regulatory guidance recognizes the value of adaptive design, and emerging research models like I-SPY 2 demonstrate its great value in advancing oncology drug development. It remains for the biopharma industry to implement and advance adaptive design as a fundamental clinical research methodology.
Dirk Reitsma, MD, is Vice President, Therapeutic Area Head, Oncology, Global Product Development, PPD; Austin Combest, PharmD, BCOP, MBA, is Senior Clinical Scientist, Global Product Development, PPD; Jürgen Hummel is Statistical Science Director, Biostatistics, PPD; Ashley Simmons, PharmD, is Associate Director, Feasibility Strategy, PPD.