Is basing drug efficacy on the site read risky business?
Medical imaging in trials has been used as a measure of efficacy for more than 20 years. While the clinical practice of radiology provides sufficient information for managing individual patients, this practice varies geographically within the United States and globally. It was recognized by FDA that variability in evaluating trial images at the sites necessitated a more standardized and controlled review process, which was described in the agency's 1994 "Points to Consider for Developing Medical Imaging Drug and Biologic Products."
In this document, recommendations for the blinded independent central review of images (blinded read) from trials for medical imaging contrast agents were outlined. This independent review was meant to improve the quality of imaging data evaluations from clinical trials. The blinded read process used today across therapeutic areas is derived primarily from FDA and diagnostics industry documents and forums. It is this diagnostics-derived blinded read process that was included in the therapeutic guidance documents of FDA and other regulatory agencies.
Kohkan Shamsi, MD
In clinical trials of diagnostic imaging agents, a radiologist, not a clinician, is usually the principal/site investigator. In spite of this, it was recognized that standardizing image review, evaluating reader performance, and eliminating bias in the reviews was nearly impossible, even with an identified radiologist investigator performing the reads at each site. In today's therapeutic trials, it is uncommon to have a dedicated radiologist responsible for image interpretation at the site and for signing 1572 forms. At many sites, a nonradiologist, either a nonradiologist physician or even the site coordinator, reviews the images to select and measure target lesions and completes the imaging section of the case report form (CRF). While some nonradiologists may believe they can accurately interpret routine CT scans or radiographs, it takes a trained radiologist to evaluate the complexity of imaging effects seen in pathology (edema, hemorrhage, calcification, arteriovenous shunting) or to distinguish tumor from nontumor imaging effects in cancers with ill-defined margins.
Methodology issues have always plagued site reads for oncology and other therapeutic drug trials. A radiologist's clinical interpretation is designed for individual patient management; it rarely meets the requirements for selecting and measuring lesions, and it may use nonstandard terminology. Standardized imaging efficacy criteria used in trials (e.g., Response Evaluation Criteria in Solid Tumors [RECIST] and others) are neither commonly used in clinical radiology practice nor routinely included in radiology residency training programs. Site reads performed in oncology studies are also not auditable to the same extent as blinded reads. Training of blinded readers on trial-specific imaging efficacy criteria occurs routinely and is accepted to have a positive impact on the quality of results. However, training of site readers, radiologists or otherwise, is usually a less rigorous process, not infrequently performed by a CRA or other imaging nonexpert. Because of its relatively uncontrolled nature, the site read process, particularly in oncology clinical trials, is considered by some to be "the wild west."
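To make concrete what a standardized efficacy criterion asks of a reader, here is a minimal sketch of a target-lesion response rule. It assumes the commonly cited RECIST 1.1 thresholds (at least a 30% decrease from baseline for partial response; at least a 20% increase over the nadir plus at least 5 mm absolute growth for progression). The function name and inputs are hypothetical, and the sketch deliberately ignores non-target lesions, new lesions, and lymph node rules, all of which a real trial charter would have to address.

```python
def target_lesion_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Classify target-lesion response from sums of lesion diameters (mm).

    Simplified illustration only: non-target lesions, new lesions, and
    lymph node short-axis rules are not modeled.
    """
    if baseline_sum_mm <= 0:
        raise ValueError("baseline sum of diameters must be positive")

    # Complete response: all target lesions have disappeared.
    if current_sum_mm == 0:
        return "CR"

    # Progressive disease: >= 20% increase over the smallest sum on study
    # (nadir) and an absolute increase of at least 5 mm.
    increase = current_sum_mm - nadir_sum_mm
    if increase * 100 >= 20 * nadir_sum_mm and increase >= 5:
        return "PD"

    # Partial response: >= 30% decrease relative to the baseline sum.
    if current_sum_mm * 100 <= 70 * baseline_sum_mm:
        return "PR"

    # Otherwise neither sufficient shrinkage nor sufficient growth.
    return "SD"


calls = [
    target_lesion_response(100, 100, 65),  # shrinkage beyond 30%
    target_lesion_response(100, 60, 75),   # regrowth from nadir
    target_lesion_response(100, 100, 90),  # small change
]
```

Even this reduced rule depends on a documented baseline, a tracked nadir, and consistent lesion measurements, which is the kind of bookkeeping a routine clinical report is not designed to provide.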
There are many factors that influence reader performance, including experience, expertise, evaluation criteria used, viewing hardware/software, reader fatigue, disease process, drug mechanism of action, and design of blinded read. The evaluation of reader performance is viewed as a reflection of the quality of imaging data results from a trial. Intra-reader and inter-reader variability are easily evaluated in blinded reads where CRFs are linked to the image database, measurements made on images directly populate the CRF, and response is automatically derived. However, similar standard methods to evaluate site reader performance in multicenter trials for regulatory submission have yet to be established.
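As an illustration of how inter-reader variability might be quantified once response calls sit in a structured database, the sketch below computes Cohen's kappa for two readers. Kappa is an assumption here, chosen as one widely used chance-corrected agreement statistic; the article does not prescribe a specific metric, and the reader data are invented for the example.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers' categorical calls on the same subjects."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of subjects")

    n = len(reader_a)
    # Observed agreement: fraction of subjects given the same call.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n

    # Agreement expected by chance, from each reader's category frequencies.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical response calls for six subjects.
reader_1 = ["PR", "SD", "PD", "SD", "PR", "PD"]
reader_2 = ["PR", "SD", "SD", "SD", "PR", "PD"]
kappa = cohen_kappa(reader_1, reader_2)
```

The same function applied to one reader's first and repeat reads of the same cases would give an intra-reader figure.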
When two or more readers evaluate the same group of images, there exists a potential for disagreement. In the diagnostics blinded reads of the early 1990s, a group of three readers, each of whom evaluated all images, was commonly used. The same process continues to be used in blinded reads of diagnostic studies. In this read design, the performance of readers is compared to a standard of reference, and thus sensitivity and specificity can be calculated for each reader, with weight given to the two readers with the best performance.
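The per-reader calculation in a diagnostic read design can be sketched as follows. The reader labels, the calls, and the reference data are hypothetical, and the sketch assumes a simple positive/negative call per case; it does not attempt to model how weight is assigned to the best-performing readers.

```python
def sensitivity_specificity(reader_calls, reference):
    """Return (sensitivity, specificity) for one reader against the reference.

    Both arguments are equal-length lists of booleans (True = disease present).
    A component is None when the reference has no cases of that class.
    """
    if len(reader_calls) != len(reference):
        raise ValueError("reader calls and reference must cover the same cases")

    pairs = list(zip(reader_calls, reference))
    tp = sum(1 for call, truth in pairs if call and truth)
    fn = sum(1 for call, truth in pairs if not call and truth)
    tn = sum(1 for call, truth in pairs if not call and not truth)
    fp = sum(1 for call, truth in pairs if call and not truth)

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical standard of reference for eight cases.
reference = [True, True, True, True, False, False, False, False]

# Hypothetical calls from three blinded readers, each reading all cases.
reads = {
    "A": [True, True, True, False, False, False, False, True],
    "B": [True, True, True, True, False, False, True, True],
    "C": [True, True, False, False, False, False, False, False],
}
results = {name: sensitivity_specificity(calls, reference)
           for name, calls in reads.items()}
```

This calculation is only possible because a standard of reference exists for every case.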
However, in most therapeutic trials, oncology being an example, there is generally no standard of reference against which the imaging results are compared—that is, there is no established "truth" against which to compare the readers' results. A process for adjudicating differences between readers was established in oncology trials whereby two primary readers each read all images, and a third independent reviewer is asked to review subjects' images when the primary readers disagree. Generally the adjudication reader does not create new measurements, but is asked to agree with either Reader 1 or Reader 2. Thus two readers agree on response in all subjects.
In more than a decade of widespread use of the dual primary reader/single adjudicator blinded read paradigm, it has become understood that multiple factors affect adjudication rates in trials. These factors include imaging modality, disease process, lesion morphologic appearance and its impact on the ability to precisely measure lesions, blinded read study design, efficacy evaluation criteria, image quality, reader training, study population, and many others. Regulatory authorities have never specified acceptable or unacceptable adjudication rates, and rates averaging 25% to 40% are routinely observed in oncology registration trials.
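The dual primary reader/single adjudicator flow, and the adjudication rate it produces, can be sketched as below. The data structures and names are hypothetical; the sketch assumes one response call per subject and an adjudicator who only endorses Reader 1 or Reader 2, as described above.

```python
def resolve_responses(reader_1, reader_2, adjudicator_choice):
    """Combine two primary reads into final responses.

    reader_1, reader_2: dicts mapping subject ID to a response call.
    adjudicator_choice: dict mapping each discordant subject ID to 1 or 2,
        meaning the adjudicator agrees with Reader 1 or Reader 2.
    Returns (final_responses, adjudication_rate).
    """
    if reader_1.keys() != reader_2.keys():
        raise ValueError("both primary readers must read the same subjects")

    final = {}
    adjudicated = 0
    for subject, call_1 in reader_1.items():
        call_2 = reader_2[subject]
        if call_1 == call_2:
            final[subject] = call_1  # concordant: no adjudication needed
            continue

        adjudicated += 1
        choice = adjudicator_choice[subject]
        if choice not in (1, 2):
            raise ValueError("adjudicator must select Reader 1 or Reader 2")
        # The adjudicator makes no new measurements, only picks a reader.
        final[subject] = call_1 if choice == 1 else call_2

    rate = adjudicated / len(reader_1) if reader_1 else 0.0
    return final, rate


# Hypothetical calls for four subjects.
r1 = {"S01": "PR", "S02": "SD", "S03": "PD", "S04": "SD"}
r2 = {"S01": "PR", "S02": "PD", "S03": "PD", "S04": "PR"}
final, rate = resolve_responses(r1, r2, {"S02": 2, "S04": 1})
```

The rate here is simply the fraction of subjects on which the primary readers disagreed, so anything that makes lesions harder to measure consistently will raise it regardless of reader skill.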
Since so many factors can influence the adjudication rate within a single trial, the adjudication rate cannot be viewed as a direct measure of either overall study quality or reader performance.
Recently, adjudication rates have come under scrutiny by regulators and others, suggesting that a specific rate in a specific trial may be a negative performance indicator. It has also been suggested that if the adjudication rate is X and the difference between the site read and blinded read results is similar, the blinded reads could be unnecessary.
On the surface, this sounds like a reasonable assumption. However, due to methodological issues that exist with many site reads for oncology trials, not the least of which is that nonradiologists and even nonphysicians are interpreting images, this assumption cannot be considered correct. Trial results derived from a process in which the methodology is less than optimal cannot be compared to results from a fully auditable and transparent process.
The blinded read process, however, is not without its own issues. Blinded reads add expense to a clinical development program. Not all bias is removed with blinded central image evaluations. Informative censoring can result when a patient progresses according to the site read but not according to the blinded read. However, this can be mitigated using widely available current technology to perform independent confirmation of disease progression within 24 to 48 hours. Still, more than 15 years of global regulatory experience, combined with industry/regulatory/academic joint efforts to improve quality in independent image reviews, has built a broad foundation of understanding that results from controlled blinded reads are superior to those from less controlled site reads.
There are some easy-to-implement changes to site reads that would improve the quality of the methodology:
Identify a single radiologist (a study subinvestigator) at each site to perform evaluations, complete the imaging portion of the CRF, and sign 1572 forms.
Prospectively train and test the site readers on the study-specific efficacy criteria.
Monitor site reader performance in an ongoing fashion and implement retraining as needed.
Implement a regulatory compliant audit trail process.
More than 15 years ago, industry, FDA, and other regulatory bodies recognized that if image evaluations were to be used as a measure of efficacy, a controlled evaluation process was required. Site reads did not meet this criterion, and not much has changed in the quality of site reads since that time. This is not to say that the quality of site reads cannot improve. But a significant investment of both time and expense will be necessary to improve and control the quality of site reads before the efficacy of a drug can be based on the site read.