Overcoming method validation challenges for trial populations during biologic drug development.
Traditional drug development has focused on the development of small, synthesized molecules—chemical entities that don't typically induce an immune response in the body. More recently, protein therapeutics, antibodies, and other large molecule biologics have become a greater proportion of drug development pipelines. In 2007, global prescription sales of biotech drugs increased 12.5% to more than $75 billion, with the United States providing the largest market for biotech products, representing 56% of total 2007 sales.1 Along with the promise of biologics—monoclonal antibodies (mAbs), proteins, and peptides—come special challenges in the preclinical and clinical setting. These types of drugs, unlike small molecules, almost always induce an immune response upon administration.9 This immunogenicity can be desired, as in the case of a vaccine or cancer immunotherapy, or it can be health- and/or life-threatening.
A few years ago, small changes in the manufacturing or handling of the large molecule drug erythropoietin (EPO) were believed to have triggered immune reactions in several patients. The resulting antibodies targeted the patients' own EPO in addition to the exogenously dosed drug, preventing the differentiation of red blood cells, a condition known as pure red cell aplasia. Although the incidence was rare, the patients required frequent blood transfusions to overcome the problem.3,9 Less alarming but also of concern: the development of immune responses to biologics has been shown to reduce a therapeutic's efficacy in animals7 and in humans.2
These examples make a clear case for the importance of immunogenicity testing, beyond any regulatory requirements, in assessing the safety of protein therapeutics. Monitoring antibody responses during biopharmaceutical development is a necessary component of any prudent risk management strategy. Some companies are performing these assays even in preclinical stages in order to evaluate safety implications as early as possible in drug development.
The factors influencing immunogenicity are varied. The cause could be an underlying disease, and, for a single product, immunogenicity may need to be studied separately for each disease. Other factors include concomitant treatment, the duration or route of administration, the protein's structure, aggregation, and excipients.
The immune response can produce antibodies with a number of different characteristics. They can be neutralizing, in that they prevent the product from having the desired effect. Sustaining antibodies will help maintain the effect of the product. Clearing antibodies accelerate the product's removal from the system, and cross-reactive antibodies bind to endogenous proteins. Any of these responses can develop into chronic conditions and complications in patients.
Therefore, it is essential to adopt an appropriate strategy for assessing any unwanted immunogenicity of biological products. The challenge is to design validation methods—antibody assays—that can accurately determine if trial subjects have generated an immune response to the biologic under study. This requires the development of appropriate assays that are fit for purpose, meaning they provide the needed information for the particular circumstance.
Immunogenicity screening relies on setting accurate, statistically based cutoff thresholds to distinguish between positive and negative samples. If thresholds are set too high or too low, the results could trigger false readings and delay the clinical program. The European Medicines Agency has issued guidance on immunogenicity assay development in its Guideline on Immunogenicity Assessment of Biotechnology-Derived Therapeutic Proteins,10 and two white papers outline critical parameters for assay development and validation.6,8
Establishing thresholds for preclinical and normal human populations is generally straightforward. It becomes more complicated when taking into account such influencing factors as the limited availability of a representative patient matrix (serum or plasma) during method development and validation, the presence of different concomitant medications, gender differences, and the presence of pre-existing antibodies. In rare or unusual populations, using a surrogate chemically altered matrix during validation may be unavoidable. This may necessitate an in-study comparison of the threshold established during validation with responses observed in patient predose samples.
A tiered approach to assay design helps to ensure that the screening is sensitive, specific, and precise enough to deliver the appropriate data. Three elements are essential: a screening assay that identifies antibody-positive samples and patients; analytical immunochemical procedures, in a different format than the screening assay, for confirming the presence of antibodies and determining antibody specificity; and functional bioassays for assessing the neutralizing capacity of antibodies. Bioassays used for measuring the potency of biological products (e.g., for lot release purposes) can often be adapted to assess neutralizing antibodies, although optimization of assay parameters is frequently required to ensure the assay is fit for its purpose.4,6,8
Validations of screening assays are based on several factors,6, 8 including the following:
Negative control mean and cutoff determinations. The cutoff is the level of response at or above which a sample is defined as positive; below that level, it is negative. The cutoff is determined statistically from the response levels of negative control sera drawn from the population under investigation and is based on a 95% confidence limit. Establishing the proper threshold is critical and should be based on a matrix as similar as possible to the actual samples to be tested.
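As a minimal sketch, this cutoff calculation can be expressed in a few lines of Python. The OD values below are invented for illustration; the 1.645 multiplier corresponds to the one-sided 95% confidence limit described above.

```python
import statistics

def screening_cutoff(blank_responses, z=1.645):
    """Screening cutoff: mean of the drug-naive (negative control) lot
    responses plus z * SD. z = 1.645 gives a one-sided 95% confidence
    limit, i.e., about a 5% expected false-positive rate."""
    mean = statistics.mean(blank_responses)
    sd = statistics.stdev(blank_responses)  # sample standard deviation
    return mean + z * sd

# Hypothetical OD readings from individual negative-control matrix lots
naive_lots = [0.11, 0.13, 0.09, 0.12, 0.10, 0.14, 0.12, 0.11]
cutoff = screening_cutoff(naive_lots)

# A sample at or above the cutoff screens positive; below it, negative
is_positive = lambda od: od >= cutoff
```

In practice the calculation would be run on the full panel of individual lots tested during validation, not a pooled control.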
Limit of detection. It's established by preparing a "curve" of decreasing amounts of antibody (typically a surrogate with a known concentration) in a pooled negative-control matrix. The limit is set by observing where on the concentration curve the response rises above the calculated cutoff.
Lower limit of detection (LLOD). The aim is to establish the assay's sensitivity by evaluating the LLOD. This is accomplished by preparing a concentration of antibody at a target LLOD level in at least 20 different lots of representative matrices. The response of all matrix lots must be above the cutoff response. Mire-Sluis et al. recommend a minimum sensitivity of 200-500 ng/ml (back-calculated for 100% serum) for clinical studies; for preclinical studies, sensitivity should be at least 500-1000 ng/ml (back-calculated for 100% serum).
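The LLOD acceptance check described above is simple to automate. In this sketch, the spiked responses and the cutoff are invented; the rule itself (every one of at least 20 lots must read above the cutoff) follows the description in the text.

```python
def llod_verified(spiked_responses, cutoff):
    """Sensitivity check: a candidate LLOD is verified only if antibody
    spiked at that level into every individual matrix lot (at least 20
    lots) reads above the screening cutoff."""
    if len(spiked_responses) < 20:
        raise ValueError("at least 20 individual matrix lots are required")
    return all(response > cutoff for response in spiked_responses)

# Hypothetical responses from 20 lots spiked at a 250 ng/ml target LLOD,
# checked against an illustrative cutoff of 0.14 OD
spiked = [0.18, 0.21, 0.17, 0.19, 0.22, 0.18, 0.20, 0.19, 0.17, 0.21,
          0.18, 0.19, 0.20, 0.18, 0.22, 0.19, 0.17, 0.20, 0.18, 0.19]
print(llod_verified(spiked, cutoff=0.14))  # prints True
```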
Precision batches. These batches are run to ascertain and ensure consistency in the performance of the method. Controls are run (N=3) in six batches by two analysts over a minimum of two days. The process is similar to precision and accuracy batches for quantitative methods, but because typically there is no standard curve, only precision may be determined. Testing is performed for negative and positive controls (and low/high positive controls), and for inhibition controls (negative, low, and high) to show effectiveness of the immunodepletion step.
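Because only precision can be reported for these screening methods, the usual summary statistic is a coefficient of variation per control level across batches. A minimal sketch, with invented control readings:

```python
import statistics

def percent_cv(responses):
    """Inter-batch precision expressed as a percent coefficient of
    variation: 100 * SD / mean of the control responses."""
    return 100.0 * statistics.stdev(responses) / statistics.mean(responses)

# Hypothetical low-positive control readings (N=3 per batch, six batches)
low_positive = [0.42, 0.45, 0.44, 0.41, 0.46, 0.43,
                0.44, 0.42, 0.45, 0.43, 0.44, 0.46,
                0.41, 0.43, 0.45, 0.44, 0.42, 0.43]
cv = percent_cv(low_positive)  # compared against a preset acceptance limit
```

The same statistic would be computed for each control level (negative, low/high positive, and the inhibition controls).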
Stability. An assessment is made of the stability of the antibody (or a surrogate) in the desired matrix, under handling conditions meant to mimic the treatment of study samples. Stability is tested on the benchtop, under freeze/thaw conditions, and under long-term storage conditions, and may be conducted using a positive sample. General knowledge of antibody stability can also be drawn from literature references, although this may not be optimal.
Drug tolerance testing. This information is useful for interpreting data from pharmacokinetics (PK) and immunogenicity studies. For example, antibodies that sequester the drug product can affect the PK profile, although samples may appear negative for antibodies when tested. Drug tolerance testing is performed by adding increasing amounts of the drug product to samples that contain increasing amounts of the antibody (a positive control).
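One way to organize such an experiment is as a grid of antibody and drug levels, recording which combinations still screen positive. In the sketch below, `measure` is a placeholder for the actual assay readout, and the toy response model is purely illustrative, not a real assay function.

```python
def tolerance_grid(antibody_levels, drug_levels, measure, cutoff):
    """For each positive-control antibody level, list the drug
    concentrations at which the sample still screens positive.
    measure(ab, drug) stands in for the assay readout."""
    return {ab: [d for d in drug_levels if measure(ab, d) >= cutoff]
            for ab in antibody_levels}

# Toy readout model for illustration only: signal falls as added drug
# sequesters the antibody
toy_measure = lambda ab, drug: ab / (1.0 + drug)

grid = tolerance_grid([1.0, 2.0], [0.0, 1.0, 3.0], toy_measure, cutoff=0.5)
# With the toy model, the lower antibody level tolerates less drug
# before the sample falls below the cutoff and appears negative.
```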
Early clinical trials are pursued in normal human subjects. Before study initiation, during the validation of an immunogenicity method, it is recommended that matrices from a representative population of trial subjects be obtained in order to appropriately establish the cutoff threshold for determining positive samples. This can include matching the population on several variables, including age, weight, gender, and smoking status. In the majority of cases, there will be no noticeable effect of any of these criteria on the performance of the method. However, for some immunogenicity assays, differences have been observed.
Table 1. Mean optical density (OD) signal, standard deviation (SD), and cutoff values (mean + 1.645 × SD) for different populations in an ELISA-based immunogenicity assay.
Table 1 shows the results from a population of human matrices from male and female African American and Caucasian subjects. As can be seen from the table, there are differences in the overall performance of male vs. female matrix lots; the male lots have a higher mean signal (mean of male lots = 0.2561) and standard deviation than the female lots (female mean = 0.128). This is also depicted in Figure 1, which presents the distribution of assay values obtained for the male and female populations.
Figure 1. A histogram displaying the distribution of the blank optical density (OD) values obtained in the immunogenicity assay.
Had the male and female lots been considered together, the threshold would have been set from the overall mean and standard deviation and would not have been representative of either gender. This would pose a particular risk for the female samples: the threshold would be set too high, potentially leading to samples containing antibodies being incorrectly identified as negative. The risk for the male samples would have been minimal, as more false positives would have been identified and further testing of these samples would have been required to confirm the initial results.
While the presence of false positives is acceptable from a safety perspective, a high number complicates the interpretation of the data, and the required confirmation step adds to the cost of the study. In this case, an alternate option using two gender-specific cutoffs was chosen. This, however, complicated the interpretation of the sample data, as two separate cutoffs had to be applied in the report.
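The effect of pooling versus splitting can be illustrated numerically. The OD values and groups in this sketch are invented and are not the study data; the point is only how a pooled cutoff can sit too high for the lower-signal group.

```python
import statistics

def cutoff(values, z=1.645):
    """Screening cutoff as mean + z * SD (one-sided 95% limit)."""
    return statistics.mean(values) + z * statistics.stdev(values)

# Hypothetical OD readings from male and female matrix lots
male = [0.30, 0.34, 0.28, 0.32]
female = [0.10, 0.14, 0.12, 0.12]

pooled = cutoff(male + female)
by_gender = {"male": cutoff(male), "female": cutoff(female)}
# The pooled cutoff lands far above every female reading, so a genuine
# low-level antibody response in a female sample could be missed;
# gender-specific cutoffs avoid this at the cost of dual reporting.
```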
A similar method development challenge arises when the distribution of the naive population samples is skewed. This can be seen in Figure 2, which shows the distribution of responses from a set of 50 individual matrices in a Biacore (surface plasmon resonance) assay. The majority of responses cluster at the lower end of the response scale; however, several of the blank lots have responses at the upper end. Even if a statistical test is employed to exclude one or more of the outlier lots, the distribution remains skewed.
Figure 2. Distribution of the blank resonance unit (RU) values obtained in the immunogenicity assay. Arrows indicate the cutoff threshold calculated without (A) and with (B) outlier values included.
If one employs standard procedures for setting the assay cutoff threshold for this population of samples, the cutoff would be 137.3. Setting the threshold at this level, while statistically defensible, would place it too high for the majority of lots clustered at the lower end of the response scale. An immunogenic response generated in one of those lots would have to induce a large change in signal to be judged positive. This carries risk from a safety perspective, as it calls into question whether the assay is capable of detecting a positive sample.
In this case it was deemed preferable to set a lower cutoff based on results from the majority of lots. This would likely result in an increased number of false positive results in samples distributed in the upper end of the response profile. While this may appear counterintuitive, as it is preferable that a drug product generate no positive samples, it is important to note that positive samples would need to be assessed subsequently in a confirmation assay.
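One hedged way to implement a cutoff based on the majority of lots is to exclude lots above a conventional outlier fence before computing the threshold. The sketch below uses Tukey's upper fence (Q3 + 1.5 × IQR) as the exclusion rule, which is one common choice rather than the method used in the study; the response values are invented.

```python
import statistics

def outlier_trimmed_cutoff(responses, z=1.645):
    """Drop lots above Tukey's upper fence (Q3 + 1.5 * IQR), then
    compute the screening cutoff (mean + z * SD) from the rest."""
    q1, _, q3 = statistics.quantiles(responses, n=4)
    fence = q3 + 1.5 * (q3 - q1)
    kept = [r for r in responses if r <= fence]
    return statistics.mean(kept) + z * statistics.stdev(kept)

# Hypothetical RU responses: most lots cluster low, one sits far higher
responses = [9, 10, 10, 10, 11, 11, 12, 100]
naive = statistics.mean(responses) + 1.645 * statistics.stdev(responses)
trimmed = outlier_trimmed_cutoff(responses)
# The trimmed cutoff tracks the low-response cluster; the naive cutoff
# is dragged far upward by the single high lot, and samples from the
# upper tail then screen as (confirmable) false positives.
```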
A final challenging aspect of immunogenicity testing is moving the assay from the healthy normal human population used in Phase 1 studies to patient populations. In some populations, the representative matrix populations are readily available and can be obtained commercially for method development. With some disease populations, however, patient matrices may be difficult to obtain, the patient population can be small and, in some cases, requesting matrices from patients simply for method development and validation becomes an additional, unwelcome invasive procedure and can raise ethical questions regarding patient care.
Even when every effort is made to obtain equivalent matrices, commercially available disease population samples frequently differ because they come from different geographic locations, involve different concomitant medications, or reflect a different ethnic distribution. In those cases, one needs to consider alternatives to obtaining representative matrices. These include continuing to use normal matrices to validate the assay, using a smaller number of matrices already available, or using chemically altered matrices intended to mimic patient matrices.
Each alternative comes with advantages and pitfalls that must be considered. If validating with a normal matrix, one must be vigilant in examining how the specific study population performs during sample analysis. If predose samples are collected, these can be tested and compared with the responses of the population used during validation. If the predose samples perform within the limits established during validation, no corrective action is required. If, however, they perform in a significantly different manner, it must be decided whether to reset thresholds based on the performance of the patient population samples, apply a correction factor, or continue to use the thresholds established in validation.
Another possibility that has been explored is to classify a sample as positive based on its response relative to the subject's own predose sample. This case is illustrated in Figure 3, which shows the performance of 17 sera lots negative (blank) for antidrug antibodies (in this case, the surrogate positive control antibody).
Figure 3. Values obtained for a set of 17 sera lots in a radioimmunoassay. The lots were run blank and spiked at the lower limit of detection (LLOD) concentration. The fixed assay cutoff threshold is shown in the solid dark line. The floating cutpoint is represented by the dashed line with the triangles.
The assay threshold established in validation is shown by the solid black line. For certain subjects, a very minimal change in assay signal would be required for a sample to be considered positive relative to the validated assay threshold (e.g., lot #11), whereas for others (lot #5) a much larger change would be required. By employing a sample-specific cutoff (the blank response plus the SD established in validation), the signal change required for identification is the same for every subject, measured relative to its own predose sample.
For illustration, the same lots fortified with the antibody at the lower limit of detection are shown by the open symbols. All spiked samples would be identified as positive using this calculation. Employing the validation correction factor to interpret data from individual samples may be risky, however, as the variability of a single lot is unlikely to match that observed for the population. Nevertheless, this illustrates an alternate approach for setting cutoffs for immunogenicity samples in patient clinical trials.
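A sample-specific (floating) cutpoint scheme of this kind can be sketched as follows. The subject identifiers, responses, and validation SD are all illustrative assumptions, not values from the figure.

```python
def floating_cutpoints(predose, sd_validation, z=1.645):
    """Each subject's cutpoint is its own predose response plus
    z * the SD established during validation."""
    return {subj: resp + z * sd_validation for subj, resp in predose.items()}

def classify(postdose, cutpoints):
    """Flag a postdose sample as screen-positive if it meets or exceeds
    that subject's floating cutpoint."""
    return {subj: postdose[subj] >= cutpoints[subj] for subj in postdose}

# Hypothetical predose/postdose responses for two subjects with very
# different baselines, and an assumed validation SD of 0.02
pre = {"subj-05": 0.40, "subj-11": 0.10}
post = {"subj-05": 0.41, "subj-11": 0.20}
cutpoints = floating_cutpoints(pre, sd_validation=0.02)
flags = classify(post, cutpoints)
# subj-11's modest rise screens positive; subj-05's does not, even
# though subj-05's absolute signal is higher.
```

As the text notes, applying a population-level SD to single lots carries risk, since one lot's variability may not match the population's.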
There are many challenges to developing effective immunogenicity assays for the different stages of clinical development. They require vigilance over assay performance, as well as over the results from clinical samples, during sample analysis. Setting correct thresholds for determining true positive samples is critical to producing high-quality, relevant results from immunogenicity assays.
With any new biologics in development, it is important to understand potential immunogenicity issues early on so contingencies and alternatives can be developed. Without the right assays to assess product safety, the consequences can be dire—for trial subjects, patient populations, and the fiscal health of the drug development company. At the very least, delays in validation and sample testing can add unexpected costs and reduce returns on investment.
If immunogenicity problems do arise, having a flexible approach to immunogenicity assay development can help mitigate the impact.5 Thinking ahead—working with a CRO with immunogenicity testing experience, using and evaluating appropriate assays thoroughly, and taking the right risk management approach—can lead to more successful efforts in bringing the promise of biologics to therapeutic products and to patients.
Chris Beaver, PhD, is Scientific Director, Ligand Binding/Cell Based Assays, at MDS Pharma Services, 2350 Cohen Street, Montreal, QC Canada, H4R 2N6, email firstname.lastname@example.org.
1. "IMS Health Reports Global Biotech Sales," IMS Health, June 2008, http://www.imshealth.com/portal/site/imshealth/menuitem.a46c6d4df3db4b3d88f611019418c22a/?vgnextoid=bba69e392879a110VgnVCM100000ed152ca2RCRD&vgnextfmt=default.
2. P.A. Calabresi, G. Giovannoni, C. Confavreux et al., "AFFIRM and SENTINEL Investigators, The Incidence and Significance of Anti-natalizumab Antibodies: Results from AFFIRM and SENTINEL," Neurology, 69, 1391-403 (2007).
3. N. Casadevall, J. Nataf, B. Viron et al., "Pure Red-cell Aplasia and Antierythropoietin Antibodies in Patients Treated with Recombinant Erythropoietin," New England Journal of Medicine, 346, 469 (2002).
4. S. Gupta, S.R. Indelicato, V. Jethwa et al., "Recommendations for the Design, Optimization, and Qualification of Cell-based Assays Used for the Detection of Neutralizing Antibody Responses Elicited to Biological Therapeutics," Journal of Immunological Methods, 321, 1-18 (2007).
5. E. Koren, H.W. Smith, E. Shores et al., "Recommendations on Risk-Based Strategies for Detection and Characterization of Antibodies Against Biotechnology Products," Journal of Immunological Methods, 333 (1-2), 1-9 (2008).
6. A.R. Mire-Sluis, Y.C. Barrett, V. Devanarayan et al., "Recommendations for the Design and Optimization of Immunoassays Used in the Detection of Host Antibodies Against Biotechnology Products," Journal of Immunological Methods, 289, 1-16 (2004).
7. W.F. Richter, H. Gallati, C.D. Schiller, "Animal Pharmacokinetics of the Tumor Necrosis Factor Receptor-immunoglobulin Fusion Protein Lenercept and Their Extrapolation to Humans," Drug Metabolism and Disposition, 27, 21-25 (1999).
8. G. Shankar, V. Devanarayan, L. Amaravadi et al., "Recommendations for the Validation of Immunoassays used for Detection of Host Antibodies Against Biotechnology Products," Journal of Pharmaceutical & Biomedical Analysis, 48, 1267-1281 (2008).
9. H. Schellekens, "Lesson Learned From Eprex-Associated Pure Red Cell Aplasia," Kidney & Blood Pressure Research, 30 (Suppl 1) 9-12 (2007).
10. European Medicines Agency, "Guideline on Immunogenicity Assessment of Biotechnology-Derived Therapeutic Proteins," http://www.ema.europa.eu/pdfs/human/biosimilar/1432706en.pdf, 24 January 2007.