This guidance has both raised the stakes and improved the odds of securing label claims based on PROs.
The recent release of the US Food and Drug Administration (FDA) guidance on the use of patient-reported outcome (PRO) measures to support labeling claims (December 9, 2009)1 had been eagerly anticipated since the publication of the draft guidance.2 In the past four years, there was much speculation about the meaning and implications of the FDA's "current thinking" about the development and use of PRO measures.3 The draft guidance was designed to offer a set of guiding principles to those conducting and supporting industry-sponsored clinical trials, but it has undoubtedly generated wider interest and been more contentious than originally intended. At a time when PROs were more commonly referred to as psychological outcomes (largely considered the remit of health psychologists rather than of interest to clinicians or industry) and were viewed with more skepticism than today (in terms of their importance, relevance, and scientific merit), the draft guidance was designed to enable industry to engage with the regulatory authority in a dialogue about the appropriate use of PROs to evaluate medicinal products. However, while the draft guidance was largely welcomed as a means of encouraging the adoption of scientific standards, it generated more questions than answers about the nature of PRO research and the level of evidence required to demonstrate good scientific practice.4-6
In the two years following the release of the draft guidance, there was a reduction in the number of successful label claims (compared with the preceding five years),7 but the percentage of successful PRO claims remained steady. This seems to suggest that fewer applications were made (perhaps due to the perceived challenges inherent in meeting the recommendations of the draft). Furthermore, successful claims were largely symptom based, providing compelling evidence of the difficulty (or perceived difficulty) of securing a claim based on more contentious (though, arguably, more patient-centered) concepts such as psychological well-being, treatment satisfaction, or health-related quality of life. Thus, while the draft guidance raised the profile and value of PROs in the industry, it also introduced many hurdles that smaller pharmaceutical companies might struggle to overcome. Preparation of a PRO evidence dossier has cost implications at every stage: planning; conduct of background studies such as systematic reviews or qualitative studies; questionnaire design, validation, and translation; and compilation of the evidence dossier itself. Moreover, access to outcomes research expertise (either in-house or external) is essential to ensure that the evidence dossier will stand up to the rigor of FDA scrutiny.
So, does the release of the final guidance mean that the FDA is now placing its cards on the table, or is it just raising the stakes?
PROs are outcomes reported directly by patients by means of a self-report questionnaire about aspects of their condition or treatment. The FDA guidance, and the related guidance from the European Medicines Agency (EMA),8 is supportive of the use of PROs because:
Furthermore, patient preferences and priorities influence adherence to treatment. In the long term, the reported clinical and cost-effectiveness of a drug may become questionable if patients do not take it as prescribed.
In the draft guidance, the FDA provided examples of the concepts included under the umbrella terminology of PROs, including symptoms, activities of daily living, health status, and quality of life (QoL). The updated guidance no longer defines PROs in this manner, an omission that can be interpreted in two ways. Either the FDA now assumes that the term PRO is well understood, requiring no further explanation, or the FDA does not wish to become embroiled in the debate regarding individual concepts (which remain ill-defined and somewhat contentious). In the glossary, the FDA continues to assert that claims cannot be made on QoL per se ("a general concept that implies an evaluation of the effect of all aspects of life")1 but can be made on health-related or disease-specific QoL (i.e., "the patient's general perception of the effect of illness and treatment on physical, psychological, and social aspects of life").1 The guidance states that "generally, findings measured by a well-defined and reliable PRO instrument in appropriately designed investigations can be used to support a claim in medicinal product labeling if the claim is consistent with the instrument's documented measurement capability."1
The delay in the release of the guidance has been due, in no small part, to the number of concerns raised by the draft guidance among key stakeholders. More than 50 individual responses were received from stakeholders (academics, health professionals, policy makers, professional organizations, and industry representatives). Thus, it should be no surprise that the focus and emphasis of the guidance have shifted somewhat from the FDA's original stance. As it is beyond the remit of this article to provide a line-by-line account of the changes between the draft and final guidance, the following serves to highlight some key issues.
The guidance describes how the FDA "reviews and evaluates existing, modified, or newly created PRO instruments used to support claims in approved medical product labelling"1 and goes on to state that a PRO instrument is a "means to capture PRO data used to measure treatment benefit or risk in medical product clinical trials."1 In these short introductory statements, the FDA has made two important changes. First, the focus has moved from describing how the FDA evaluates "PRO instruments" to acknowledging the different approaches that may be needed with regard to reviewing the suitability of "existing, modified, or newly created" instruments. Second, the guidance now takes a more balanced perspective, acknowledging that PRO data can be used to measure treatment risk (e.g., side effects, inconvenience) as well as benefit; risk was not previously mentioned.
Five years ago, the draft guidance introduced the language of conceptual frameworks, which was met largely with confusion in the industry and much speculation about its meaning. Professional meetings have since been dominated by presentations explaining this and related terminology ("conceptual models" and "endpoint models"). The new guidance begins with a discussion of the use of endpoint models, which serves to emphasize the importance of matching chosen outcome measures to specific treatment objectives (i.e., defining "the role a PRO endpoint is intended to play in the clinical trial").1 The conceptual framework of a PRO instrument "explicitly defines the concepts measured by the instrument in a diagram that presents a description of the relationships between items, domain (subconcepts), and concepts measured."1 Diagrams and further explanations are provided to improve clarity. Endpoint models and conceptual frameworks may be relatively new terminology, but they are important tools and represent good practice in ensuring the appropriate selection of PRO instruments.
Content validity is "the extent to which the instrument measures the concept of interest."1 This aspect of an instrument's properties is highlighted as fundamentally important (it now has its own dedicated section) with the following significant statement: "evidence of other types of validity...or reliability...will not overcome problems with content validity because we evaluate instrument adequacy to measure the concept represented by the labeling claim."1 Sponsors are encouraged to support the adequacy of content validity by documenting how items were generated. The guidance states that the FDA "cannot provide recommendations for the number or size of the individual patient interviews or focus groups for establishing content validity...generally, the number of patients is not as critical as interview quality and patient diversity,"1 confirming the importance of sound qualitative research to inform instrument design.
Importantly, the FDA acknowledges from the outset that "PRO instrument development is an iterative process and [they] recognize there is no single correct way to develop a PRO instrument. Different strategies and methods can be used to address FDA review issues."1 Previously, the standards for instrument development suggested by the draft guidance may have been very difficult to achieve, particularly for instruments developed and validated before the guidance. Now, the FDA makes explicit that existing instruments with a less than rigorous development history can be considered if "new qualitative work similar to that conducted when developing a new instrument can provide documentation of content validity."1
The guidance has expanded significantly in this area, detailing how content validity can be demonstrated and documented. The assessment of content validity includes detailed examination not only of item generation and patient understanding of items and response options but also of the appropriateness (to the patient group) of: the data collection method and administration mode; recall period; and instrument format, instructions, and training. The guidance also presents criteria against which content validity and respondent and administrator burden will be assessed. In relation to the recall period, the FDA continues to encourage the use of items with appropriately short recall periods,1 noting that instruments that require recall over a longer period (i.e., relying on memory) may threaten content validity.
In the draft guidance, it was not clear whether all listed measurement properties were essential for each instrument. For example, it may be appropriate in some cases to assess predictive validity, while in other situations it may not be feasible or relevant. The new guidance removes predictive validity, focusing specifically on content validity (discussed above) and construct validity, which the FDA will review in terms of convergent, discriminant, and known-groups validity (all of which are appropriate to the validation of any PRO instrument). In terms of reliability, the FDA continues to value all three types (i.e., test-retest, internal consistency, and inter-interviewer/inter-rater) but provides greater clarity about the specific situations in which their assessment may or may not be feasible (e.g., test-retest may not be possible for "remitting and relapsing or episodic diseases").1 In particular, the guidance emphasizes the need to demonstrate the instrument's ability to detect change and states that the FDA will want to examine "evidence that a PRO instrument can identify differences in scores over time in individuals or groups (similar to those in the clinical trial) who have changed with respect to the measurement concept."1
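To make the notion of internal consistency concrete, the sketch below computes Cronbach's alpha, one commonly used statistic for this property. The choice of statistic is an assumption made for illustration: this article does not say which statistic a sponsor should report, and the item scores shown are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: list of respondents, each a list of k item scores
    (hypothetical layout chosen for this sketch).
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([resp[i] for resp in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five respondents answering a three-item scale
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together. Any threshold for "acceptable" internal consistency would need to be justified for the specific instrument and population.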
Increasingly, instruments are modified before use according to the needs of the new study (e.g., transfer from paper-and-pencil to an electronic format such as PDA, tablet, or web), for a new population (e.g., adolescents or a new condition), or for a new language/culture. However, an instrument's development and measurement properties are specific to its original application. Previously, the draft guidance stated that "the FDA intends to consider a modified instrument as a different instrument from the original and will consider measurement properties to be version-specific."2 This strict categorization suggested that "additional validation" would almost certainly be required, raising many concerns as to what would be considered adequate and/or that very expensive and time-consuming new studies would be needed. Responses to the draft guidance indicated that minor modifications should not require complete re-validation.
The new guidance provides a list of changes that may require additional validation (e.g., changing an instrument from paper to electronic format; changing the application to a different setting, population, or condition) and provides brief guidance to suggest that a small feasibility study or a qualitative study may be needed to confirm the instrument's properties in the new population. In recent years, task forces set up by the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) have produced "good research practice" publications that provide guidance on what constitutes sound scientific evidence (and may also be considered reasonable) in relation to specific examples of modification.5,6,9
Importantly, proxy-reported outcome measures (e.g., for pediatric or cognitively/communication-impaired populations) are now actively discouraged, when previously they were considered a less than ideal option. This raises new challenges about the collection of "observer reports that include only those events or behaviors that can be observed."1
Previously, the draft guidance suggested that determination of the minimum important difference was key to the interpretation of results. Yet practical and consistent methodologies for evaluating the minimum important difference are not available, and the various options often produce conflicting ranges of what can be considered "minimum." The anchor-based approach often means, in practice, that PRO scores are mapped onto clinically important changes in clinical outcomes to define a responder on the PRO instrument. This may be relevant for some symptom-based or health status measures but is inappropriate for other PROs, such as emotional well-being (e.g., anxiety or depression) or health-related QoL, which are less closely related to clinical endpoints. Consequently, the previously recommended minimum important difference has been replaced by the cumulative distribution function of responses to demonstrate treatment effect, or by a responder definition (which must be defined prior to the start of Phase III studies).
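The two approaches named above can be illustrated with a minimal sketch. It assumes that each patient has a single change-from-baseline score on which higher values mean greater improvement. The arm names, scores, and responder cutoff are all hypothetical and are not taken from the guidance.

```python
def cumulative_distribution(changes, thresholds):
    """Proportion of patients whose change score is <= each threshold."""
    n = len(changes)
    return [(t, sum(c <= t for c in changes) / n) for t in thresholds]


def responder_rate(changes, cutoff):
    """Proportion of patients improving by at least `cutoff`.

    Assumes higher change scores mean greater improvement.
    """
    return sum(c >= cutoff for c in changes) / len(changes)


# Hypothetical change-from-baseline scores for two trial arms
active = [5, 8, 2, 10, 7, 0, 6, 9]
placebo = [1, 3, -2, 4, 0, 2, 6, -1]

# Empirical cumulative distributions over a shared grid of change values
grid = range(-2, 11, 2)
for (t, p_active), (_, p_placebo) in zip(
    cumulative_distribution(active, grid),
    cumulative_distribution(placebo, grid),
):
    print(f"change <= {t:>2}: active {p_active:.3f}, placebo {p_placebo:.3f}")

# Responder analysis with a prespecified (here hypothetical) cutoff of 5 points
CUTOFF = 5
print("responders, active: ", responder_rate(active, CUTOFF))   # 0.75
print("responders, placebo:", responder_rate(placebo, CUTOFF))  # 0.125
```

Plotting the two cumulative curves against each other shows the treatment effect across every possible cutoff, so a reader is not limited to the single threshold used in the responder analysis.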
In response to the draft guidance, stakeholders speculated about the minimum evidence base and format for presenting supporting documentation about PRO instruments. Speculation is no longer required, as the guidance provides an appendix in which the contents of an evidence dossier are made explicit. This is one of the most significant changes in the guidance, which now provides practical advice about the presentation of evidence that the FDA will expect to review.
The FDA guidance provides a set of guiding principles to assist industry sponsors in the appropriate selection, development, validation, implementation, and analysis of PRO instruments in clinical trial programs. The guidance has changed in many ways (in response to the concerns and issues raised by numerous stakeholders) to provide clearer messages for researchers, which emphasize the sound scientific principles by which PRO instruments should be developed and implemented. The guidance offers a structured understanding of FDA standards for integrating PRO effectiveness endpoints in clinical trials, particularly to support future labeling claims. Thus, by emphasizing good science but also providing greater clarity about the evidence required for FDA review, we believe that the FDA has simultaneously raised the stakes and improved the odds of securing label claims based on PROs.
Clearly, the submission of a PRO evidence dossier requires a huge volume of supporting work. For some, the language of the FDA guidance may be daunting and the requirements may appear to present insurmountable challenges (to the point that some may be discouraged from using PROs at all). However, this may be a good thing if it precludes the unscientific practice of last-minute inclusion of any vaguely relevant PRO instrument. If pharmaceutical companies believe in the PRO claim to be made for their compound, they need to proactively plan for it as early as Phase I/II. The guidance emphasizes the need to formulate a clear strategy for the inclusion of PROs in clinical trial programs, just as for other clinical endpoints. Interestingly, it is widely speculated that the FDA will soon be turning its attention to the validity/reliability of clinician-reported outcomes, a decision which may not be popular but certainly promises to bring greater scientific rigor to outcomes that are often regarded as more robust than PROs.
Most importantly, the guidance gives pre-eminence to content validity when evaluating the suitability of a PRO instrument and emphasizes the importance of specifying the role of the PRO instrument among the range of trial/research endpoints. Fundamentally, these are good scientific practices that can and should be adopted by industry to ensure that outcomes reported from the patient perspective are both robust and meaningful.
Jane Speight,* PhD, is Director of Research, e-mail: email@example.com, and Shalleen Barendse, PhD, is a Health Psychology Consultant at AHP Research, Brunel Science Park, Kingston Lane, Uxbridge, UB8 3PQ.
*To whom all correspondence should be addressed.
1. US Department of Health and Human Services FDA Center for Drug Evaluation and Research, US Department of Health and Human Services FDA Center for Biologics Evaluation and Research, US Department of Health and Human Services FDA Center for Devices and Radiological Health, "Guidance for Industry: Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims," 2009, http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM193282.pdf.
2. US Department of Health and Human Services FDA Center for Drug Evaluation and Research, US Department of Health and Human Services FDA Center for Biologics Evaluation and Research, US Department of Health and Human Services FDA Center for Devices and Radiological Health, "Guidance for Industry: Patient Report Outcome Measures: Use in Clinical Medical Product Development to Support Labeling Claims: Draft Guidance," HQLO 2006; 4 (79).
3. Alan Shields, Chad Gwaltney, Brian Tiplady, Jean Paty, and Saul Shiffman, "Grasping the FDA's PRO Guidance: What the Agency Requires to Support the Selection of Patient-Reported Outcome Instruments," Applied Clinical Trials, August 2006, 69-72, 83.
4. Keith Wenzel, Bill Byrom, and David Stein, "Putting the 'e' in FDA's Draft PRO Guidance," Applied Clinical Trials, March 2007, supplement, 12-16.
5. S. J. Coons, C. J. Gwaltney, R. D. Hays, J. Lundy, J. Sloan, D. A. Revicki, et al., "Recommendations on Evidence Needed to Support Measurement Equivalence Between Electronic and Paper-Based Patient-Reported Outcome (PRO) Measures: ISPOR ePRO Good Research Practices Task Force Report," Value in Health, 12 (4) 419-429 (2009).
6. M. Rothman, L. Burke, P. Erickson, N.K. Leidy, D. Patrick, and C.D. Petrie, "Use of Existing Patient-Reported Outcome (PRO) Instruments and Their Modification: the ISPOR Good Research Practices for Evaluating and Documenting Content Validity for the Use of Existing Instruments and Their Modification PRO Task Force Report," Value in Health, 12 (8) 1075-1083 (2009).
7. M. Caron, M.P. Emery, P. Marquis, E. Piault, J. Scott, "Recent Trends in the Inclusion of Patient-Reported Outcome (PRO) Data in Approved Drugs Labeling by the FDA and EMEA," PRO Newsletter, (40) 2008.
8. Committee for Medicinal Products for Human Use, "Reflection Paper on the Regulatory Guidance for the Use of Health-Related Quality of Life (HRQL) Measures in the Evaluation of Medicinal Products," 2005, www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003637.pdf.
9. D. Wild, S. Eremenco, I. Mear, M. Martin, C. Houchin, M. Gawlicki, et al., "Multinational Trials—Recommendations on the Translations Required, Approaches to Using the Same Language in Different Countries, and the Approaches to Support Pooling the Data: The ISPOR Patient-Reported Outcomes Translation and Linguistic Validation Good Research Practices Task Force Report," Value in Health 12 (4) 430-440, (2009).