© 2023 MJH Life Sciences™ and Applied Clinical Trials Online. All rights reserved.
Examining the two areas of weakness cited in FDA draft guidance.
Recent FDA draft guidance on core patient-reported outcomes (PROs) in oncology clinical trials1 lays out a framework for effective patient-reported outcome measure (PROM) selection and implementation. This may be important in increasing the use of PRO data in oncology medication labeling in the US, where, historically, labeling claims based on PRO data have rarely been included. The reporting of PROM endpoints on drug labeling is valuable to both the patient and the prescribing physician in informing risk/benefit discussions beyond tumor response and survival endpoints. The lack of PRO-related labeling is clearly illustrated by the fact that only three out of 85 drug approvals in oncology submitted to FDA between 2010 and 2016 included PROM-related labeling claims.2,3 While this may be related to trial design limitations (e.g., single-arm and open-label studies), it may also reflect weaknesses in clinical outcome assessment (COA) measurement strategies typically used in oncology research.
FDA’s draft guidance addresses two areas of potential weakness in current COA measurement strategies for oncology trials.
In oncology trials, PROMs are often implemented at clinic visits just ahead of starting each cycle of treatment. While administratively convenient, this is the time at which patients are feeling well enough to receive the next cycle of treatment, so this strategy may fail to measure the impact of treatment in the earlier stages of each cycle, when treatment-related side effects are likely to be most pronounced. Measuring only at cycle start may therefore give a biased view of treatment toxicity.
The draft guidance stresses the importance of selecting measures that are highly specific so that treatment effects can be understood more granularly. FDA identifies five core measurement domains of interest: disease-related symptoms, symptomatic adverse events, an overall side-effect impact measure (single item), physical function (aspects such as walking, lifting, and reaching that are considered important for independent functioning), and role function (impact of a treatment on the ability to work and carry out daily activities). Not all commonly used instruments produce scores or subscores with the level of specificity required to assess these core areas separately.
In this article, we explore these two areas in more detail.
The draft guidance recommends more frequent measurement during initial cycles, with fewer assessments later on (see Figure 1 below). In particular, FDA indicates that symptomatic adverse events, the overall impact of side effects, and physical function should be measured frequently at the start of treatment, before moving to less frequent cadences later in the treatment and subsequent follow-up periods. This schedule reduces the chance of missing the incidence and severity of events in these domains, as can happen when measuring only at the start of new treatment cycles.
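To make the "frequent early, less frequent later" idea concrete, here is a minimal Python sketch of such a schedule. The specific cadences (weekly for the first eight weeks, then every four weeks up to week 24) and the function name are illustrative assumptions only; they are not taken from the draft guidance or from Figure 1, and a real protocol would define its own values.

```python
from datetime import date, timedelta


def assessment_dates(start, frequent_weeks=8, frequent_every_days=7,
                     total_weeks=24, later_every_days=28):
    """Return assessment dates that are dense early and sparser later.

    All default cadences are hypothetical placeholders, not FDA values.
    """
    dates = []
    day = 0
    frequent_end = frequent_weeks * 7
    end_day = total_weeks * 7
    while day <= end_day:
        dates.append(start + timedelta(days=day))
        # Short step during the initial period, long step afterwards.
        day += frequent_every_days if day < frequent_end else later_every_days
    return dates


schedule = assessment_dates(date(2024, 1, 1))  # hypothetical treatment start
for d in schedule:
    print(d.isoformat())
```

Keeping the cadences as parameters makes it easy to compare alternative schedules against expected patient burden before fixing them in a protocol.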
This increased frequency of assessments raises several practical questions:
We examine these briefly ahead:
It would be burdensome for the patient if we implemented the full 30-item EORTC QLQ-C30 questionnaire (a frequently used general oncology PROM) on a weekly basis when we only need to measure the physical function subscale (five items). However, discarding the other items within an instrument might impact the psychometric properties of the subscale, so it will always be important to verify that this approach is acceptable and valid with the instrument owner.
In the case of the QLQ-C30, the subscales are valid for independent use as discrete “item lists.” Therefore, referring to the schedule of assessments in Figure 1, the five physical function items from the QLQ-C30 could be applied independently on a weekly basis to enable more frequent assessment of this core domain. When doing so, it will be important to include the complete set of physical function subscale items, present them in the same order as within the full instrument, and apply the same scoring and missing data rules as defined for the subscale within the full instrument.
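As an illustration of applying the same scoring and missing data rules to the stand-alone subscale, the sketch below scores the five physical function items as we understand the published EORTC scoring manual: items are answered 1 to 4, the raw score is the mean of the answered items, the result is transformed linearly to a 0-100 scale on which higher means better functioning, and the scale is scored only if at least half of the items were answered. The function name is our own, and the rules should be verified against the current manual and with the instrument owner before use.

```python
from typing import Optional, Sequence


def score_physical_function(
    responses: Sequence[Optional[int]],
) -> Optional[float]:
    """Score the five QLQ-C30 physical function items on a 0-100 scale.

    `responses` holds the five items in instrument order; None marks a
    missing answer. Returns None when the scale cannot be scored.
    Assumed rules: see the lead-in; check them against the scoring manual.
    """
    if len(responses) != 5:
        raise ValueError("expected exactly five item responses")
    answered = [r for r in responses if r is not None]
    if any(r not in (1, 2, 3, 4) for r in answered):
        raise ValueError("item responses must be 1, 2, 3, or 4")
    # At least half of five items means at least three answered items.
    if len(answered) < 3:
        return None
    raw = sum(answered) / len(answered)
    item_range = 3  # difference between the maximum (4) and minimum (1)
    return (1 - (raw - 1) / item_range) * 100


print(score_physical_function([1, 1, 1, 1, 1]))     # 100.0
print(score_physical_function([4, 4, 4, 4, 4]))     # 0.0
print(score_physical_function([1, None, None, None, 2]))  # None
```

Returning None rather than a number for an unscorable assessment keeps missing data explicit in downstream analyses.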
While the draft guidance states that “methods to lessen patient burden should be explored, including use of electronic PRO capture that may allow for assessments outside of the clinic,” the more frequent assessment schedule may be perceived as burdensome when the effects of treatment are particularly compromising. Could this lead to increased missing assessments? In a qualitative interview study of oncology patients, participants identified certain days within typical treatment cycles as the most difficult, with one patient saying: “Days three to six were my ‘dark’ days, and I did not leave my bed, speak to anyone, or even eat any food.”4
Despite this, participants in this qualitative study did not feel that this would deter or inhibit them from completing PROM instruments on electronic devices. For example, one patient stated, “You can be bad, but not so bad that you can’t use nothing at all...I would make the effort.”4
Being sensitive to this is important, and we should consider guiding principles for ePRO implementation to limit burden and aid completion rates, including:
In oncology trials, it is common for physicians to delay treatment when a patient has not recovered sufficiently from the previous cycle. When collecting PROMs at the site, this typically means postponing PROM completion to coincide with the new cycle start date. However, with the change in focus to more frequent measurement, we suggest that trial objectives are most likely to be met when the measurement schedule is not adjusted to follow changes in the start dates of subsequent treatment cycles, especially when the period of frequent measurement is long enough to comfortably cover several cycles.
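The difference between the two anchoring strategies can be sketched in a few lines of Python. The 21-day cycle length, the seven-day delay, the dates, and the function names are all hypothetical values chosen for illustration; they do not come from the draft guidance.

```python
from datetime import date, timedelta


def calendar_anchored(first_dose, n_weeks):
    """Weekly assessments counted from the first dose, ignoring cycle delays."""
    return [first_dose + timedelta(weeks=w) for w in range(n_weeks)]


def longest_gap_days(dates):
    """Largest number of days between consecutive assessments."""
    return max((b - a).days for a, b in zip(dates, dates[1:]))


first_dose = date(2024, 1, 1)  # hypothetical first treatment date
planned_cycles = [first_dose + timedelta(days=21 * i) for i in range(3)]

# Hypothetical scenario: cycle 2 is delayed by a week, pushing cycle 3 too.
delay = timedelta(days=7)
actual_cycles = [planned_cycles[0]] + [d + delay for d in planned_cycles[1:]]

# Strategy A: assess only at each actual cycle start (moves with delays).
cycle_anchored = actual_cycles
# Strategy B: assess weekly from first dose, regardless of delays.
weekly = calendar_anchored(first_dose, n_weeks=8)

print(longest_gap_days(cycle_anchored))  # 28
print(longest_gap_days(weekly))          # 7
```

In this scenario the cycle-anchored approach leaves a four-week stretch with no data, which is exactly the recovery period that prompted the delay, while the calendar-anchored schedule continues to capture it.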
Separate and independent assessment of each of the five core domains requires careful appraisal and selection of the measures to be used. To measure each domain using a single scale, or a subscale of a broader instrument, the questions should all relate to the domain and be comprehensive enough to encompass all the meaningful elements of that domain.
While this sounds straightforward, nuances in the definitions of domains measured by different PROMs mean that researchers should carefully inspect instruments to ensure the FDA-defined domains are accounted for. Let’s consider two examples.
The NSCLC-SAQ is a seven-item symptom assessment questionnaire for non-small cell lung cancer (NSCLC)5, developed specifically to assess disease-related symptoms across five concept areas (see Figure 2 below).
The qualitative work used to identify the set of disease-related symptoms that are meaningful to NSCLC patients6 provides strong evidence of the measure’s relevance and content validity. The measure only focuses on disease-related symptoms, so in addition to content validity, it also satisfies the requirement for specificity. For this reason, it is not surprising that this instrument was cited as an example of a suitable instrument to measure disease-related symptoms in the FDA draft guidance.1
The Functional Assessment of Cancer Therapy - Breast (FACT-B) is a 37-item questionnaire that is used to measure five domains of health-related quality of life in breast cancer patients: physical well-being, social/family well-being, emotional well-being, functional well-being, and a breast cancer subscale.7 The naming of these domains differs from FDA core domains, and they represent slightly different concepts. For example, physical well-being is not the same as physical function.
If we examine the subdomain of functional well-being, we find that the first three items of the subscale measure one’s ability to work and whether that work is fulfilling, as well as one’s ability to enjoy life. Some of these map to FDA-defined domains—for example, ability to work may be a component of a measure of role function. However, the ways that the current subscales and their scoring are defined do not facilitate specific measures of one or more core domains as identified by FDA.
However, the FACT instruments are well validated and widely used. With that in mind, it may be possible and desirable to reorganize their subscale structure to measure FDA-defined core domains. This takes us to considerations around instrument adaptation.
As described, to meet FDA draft guidance around specificity to measure the five core domain areas, it is important to select PROMs carefully. In some cases, it may not be as simple as implementing an instrument in its off-the-shelf format. Instead, it may require some degree of adaptation. We discuss two examples ahead.
As described, while the subscale construction of the FACT-B does not appear to map directly to the FDA’s core domains, the instrument and its items are well validated. To satisfy the draft FDA guidance, researchers may consider working with the scale author to validate an alternative subscale based on existing items within the instrument. As a hypothetical example, we might consider developing a specific measure of disease-related symptoms by grouping items from the physical well-being subdomain (e.g., lack of energy, pain, and nausea) along with items from other subdomains (e.g., shortness of breath, swelling/tenderness of the arms). This would require working with the scale author and evaluating the psychometric properties of the new subscale.
Item banks are databases of individual validated PROM items. It is possible to use item banks to assemble tailored PROMs more rapidly for specific populations and measurement domains. Most applicable to oncology research are the PROMIS (Northwestern University, IL) and EORTC item banks. The latter, for example, contains around 1,000 items, with many translated into up to 100 languages.8
A good illustration of this approach is the use of the EORTC item bank to supplement items in the EORTC’s core instrument (QLQ-C30), to ensure full coverage of all important items in order to measure drug benefit in a set of rare hematological stem cell disorders.9 Instrument developers identified the symptoms and impacts important to patients with these diseases, and used the EORTC item bank to map these to concepts that were not contained within the QLQ-C30. They conducted further testing in patients to confirm the suitability and content validity of these additional selected items. We may see this type of approach more often as we carefully and strategically consider PROM selection for future oncology trials in light of the FDA draft guidance.