Just How ‘Real’ is Real-World Data?

Addressing sources of tension in differentiating types of data.

A few years ago, many in the life-science industry questioned whether real-world data (RWD) would have any application in drug development. With the growing digitalization of healthcare, it is now clear that RWD has huge potential, especially given the release of FDA guidance concerning the use of electronic health records and medical data for regulatory decision making for drugs and biologics. As the collection and use of RWD naturally evolved during the pandemic, with decentralized clinical trials taking off, it became clear that the industry has only scratched the surface in this area and that RWD is playing a growing role in drug development around the world.

However, real-world data means different things to different people. In general, the term applies to observational data that is collected in a routine clinical care environment, as opposed to data acquired in a more structured setting, like a Phase III clinical trial, where the environment is more controlled. In what is deemed to be the ‘real’ world, treatment is given and monitored in a patient’s natural environment, factoring in the influence of related medications, medical interventions, or concurrent illnesses, as well as the realities of life, such as accidentally missed doses. These external factors will heavily influence treatment outcomes and, therefore, the data generated as the output. It is reasonable, then, to question just how ‘real’ the data is in this setting, and whether it is, in fact, any more ‘real’ than data from a clinical trial setting.

The difference in the data collected in each of these settings really has nothing to do with how ‘real’ it is; data on a case report form designed by the clinical trial sponsor is just as ‘real’ as a clinical note obtained from a patient’s bedside chart to document their response to treatment in routine care. To better illustrate the possible confusion caused by the use of the term ‘real,’ we shall now look at five variables that are sources of tension between the more “natural” and the more “artificial” data.

1. Type of study: RWD (safety) is systematically collected in pharmacovigilance (PV) programs after the first approved indication is obtained and for as long as the drug is marketed, to supplement Phase III clinical trial safety evidence from registration trials and provide cumulative information on the evolving risk-benefit profile of the new treatment post launch. RWD (safety and efficacy) can also be collected to support label changes for products already approved, for example, use in a new population (different age group or different disease severity), a change in dose or route of administration, or the addition of a new indication (repurposing of existing medications). Finally, RWD can also be collected to understand the impact of each new drug-indication combination on health economics and outcomes. So, despite RWD being a more commonly used term in routine clinical care, these are some obvious examples of how it is collected and used in combination with clinical studies and for regulatory decision making.

2. Type of data source: Case report forms (CRFs) and their electronic versions (eCRFs) are not typically viewed as RWD, since they are not routinely collected by accountable care organizations (ACOs) and hospitals in general. Electronic health records managed by ACOs are considered RWD for the opposite reason. Data captured from wearables at the edge of the IoT network with experimental protocols cannot yet be considered as routine, even when the wearables are already approved by the regulators for their use in the study, but the situation is changing as a result of their increased adoption in routine care. So, the acceptance of whether data is considered as RWD, or ‘real,’ is not really linked to how ‘real’ it is, but to how routine it is. There is a big difference between the two terms.

3. Location of data capture: The pandemic accelerated the adoption of decentralized clinical trials (DCTs), which have replaced traditional clinic visits with remote visits or home visits by mobile health care professionals (HCPs). Here, the “realness” aspect has to do with a reduction in participant burden and keeping one’s daily activities as unchanged as possible. Accordingly, this type of data collection can be viewed as more “natural,” but there is a limit to this, as training on mobile applications and internet competence may cause an increase in participant burden or restrict the target population to tech-savvy participants, making the data less ‘real,’ or, in other words, less reflective of the total patient population.

4. Departures from routine clinical care: Even though prospective real-world evidence (RWE) studies are playing a growing role in healthcare decision-making, it is still unusual for them to be solely and strictly based on routine clinical care. Typically, additional data needs to be collected that increases the burden on participants and HCPs, but to a lesser extent than in conventional randomized clinical trials (RCTs). There is a fine balance between minimizing the burden on patients and HCPs and leveraging patients' willingness to provide informative data. Randomized assignment of treatment alternatives is never done in routine clinical care, so data from randomized studies could be considered, again, to simply be less representative, rather than real or not real.

5. Type of data consumer: For regulators, data quality and assurance requirements for RWE studies should be in line with standard expectations for conventional RCTs. Randomized treatment assignment remains their preferred design, as it avoids physicians’ treatment allocation bias. Marketing and sales groups, however, which are the largest consumers of HEOR (Health Economics and Outcomes Research) studies, do not have to adhere to those standards. So again, the ‘realness’ of the data could be questioned by both of these audiences.

In conclusion, the reality is that almost any data generated could be called ‘real’ (especially when you take into account that the opposite of real is fake), whether it is collected in a randomized clinical trial or in a routine care environment. However, it would usually sit on a scale of ‘realness’ depending on its origination, use, and the five variables discussed here. So, in my opinion, the term real is a little misleading in the phrases RWD and RWE, as are the current limitations on when they are commonly used. All data is real; just how real is a matter of context and individual interpretation.

There is an unavoidable trade-off between minimizing sources of bias and introducing design features that depart from routine clinical care, thus impacting how ‘real’ data might be perceived to be. The result is that RWE is very rarely based on RWD that have not been subjected to some degree of scientific “adulteration” to facilitate the interpretation of the results.

Pierre Etienne, CMO and Co-founder, Actu-Real, Inc.