Miniaturization of sensors and circuitry has driven a rapid proliferation of wearable and external monitoring devices for health and wellness. Examples include cardiac and ECG monitoring devices and continuous sleep and activity monitors. Activity monitors and their associated apps and software are growing in popularity among those wanting to improve fitness or manage weight through regular exercise regimens. While some wearables are occasionally used to monitor specific patient groups in routine healthcare, the personal use of these devices as consumer products is an interesting phenomenon.
Over the years, healthcare professionals have needed to adapt to a more highly informed patient population due to improvements in the availability of accessible healthcare information through the Internet. Today, some patients will supplement this knowledge with self-monitoring data, providing an opportunity and a challenge for the treating healthcare professional. We see the same opportunities and similar challenges in the use of wearables in clinical trials, and this article gathers some of these with the aim of provoking further discussion.
We can identify some of the challenges in understanding and interpreting the data collected from wearables by examining our experience of Holter monitoring in clinical trials and healthcare. Norman Holter pioneered this innovation to continuously monitor cardiac activity in 1949, and clinical use of Holter monitors began in the early 1960s. Continuous ECG data collected using Holter monitors presents two immediate challenges:
- How do we identify valid (clean) data?
- How do we summarize the data to identify pertinent measures or segments of interest?
Holter data often contains spurious artefacts and noise caused by, for example, poor electrode attachment, muscular activity, and interference from other electromagnetic devices. Software algorithms combined with human inspection of the data can be used to remove these segments, leaving a clean trace for clinical examination. In terms of summarizing the data, Holter data is not commonly used to replace a resting 12-lead ECG in measuring the parameters describing the heart’s activity. Rather, it is used to detect rhythm disturbances that occur infrequently and would be hard to detect in a shorter recording. Because of this, summarizing Holter data usually involves detecting segments of the continuous trace that indicate possible rhythm abnormalities, and providing these for interpretation by a clinician.
This is perhaps where the similarity with Holter monitoring ends. In many cases, wearables let us collect valuable outcomes data that is of interest in its entirety, rather than only for the presence or absence of certain signals. This brings new challenges, including knowing what to do if the patient has not worn the device for the full time period expected. In this article, we will examine these challenges, focusing primarily on activity data, which will be illustrated with data collected through personal use of an ActiGraph activity monitor [1].
Patients using wearable monitors at home do so in uncontrolled conditions. Because of the lack of control and context, the meaning of the data may be difficult to interpret in some situations. For example, night-time awakenings are an important patient reported outcome in asthma treatment. Sleep monitors provide an ideal methodology to measure objectively the number of night-time awakenings experienced, which will likely be more accurate than patient recall in the morning. However, a sleep monitor is likely to over-report the outcome of interest, as it is unable to differentiate awakenings due to asthma symptoms and awakenings for other reasons.
With activity, gold standard clinical assessments exist, such as the six-minute walk test (6MWT) in COPD. In a 6MWT, the distance that a patient can walk in six minutes is recorded, either using a treadmill or an empty corridor circuit. This gives a controlled assessment of functional capacity, which we use as an indicator of the level of activity a patient can achieve. However, I would argue that it would be better to measure first-hand, with an activity monitor, the level of activity the patient actually does achieve. In doing so out of clinic, we lose a complete picture of the context of the data measured. Activity levels can be affected by uncontrolled variables, such as environmental and social factors, that we cannot account for. That said, the ability to collect far more data might offset the additional random error associated with this approach.
Defining valid data
As we consider activity data, and any continuously recorded monitoring data, we need to understand that patients may not comply exactly with our requirements to wear the device for a specific period of the day and for a defined number of days. Given that, thought must be given to how we define valid data.
Defining periods of wear and non-wear
Using activity monitors as an example, non-wear time can be identified by a period of zero activity. This period of zero activity needs to be longer than the likely period of time a patient can sit still when wearing the device. The length of the inactive period used to define non-wear will therefore depend upon the patient population. For example, a period of 30 minutes of zero activity could be considered appropriate for children to identify when the device was not worn. In adults, due to inherently lower activity and lifestyle differences, this period may need to be increased to perhaps 60 minutes. These thresholds may be modified further based on the characteristics of the indication affecting the patient group studied. Selecting the appropriate threshold will ensure accurate identification of periods of non-wear and differentiate these from periods when the patient was not active while wearing the device.
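The non-wear rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming one count value per minute; the function name `find_non_wear` and the default of 60 minutes are assumptions made for the example, and the threshold would need to be chosen for the population studied.

```python
from typing import List, Tuple


def find_non_wear(counts: List[int], min_zero_minutes: int = 60) -> List[Tuple[int, int]]:
    """Return (start, end) minute indices, end exclusive, of each run of
    zero counts that lasts at least min_zero_minutes."""
    periods = []
    run_start = None  # index where the current run of zeros began
    for i, c in enumerate(counts):
        if c == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_zero_minutes:
                periods.append((run_start, i))
            run_start = None
    # Close a run of zeros that extends to the end of the recording.
    if run_start is not None and len(counts) - run_start >= min_zero_minutes:
        periods.append((run_start, len(counts)))
    return periods


# Hypothetical recording: 10 active minutes, 60 still, 5 active, 20 still, 5 active.
example = [5] * 10 + [0] * 60 + [3] * 5 + [0] * 20 + [7] * 5
adult_non_wear = find_non_wear(example, min_zero_minutes=60)
```

With the 60-minute rule only the first still period is flagged as non-wear; the 20-minute period is treated as the wearer sitting still.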
Defining a valid wear-day
Consideration should be given to the definition of a “valid day.” In addition to defining the expected period of wear each day within the study protocol, it is important to define the minimum proportion of that period that must be recorded for the day to count as valid. Note that both including and dropping days with low wear-time may bias estimates of overall activity, by under- or over-estimating the true activity level. It may be more appropriate to impute missing data in some way so that activity across a standardized day can be measured.
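As a rough sketch of a valid-day rule, the check below compares observed wear time with a protocol-defined expectation. The 12-hour expected wear period and the 80% minimum are hypothetical values chosen only for illustration, as are the function names.

```python
from typing import Sequence


def wear_minutes(worn: Sequence[bool]) -> int:
    """Count the minutes flagged as worn (one flag per minute)."""
    return sum(1 for w in worn if w)


def is_valid_day(worn: Sequence[bool],
                 expected_minutes: int = 720,   # assumed protocol wear period (12 h)
                 min_fraction: float = 0.8) -> bool:
    """A day is valid if wear time reaches min_fraction of the expected period."""
    return wear_minutes(worn) >= min_fraction * expected_minutes


# Hypothetical days: 10 hours of wear versus 8 hours of wear.
good_day = [True] * 600 + [False] * 840
short_day = [True] * 480 + [False] * 960
```

Here `good_day` passes (600 of the 576 minutes required) while `short_day` does not.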
Defining the number of wear-days required
Research shows that in adults, a monitoring period of three to five days is needed to estimate habitual activity [2]. It is clear, however, that in many patient groups there may be substantial differences between activity on weekend days and activity on weekdays. Consider the data I collected to illustrate this in figure 1. Graph (a) shows activity on a typical weekend day, and graph (b) on a weekday. The overall level of activity is substantially lower on the weekday as a large period of the day was spent at work in the office. In addition, the level of exertion recorded differed between weekend and weekday, as I was able to spend more time running and walking vigorously during the weekend than on weekdays. It would seem sensible, therefore, to ensure that the wear-days included in our analysis contain a mix of weekday and weekend days, so that the different behaviors in each are captured and accounted for. However, in an elderly (retired) or non-working patient population, the difference in activity between weekend and weekday may be less important.
Figure 1. Example activity profile on two days of the week: (a) weekend, (b) weekday.
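A wear-day requirement that insists on a mix of weekday and weekend days could be checked as follows. The minimum numbers used here (four days in total, at least one weekend day and one weekday) are assumptions for the sketch, not recommendations.

```python
from datetime import date
from typing import Iterable


def has_required_days(valid_days: Iterable[date],
                      min_days: int = 4,
                      min_weekend_days: int = 1,
                      min_weekdays: int = 1) -> bool:
    """Check that the valid wear-days include enough days overall and
    enough of each day type."""
    days = set(valid_days)  # ignore accidental duplicates
    weekend = sum(1 for d in days if d.weekday() >= 5)  # Saturday=5, Sunday=6
    weekday = len(days) - weekend
    return (len(days) >= min_days
            and weekend >= min_weekend_days
            and weekday >= min_weekdays)


# Hypothetical records; 1 January 2024 was a Monday.
weekdays_only = [date(2024, 1, d) for d in (1, 2, 3, 4)]
mixed = [date(2024, 1, d) for d in (1, 2, 3, 6)]
```

For a retired or non-working population, `min_weekend_days` could simply be set to zero.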
Dealing with missing data
When handling continuous monitoring data we may need to make assumptions and determine rules for dealing with missing data resulting from periods of non-wear. When it comes to missing days, if we assume that missing days are missing completely at random, then estimating activity for these days based on the activity observed on complete days may be reasonable. In doing so, we need to assume that activity is similar on all days—but as we have seen in my example (figure 1), it is possible that weekend and weekday activity distributions may vary, and so this may need to be accounted for if estimating the total weekly activity. However, we should be aware that non-wear might not be a random event. A user may choose not to wear the device on a particular day because they are feeling unwell and inactive, or because they are participating in a physical activity that requires the device to be removed (e.g. swimming).
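One simple way to account for weekday and weekend differences when some days are missing is to average each day type separately and then rebuild a standard week. The sketch below assumes days are missing completely at random within each day type, which, as noted above, may not hold; the function name and the data are hypothetical.

```python
from datetime import date
from statistics import mean
from typing import Dict, Optional


def estimate_weekly_counts(daily_totals: Dict[date, float]) -> Optional[float]:
    """Estimate total weekly counts as 5 x mean weekday total plus
    2 x mean weekend total, using only the days that were observed."""
    weekday = [v for d, v in daily_totals.items() if d.weekday() < 5]
    weekend = [v for d, v in daily_totals.items() if d.weekday() >= 5]
    if not weekday or not weekend:
        return None  # cannot estimate a day type that was never observed
    return 5 * mean(weekday) + 2 * mean(weekend)


# Hypothetical valid days: Monday, Tuesday and Saturday of the first week of 2024.
observed = {
    date(2024, 1, 1): 200000.0,
    date(2024, 1, 2): 220000.0,
    date(2024, 1, 6): 400000.0,
}
weekly_estimate = estimate_weekly_counts(observed)
```

Returning `None` when a day type is absent makes the gap explicit rather than silently borrowing weekday activity to stand in for weekend activity.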
Similar rules need to be made for missing data within a valid day; it is desirable to impute missing data so that we can estimate activity across a time period that is standardized for all days and patients. Scientific literature proposes a number of approaches to dealing with such missing data (see reference 3 for example).
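To make the idea of a standardized day concrete, the sketch below scales the activity observed during wear time up to a fixed day length. This is a deliberately naive rate-based adjustment, not the method proposed in reference 3, and it assumes that activity during non-wear would have matched the average observed rate. The 720-minute standard day is a hypothetical choice.

```python
from typing import Optional, Sequence


def standardized_daily_counts(counts: Sequence[float],
                              worn: Sequence[bool],
                              standard_minutes: int = 720) -> Optional[float]:
    """Scale counts observed while the device was worn to a standard day length."""
    observed = sum(c for c, w in zip(counts, worn) if w)
    wear = sum(1 for w in worn if w)
    if wear == 0:
        return None  # nothing observed, so nothing to scale
    return observed / wear * standard_minutes


# Hypothetical day: 600 worn minutes at 100 counts per minute, then 120 minutes of non-wear.
day_counts = [100] * 600 + [0] * 120
day_worn = [True] * 600 + [False] * 120
standardized = standardized_daily_counts(day_counts, day_worn)
```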
Determining pertinent summary statistics as outcome measures
Accelerometers provide a variety of variables associated with activity. The most basic measure is counts, usually represented as number of counts per minute (as used in figure 1). Unlike pedometers that measure the presence of a movement—recorded as a step—accelerometers measure both the presence and force of the movement, and this is summarized as counts. Accelerometers also often report energy expenditure in the form of kcals and METs. One MET represents the amount of energy the human body requires at rest. Both of these derived variables are based on correlations with energy expenditure experiments.
There are many ways in which continuous activity data can be summarized to create relevant summary statistics, and in selecting these we need to consider the patient population and the objectives of the study. Translating counts into METs or kcal expenditure may not be accurate in patients with compromised fitness, such as those with COPD, as the calibration equations used are normally based on relating activity to energy use in healthy subjects. Compromised patients may in fact expend more energy than healthy subjects when completing the same activity.
Increased mobility in compromised patients may be better shown in the duration of bouts of gentle activity, such as walking. The total counts per day may not be sensitive to these sorts of subtle changes, but measures such as the number of times a patient walked, the duration of walking achieved, and the number of walking episodes exceeding a defined duration (e.g. 60 s) may be more sensitive to change and improvement. The objective here is to identify activities the subject has chosen to undertake, as opposed to routine activity required through necessity, such as using the restroom.
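The bout-based measures described above can be sketched as follows. The example assumes that a bout is a run of consecutive epochs whose counts reach an activity threshold; the 10-second epoch, the threshold of 100 counts and the names used are hypothetical, and a real analysis might identify walking from step data instead.

```python
from typing import NamedTuple, Sequence


class BoutSummary(NamedTuple):
    n_bouts: int        # number of activity bouts
    total_seconds: int  # total time spent in bouts
    n_long_bouts: int   # bouts lasting at least min_long_seconds


def summarize_bouts(counts: Sequence[int],
                    epoch_seconds: int = 10,
                    active_threshold: int = 100,
                    min_long_seconds: int = 60) -> BoutSummary:
    """Summarize runs of consecutive epochs at or above active_threshold."""
    durations = []
    run = 0  # length of the current bout, in epochs
    for c in counts:
        if c >= active_threshold:
            run += 1
        elif run:
            durations.append(run * epoch_seconds)
            run = 0
    if run:  # bout still in progress at the end of the recording
        durations.append(run * epoch_seconds)
    long_bouts = sum(1 for d in durations if d >= min_long_seconds)
    return BoutSummary(len(durations), sum(durations), long_bouts)


# Hypothetical trace: an 80-second bout followed by a 30-second bout.
trace = [0] * 3 + [150] * 8 + [0] * 2 + [150] * 3 + [0]
summary = summarize_bouts(trace)
```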
In healthier subjects, total counts per day may be a useful measure, but there are also other valuable summary measures that may be sensitive indicators of change in activity. A good example is the time spent in different levels of exertion, such as moderate physical activity and vigorous physical activity. Cut points based on the counts per minute recorded can be defined that relate to different levels of activity (see reference 4 for example, based on treadmill data). Figure 1 illustrates this for cut points defining moderate and vigorous activity. The time spent in different states of activity such as sedentary, moderate, vigorous and very vigorous activity can then be calculated (table 1). These may provide a more sensitive measure of increased fitness or activity than a measure of total activity or energy expenditure over the day.
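Time spent at each level of exertion can be tallied by comparing each minute's count with a set of cut points. In the sketch below the numeric cut points and the inclusion of a "light" level are assumptions for illustration, not values taken from this article; in practice they would come from a calibration study such as reference 4 and should be verified for the device and population used.

```python
from bisect import bisect_right
from typing import Dict, Sequence

# Assumed lower bounds (counts per minute) of each level above sedentary.
CUT_POINTS = [100, 1952, 5725, 9499]
LEVELS = ["sedentary", "light", "moderate", "vigorous", "very vigorous"]


def minutes_per_level(counts_per_minute: Sequence[int]) -> Dict[str, int]:
    """Count the minutes falling into each intensity level."""
    tally = {level: 0 for level in LEVELS}
    for cpm in counts_per_minute:
        # bisect_right returns how many cut points are <= cpm,
        # which is the index of the matching level.
        tally[LEVELS[bisect_right(CUT_POINTS, cpm)]] += 1
    return tally


# Hypothetical seven minutes of data.
profile = minutes_per_level([0, 50, 100, 2000, 6000, 10000, 1952])
```

Because each epoch here is one minute, the tallies are directly minutes per level; with a different epoch they would need to be multiplied by the epoch length.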
For application in clinical trials, it is important to consider the summary outcome measures that should be calculated. These will be selected based on knowledge of the patient population and how improvements and changes in mobility and activity might be observed.
Conclusion: The need for standards
There is no doubt that wearables offer the industry enormous potential to measure and learn more about clinical trial participants. Understanding activity should be far more informative when measured in the real world as opposed to via a fitness test in clinic. There has been much academic work published on how to define, clean, and summarize data collected using wearable devices, such as activity monitors, and this should be the basis of identifying standards and gaps in knowledge that may need further investigation.
It’s critical that our industry collaborates and works with the academic and regulatory communities to define standard approaches for using devices in clinical trials and for specific indications and patient groups. For activity assessment, these standards should include:
- Where the activity monitor should be located (e.g. wrist, arm, waist, or ankle)
- Sampling interval (epoch), e.g. 30 s or 1 min
- The number of days of measurement required, and how often measurement periods should occur during a trial
- Definition of non-wear episodes
- Definition of a valid day, if appropriate
- How to handle missing data—both missing days and missing periods within a day
- Relevant meaningful and sensitive activity summary measures
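One way to make such standards explicit and comparable across studies would be to record them as a single structured definition. The sketch below is purely illustrative: the field names and every value in the example are hypothetical, not proposed standards.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ActivityProtocol:
    """Hypothetical container for the analysis rules listed above."""
    wear_location: str
    epoch_seconds: int
    required_valid_days: int
    measurement_periods: int          # how many monitoring periods in the trial
    non_wear_zero_minutes: int        # run of zero counts treated as non-wear
    min_wear_minutes_per_day: int     # threshold for a valid day
    missing_data_rule: str
    summary_measures: Tuple[str, ...]


# Example values are assumptions for illustration only.
adult_example = ActivityProtocol(
    wear_location="waist",
    epoch_seconds=60,
    required_valid_days=4,
    measurement_periods=3,
    non_wear_zero_minutes=60,
    min_wear_minutes_per_day=600,
    missing_data_rule="scale to standardized day",
    summary_measures=("total counts per day", "minutes in moderate activity"),
)
```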
The above items are likely to vary for different patient groups and disease indications, and so it will be important to define these on a case-by-case basis. Let’s get active in this area and begin to really leverage the potential these devices can offer clinical drug development.
Bill Byrom, Senior Director, Patient Direct, PAREXEL Informatics
References
1. ActiGraph. http://www.actigraphcorp.com/product-category/activity-monitors/
2. Ward DS et al. (2005) Accelerometer Use in Physical Activity: Best Practices and Research Recommendations. Medicine and Science in Sports and Exercise; 37:S582-S588
3. Catellier DJ et al. (2005) Imputation of Missing Data When Measuring Physical Activity by Accelerometry. Medicine and Science in Sports and Exercise; 37:S555-S562
4. Freedson PS, Melanson E, Sirard J. (1998) Calibration of the Computer Science and Applications, Inc. accelerometer. Medicine and Science in Sports and Exercise; 30:777-781