Translating Clinical Data


Applied Clinical Trials

Applied Clinical Trials-05-01-2011
Volume 20
Issue 5

How to define information requirements to enhance the understanding of a clinical trial.



With the growing complexity of study design, the multiplying number of trial technologies employed, and the ever-increasing globalization of trials, today's clinical trials create vast volumes of data. Given the fundamental importance of study management, data management, and logistics information to every trial, this article explores how to approach and define information requirements. It draws on theory and practice dating back as far as the 17th century, in a manner that will enhance understanding of trial performance, help identify areas requiring action, and accurately measure the outcome of those actions.

Navigating a sea of data

Paul Allen, the cofounder of Microsoft, once said "for me, goals and daily metrics are the key to keeping me focused. If I don't have access to the right stats, every day, it is so easy for me to move on mentally to the next thing. But if I have quick access to key metrics every day, my creativity stays within certain bounds—my ideas all center on how to achieve our goals."

It is more critical than ever to quickly assimilate mammoth amounts of data to identify the key information required to facilitate effective conduct of a study, identify action areas, and share best practices. As a consequence, many companies are creating data warehouses to minimize the hunt for data in disparate places.

The solution is not as simple as consolidating all data into a single location and expecting it to become clear. Presenting information in data reports at a low, granular level alone will not provide meaningful clarity. Instead, we need to create visual stories that help isolate patterns or trends and provide links between seemingly disconnected information. The concept of visually presenting data is not a new one: Descartes invented the two-dimensional graph in the 17th century, and in the late 18th and early 19th centuries many of the charts in use today were either invented or improved upon by William Playfair.

By presenting data visually, one is able to deliver metrics whose contextual insight and focus allow the user to appropriately decide which components to drill into for more concentrated granular content. This is essential if a user is not to be washed away in a tidal wave of information. This article is intended to discuss the thought process that goes into identifying a good metric as well as touching on some of the rules of visual display to ensure the story in the data is clearly and succinctly delivered.

What is a good metric?

There are a number of guiding principles that should be followed when identifying metrics.

  • It should be simple.

  • It should be a small set that is easily processed and understood. The manner in which the data is communicated is a key component.

  • It should be measured by someone who can act on results.

  • It should be used—why measure if no action will be taken as a result of having the data?

A number of metrics are often taken for granted as being essential to understanding the performance of a clinical study. Some examples include site initiation, time from site activation to the first shipment received at site, first patient first visit (FPFV), recruitment rate, and the time taken to resolve queries. While these are important for helping management see that studies are getting up and running as expected, we need to question whether these metrics truly help those actually responsible for ensuring the successful conduct of the trial. Many of these commonly sought metrics indicate simple statuses regarding what is happening in the study now, so-called "lagging indicators," with limited or no reference to how the study got there or where it might go in the future. What is lacking is the contextual information that gives the metrics their critical value. For example, FPFV identifies that a site has commenced recruitment, but does not reveal whether the site is still actively recruiting. More value would be obtained from the recruitment rate or the interval between recruitment events, which yield further meaningful information to allow effective management of the site.
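The contrast between FPFV and recruitment activity can be illustrated with a minimal Python sketch. The function name, the input shape (a list of enrollment dates per site), and the 30-day normalization are assumptions made for illustration; they are not taken from any particular trial system.

```python
from datetime import date


def recruitment_summary(enrollment_dates, as_of):
    """Summarize recruitment activity for one site (illustrative only).

    enrollment_dates: dates on which subjects were enrolled (hypothetical input).
    as_of: the reporting date.
    """
    if not enrollment_dates:
        return {"fpfv": None, "subjects_per_30_days": 0.0,
                "longest_gap_days": None, "days_since_last": None}
    dates = sorted(enrollment_dates)
    fpfv = dates[0]  # first patient first visit: a one-off status
    days_active = max((as_of - fpfv).days, 1)  # avoid division by zero
    # Gaps between consecutive recruitment events, in days.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "fpfv": fpfv,
        "subjects_per_30_days": round(len(dates) / days_active * 30, 2),
        "longest_gap_days": max(gaps) if gaps else None,
        "days_since_last": (as_of - dates[-1]).days,
    }


# Two hypothetical sites with the same FPFV but different recruitment behavior.
site_a = [date(2011, 1, 10), date(2011, 1, 24), date(2011, 2, 7), date(2011, 2, 21)]
site_b = [date(2011, 1, 10), date(2011, 1, 12)]
report_date = date(2011, 3, 11)

summary_a = recruitment_summary(site_a, report_date)
summary_b = recruitment_summary(site_b, report_date)
print(summary_a)
print(summary_b)
```

Both sites report the same FPFV, yet the rate and the days since the last enrollment show that the second site has stalled, which is the contextual information FPFV alone cannot provide.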

Defining industry standards is still in its infancy, and there is an argument that off-the-peg metrics may not be optimal if used in isolation. However, some valuable information is available to help formulate the thinking, as outlined by Dave Zuckerman.1 His book defines processes by which metrics can be identified and disseminated throughout an organization; to date it is the only book published on this topic within the pharma industry. More recently, the Metrics Champion Consortium (MCC)2 has been developing performance metrics across labs, ECG, imaging, and clinical trial conduct in a collaborative effort across biotechnology, pharmaceutical, medical device, and service provider organizations. Other sources of information include the CMR International 2010 Pharmaceutical R&D Factbook.3

At minimum, metrics should span four areas:

  • Cycle time—a measure of how long an activity takes to complete.

  • Timeliness—a measure of whether an activity was completed on time (i.e., a milestone was met).

  • Quality—a measure of the number of errors in a completed task.

  • Cost (or efficiency)—a measure of the resources required to complete the task.
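As a rough illustration of how the four areas sit side by side, the following Python sketch computes one simple measure per area from a list of completed tasks. The `TaskRecord` fields and the choice of averages are hypothetical; a real study would define each measure to suit the activity being tracked.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TaskRecord:
    """One completed activity (hypothetical fields for illustration)."""
    started: date
    completed: date
    due: date
    errors: int
    hours_spent: float


def four_area_metrics(tasks):
    """Return one simple measure for each of the four metric areas."""
    if not tasks:
        raise ValueError("at least one completed task is required")
    n = len(tasks)
    return {
        # Cycle time: how long the activity took to complete.
        "mean_cycle_time_days": sum((t.completed - t.started).days for t in tasks) / n,
        # Timeliness: whether the activity met its milestone.
        "on_time_fraction": sum(t.completed <= t.due for t in tasks) / n,
        # Quality: errors found in the completed task.
        "errors_per_task": sum(t.errors for t in tasks) / n,
        # Cost (or efficiency): resources required to complete the task.
        "hours_per_task": sum(t.hours_spent for t in tasks) / n,
    }


tasks = [
    TaskRecord(date(2011, 3, 1), date(2011, 3, 5), date(2011, 3, 7), errors=0, hours_spent=6.0),
    TaskRecord(date(2011, 3, 2), date(2011, 3, 12), date(2011, 3, 9), errors=3, hours_spent=10.0),
]
metrics = four_area_metrics(tasks)
print(metrics)
```

Reporting all four values together makes it harder for a fast, on-time result to hide a high error count or an excessive cost.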

Market research indicates that within this industry the majority of metrics quoted as being in common use appear to be focussed on cycle time and timeliness. Without the balance of quality and cost, a true picture of performance cannot be obtained. Anything can be done fast and on time if there is no regard for the cost or extent of errors.

It is also possible that over the duration of the study the importance and role of a metric may change, and the following section addresses optimal ways to approach this evolution along the study life cycle.

Metrics in the workflow

Starting with the information available from the above sources is a good foundation. However, as the study life cycle moves through the stages of inception, design, start-up, subject recruitment, study conduct, close out, and submission, the value and prominence of given information change over time. For example, once all countries and sites are initiated, focus may shift to recruitment, and likewise after all subjects are recruited, focus may shift to data management or retention. This is not to say that start-up information should not be available to the user once that activity is completed, but rather that a user is likely to be more interested in other information. Targeted information addressing the user's primary interest should be the first item that is presented to the user. Defining this process and matching the metric to the current activities within the study help users see what is most important at that particular time.

The best method for identifying the most important metrics is directly asking the users what information is needed to do their job. There are a multitude of roles or "user personas" involved in a clinical trial including clinical study leader, clinical supply manager, statistician, country manager, CRA/monitor, and site personnel. Intimately understanding these roles and the activities each role is required to perform is key to accurately determining users' informational requirements. In this process, focusing on the underlying question of "what problems are you trying to manage?" rather than "what data do you need?" provides a framework for identifying pertinent and valuable metrics. Metrics for a given role or individual should be empowering. Users must be motivated to have a positive impact on their metrics by modifying their behavior in progress toward a defined goal.

Consider the different user personas that have an interest in aspects of the data management process. For example, site-based personnel are interested in outstanding queries raised by monitors as well as the number of incomplete or outstanding eCRFs, which dictate their activities. Although monitors will have a vested interest in incomplete forms and unanswered queries for their sites, they are also interested in the trend of queries being raised, the number of answered queries, and the number of completed forms, as these drive the amount of activity they have to perform (Figure 1).

Similarly, data managers focus on the answered queries and completed forms, as this impacts their workload. Study managers are interested in the query trends with upward and/or downward patterns that influence where they may place additional resources. Such trending information may also provide a visible outcome measure of actions already taken such as training interventions (Figure 1).

The above examples highlight the real value of adding contextual trend information to the metric to help users rapidly identify whether or not performance is improving. Therefore, the ongoing monitoring of the metric is as important as the metric itself.

When considering "what is a good metric?" it is important to note that there are often a number of alternative ways of displaying the same data, each illuminating a different aspect of the story (Table 2).

A further example regarding EDC queries is detailed below. This also allows us to explore the concepts of effective and ineffective work.1 Effective work materially contributes to the outcome, whereas ineffective work does not (e.g., waiting for something to happen).

Displaying the number of queries at a site indicates how much work needs to be done at that site, helping determine visit frequency or resources required. Displaying the number of queries at a site as a proportion of the number of subjects at that site provides an indicator of how "dirty" the site is. A high number of queries with a high number of patients is perhaps more acceptable than a high number of queries with a low number of patients. This value may indicate a need for further investigation at the site as to why this is the case. For example, does the site require additional training?

In the above scenario, the number of queries is an example of ineffective work. As it is a measure of rework, it would be better never to have the queries in the first place. Much attention is given to how long it takes to resolve queries. Instead, if that effort were focused on ensuring queries did not happen in the first place (i.e., efforts to ensure a downward trend in the number of queries raised over time, as a proportion of the number of subjects), the rework would not be needed. In this case, a quality metric rather than a cycle time metric would focus efforts on effecting a change in effective work, resulting in a greater benefit for the organization. The total amount of time spent resolving queries, or doing ineffective work, would also decrease because fewer queries would be raised in the first place.
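The query examples above can be sketched in Python as a ratio plus a simple trend check. The function names, the half-versus-half trend comparison, and the 5% tolerance are assumptions chosen for brevity, not a prescribed method; the figures are invented for illustration.

```python
def queries_per_subject(query_count, subject_count):
    """Queries normalized by subjects; None when the site has no subjects."""
    if subject_count <= 0:
        return None
    return query_count / subject_count


def trend_direction(values, tolerance=0.05):
    """Classify a series by comparing the mean of its later half with its earlier half.

    This is a deliberately crude trend test used only for illustration.
    """
    if len(values) < 2:
        return "insufficient data"
    mid = len(values) // 2
    earlier = sum(values[:mid]) / mid
    later = sum(values[mid:]) / (len(values) - mid)
    if later > earlier * (1 + tolerance):
        return "rising"
    if later < earlier * (1 - tolerance):
        return "falling"
    return "flat"


# Raw counts alone suggest the large site is the problem; the ratio says otherwise.
large_site = queries_per_subject(120, 60)   # many queries, many subjects
small_site = queries_per_subject(90, 10)    # fewer queries, far fewer subjects

# Hypothetical monthly (queries raised, subjects enrolled) for one site.
monthly = [(40, 10), (36, 12), (30, 15), (24, 16)]
ratios = [queries_per_subject(q, s) for q, s in monthly]
direction = trend_direction(ratios)
print(large_site, small_site, ratios, direction)
```

A falling queries-per-subject ratio is the quality signal described above: it reflects less rework being generated, rather than rework being cleared more quickly.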

Value of consolidated metrics

By consolidating data from multiple sources and presenting it in a single location, new valuable metrics can surface. For example, using real-time information from a randomization and trial supply management system about when subject visits have occurred makes it possible to compare the number of expected eCRFs against the number of completed or partial eCRFs (Figure 2). This offers an additional metric to facilitate user workflow while pointing to the data visibility gap, namely the lag time between actual visits and entering visit data into the eCRFs. Providing this information to monitors gives additional insights to better manage the performance of the site prior to monitoring visits.
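A minimal Python sketch of this consolidated metric follows. It assumes two hypothetical extracts keyed by subject and visit: visit dates from the randomization and trial supply management system, and eCRF entry dates from the EDC system. The data shapes and names are illustrative and do not describe any specific product.

```python
from datetime import date


def ecrf_visibility_gap(visits, ecrf_entries, as_of):
    """Compare expected eCRFs (one per recorded visit) with those entered.

    visits: {(subject_id, visit_name): visit_date} from the randomization system.
    ecrf_entries: {(subject_id, visit_name): entry_date} from the EDC system.
    Both shapes are assumptions made for this example.
    """
    lags = []          # days between visit and data entry, for entered forms
    outstanding = []   # (key, days waiting) for visits with no eCRF yet
    for key, visit_date in visits.items():
        entry_date = ecrf_entries.get(key)
        if entry_date is None:
            outstanding.append((key, (as_of - visit_date).days))
        else:
            lags.append((entry_date - visit_date).days)
    outstanding.sort(key=lambda item: item[1], reverse=True)  # oldest first
    return {
        "expected": len(visits),
        "entered": len(lags),
        "mean_entry_lag_days": sum(lags) / len(lags) if lags else None,
        "outstanding": outstanding,
    }


visits = {
    ("S001", "V1"): date(2011, 3, 1),
    ("S001", "V2"): date(2011, 3, 15),
    ("S002", "V1"): date(2011, 3, 3),
    ("S003", "V1"): date(2011, 3, 10),
}
ecrf_entries = {
    ("S001", "V1"): date(2011, 3, 4),
    ("S002", "V1"): date(2011, 3, 10),
}
gap = ecrf_visibility_gap(visits, ecrf_entries, as_of=date(2011, 3, 20))
print(gap)
```

Neither source could produce this view alone: the visit dates define what is expected, and the EDC data shows what has actually arrived.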

Role of technology

Having defined the content of your metrics and data reports, it is important to appropriately display the information. There are many best-practice methodologies that can be employed to ensure intuitive and informative display of data. The Gestalt Principles of visual perception as well as the importance of user interface design for data display are instrumental in ensuring a user can absorb as much information as possible as quickly as possible.

Seminal papers published as far back as 1956 illustrate that we tend to remember seven (plus or minus two) pieces of information at any one time.4 This remains one of the key data presentation principles for delivering meaningful data to users (although more recent work suggests this number may be as low as three pieces of information). Pre-attentive processing adds value to the data display. The use of color, grounding, closure, size, and contrast can aid the interpretation of data. These factors serve important substantive purposes beyond simply creating a visually attractive appearance. Edward Tufte, who began his work in the 1970s, is considered one of the fathers of information design. However, the discipline has provenance back to the 1800s, with John Snow's spot maps of a cholera outbreak in London in the 1850s,5 which led to the discovery that the disease is spread by contaminated water, and Charles Joseph Minard's 1869 flow chart on Napoleon's Russian campaign of 1812.6 The concept of metrics has therefore been around for a long time, but it has not been applied to the clinical trial industry with significant rigor. In fact, in many ways the pharma industry is behind other areas, as there are many examples of the use of metrics in other industries (e.g., the military,7 supply chain,8 and policing9).

It is important to select the correct visual display mechanism in order to best display the information. There are rules to determine whether a chart or table is the most appropriate way to display a particular piece of information.10

The value of tables should not be overlooked when identifying which manner should be used to display data. Not everything has to be displayed in a chart, despite the current trends toward overly graphical dashboard displays. Tables should be used when the user persona is interested in "looking up" individual values or comparing pairs of related values. Examples include numbers of subjects in treatment, number of sites activated, and the number of CRFs incomplete. Tables should also be used when the user requires a level of precision that a graph cannot easily provide (e.g., data to two-decimal places).

Charts convey a significant amount of data within a small space.11 They are not good for reporting absolute values; charts reveal the shape of the data allowing identification of trends, comparisons, exceptions, similarities, and differences between data. If the user persona uses these key words, a chart should be considered for displaying data. It is also useful to note that charts and tables can be used together to complete a picture. For instance, tables showing current status of variables and charts showing more about the relationships within the data can help provide a holistic snapshot.
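The table-versus-chart guidance above can be reduced to a small keyword heuristic, sketched here in Python. The cue lists are a simplification of the wording in this article (looking up values and precision for tables; trends, comparisons, exceptions, similarities, and differences for charts), and the function is a hypothetical aid to discussion rather than a substitute for design judgment.

```python
# Word stems taken loosely from the guidance in the text; purely illustrative.
TABLE_CUES = ("look up", "looking up", "individual value", "precis", "exact")
CHART_CUES = ("trend", "compar", "exception", "similar", "differ")


def suggest_display(requirement):
    """Suggest a display type from the words a user persona uses."""
    text = requirement.lower()
    wants_table = any(cue in text for cue in TABLE_CUES)
    wants_chart = any(cue in text for cue in CHART_CUES)
    if wants_table and wants_chart:
        return "table and chart"
    if wants_table:
        return "table"
    if wants_chart:
        return "chart"
    return "undecided"


print(suggest_display("I need to see the trend in queries raised"))
print(suggest_display("I want to look up the number of sites activated"))
print(suggest_display("Look up current status and compare trends across sites"))
```

The combined result reflects the point that charts and tables can be used together to complete a picture.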

If information is displayed using a chart, the user should be provided with a mechanism by which they can access the underlying data. Such mechanisms include downloading Excel sheets with tabular display of charted data, linking to a data report, or toggling between chart and table in cases where both are not displayed on the same screen.

It is easy to be overwhelmed by the functionality available within many of the business intelligence tools in the marketplace (e.g., animation and the large number of colors and chart possibilities). As a result, it is easy to fall into a number of traps of poor chart design, such as the use of 3D, sheen, the wrong chart type, or simply being inconsistent with charting implementations12 (Figure 3).

Understanding the various chart types and the relationships each is best suited to display is critical to successful design. The trick is to be selective about what functionality to choose and not let the technology overwhelm the message. Creating design standards to enhance consistency and guide design choices can help control the technology rather than allowing it to run amok.


  • To derive salient measures that truly inform decision making, it is essential to reevaluate and challenge common industry metrics.

  • Providing information about trends in addition to the current position is powerful in decision making; it not only shows how you got here but also where you may end up.

  • One set of metrics does not fit all since different user personas have different needs. Although core similarities exist, understanding the differences between individual personas is even more critical.

  • Consolidating data from multiple technologies provides value that cannot be obtained from individual databases.

  • Information design is key to ensure users use information efficiently and effectively.

The consolidation of data and creation of new and powerful metrics for a user outlined in this article are not possible without the underlying technology infrastructure. Well-defined data architecture strategies coupled with enabling integration/convergent technologies are essential to ensure ease of access to data as well as a robust enterprise reporting tool to effectively display the information. Blending cutting-edge technologies while helping users achieve their business objectives through intelligent use of data and metrics is the key to a successful reporting strategy.

Nikki Dowlman, PhD, is Director of Product Management at Perceptive Informatics, Nottingham, UK.


1. Dave Zuckerman, Pharmaceutical Metrics: Measuring and Improving R&D Performance, (Gower Press, 1996).

2. Metrics Champion Consortium (MCC).

3. CMR International, CMR International 2010 Pharmaceutical R&D Factbook, (Thomson Reuters, 2010).

4. G. A. Miller, "The Magical Number Seven, Plus or Minus Two," Psychological Review, 63, 81-97 (1956).

5. Broad Street Cholera Outbreak, 1855 Study by John Snow.

6. Charles Joseph Minard's 1869 flow chart on Napoleon's Russian campaign of 1812.

7. Molly Merrill, "Decision Support System Aids Military During Patient Transfers," Healthcare IT News, April 15, 2009.

8. Powersim Solutions, "Case Study: Nestle."

9. "Viewpoint: Intelligent Policing," Jane's Police Review, August 6, 2010, p. 42.

10. Stephen Few, Information Dashboard Design: The Effective Visual Communication of Data, (O'Reilly, 2006).

11. Stephen Few, "Data Visualization for Human Perception" (2010).

12. Perceptual Edge.

© 2024 MJH Life Sciences

All rights reserved.