How to Manage Machine Learning Algorithms in Clinical Trials

Aug 08, 2018
As today’s clinical trials grow increasingly complex, risk-based monitoring has become the watchword of the day. It is integral to any quality management program; it is mandated by regulatory agencies; and, critically, it is a potent tool for forestalling costly failures. Yet the slow, methodical processes it implies are at odds with another powerful pressure: the impetus for speed in a world in which even the slightest delay is punishingly expensive.
Fortunately, advanced technologies are now able to facilitate many aspects of risk-based monitoring, handling data from every part of a trial, organizing it, analyzing it—and instantly pinpointing anomalies and potential points of failure. While decidedly methodical, such processes are the opposite of slow; they speed actionable information to researchers, enabling informed decision-making in real time.
Of course, such technologies neither operate in a vacuum nor apply the same analysis to every project. Risk-based monitoring (RBM) systems have therefore begun using machine-learning algorithms to analyze data in real time and provide insights for decision makers. However, these algorithms need careful attention from clinical trial researchers, who must tailor them to the needs of their specific trial, then interpret all results in light of that trial’s detailed design and research objectives.
Easily said, but what, exactly, does this mean?
Machine learning algorithms are designed to translate raw data into signals based on whatever parameters they have been programmed to follow. Those signals are presented in a human-readable form such as visualizations and actionable workflows. Depending on the available data and the algorithm’s configuration, the results are often imperfect; consider the laughable products many shopping sites believe might interest you. In many situations these “false positives” make no difference, but in a clinical trial setting they can mean the difference between life and death.
It is the researcher who ensures success.
Before implementing a risk-based monitoring system, clinical trial researchers must carefully consider their trial. What, exactly, is it measuring? What is it not measuring; what can safely be ignored? Which potential patient results are to be expected, which make no difference, and which would clearly be a red flag?
Based on these insights, they can then design the algorithms to capture the most relevant information.
Five Considerations in Managing Outputs
No matter how advanced, machines still lack personal judgement, and researchers must remain actively involved throughout the process. The algorithm’s output is the starting point for decision-making, not the final deciding factor. To most effectively—and safely—harness machine learning, researchers should take a five-pronged approach to managing and vetting the outputs:
1. Examine the results carefully
Researchers must never take a result at face value. Again, the algorithm exists to help inform the researcher’s decision process, not to determine an automatic course of action.
Researchers might begin by asking whether the result even makes sense. Could it be classified as a false positive or negative? Why? Assuming that the result makes sense, is it pertinent to the trial? For instance, a patient’s elevated blood pressure may be directly material to a cardiac trial but may prove immaterial to an oncology study. 
Any given set of data can generate a wide range of insights, especially in a trial with multiple sets of data. Yet, not all these insights are valuable; only by considering results in light of their own specific objectives can researchers gauge whether or not any given data warrants taking action.
2. Follow the audit trail
Federal regulations demand that all actions related to data be monitored and auditable. Fortunately, the trace logs associated with many machine-learning algorithms make such an audit trail simple to maintain, as they automatically record each instance when an algorithm is created, has its configuration changed, or is executed. Scheduled notifications that alert the administrator when the algorithm begins, when it finishes, and whether any problems arise during the run can augment the record.
While both tactics help cement a solid audit trail for FDA review, they also help researchers build their own trust in the algorithm’s outputs, ensuring that, when an algorithm uncovers an interesting patient result, the researcher can easily (and accurately) decide whether to examine the causes more closely. Such quality management therefore ultimately leads to better-informed decision making.
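To make the idea of a trace log concrete, here is a minimal sketch in Python. It assumes an in-memory, append-only log; the class name `AuditTrail`, the event names, and the algorithm identifier `bp_outlier` are all hypothetical and do not describe any particular RBM product. A production system would also need durable, tamper-evident storage.

```python
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only log of algorithm lifecycle events (illustrative only)."""

    # The three event types named in the text; the labels are assumptions.
    EVENTS = {"created", "config_changed", "executed"}

    def __init__(self):
        self._entries = []

    def record(self, algorithm_id, event, user, details=None):
        """Add one timestamped entry; reject event types we do not track."""
        if event not in self.EVENTS:
            raise ValueError(f"unknown event: {event}")
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "algorithm_id": algorithm_id,
            "event": event,
            "user": user,
            "details": details or {},
        }
        self._entries.append(entry)
        return entry

    def history(self, algorithm_id):
        """Return every entry for one algorithm, in the order recorded."""
        return [e for e in self._entries if e["algorithm_id"] == algorithm_id]

    def export(self):
        """Serialize the full trail as JSON for review."""
        return json.dumps(self._entries, indent=2)


trail = AuditTrail()
trail.record("bp_outlier", "created", "analyst_a")
trail.record("bp_outlier", "config_changed", "analyst_a",
             {"threshold": {"old": 3.0, "new": 2.5}})
trail.record("bp_outlier", "executed", "scheduler",
             {"status": "ok", "signals": 4})
```

Calling `trail.history("bp_outlier")` then returns the three entries in order, which is the kind of record a reviewer could use to see who changed a threshold and when.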
3. Continually monitor the algorithms’ output
Throughout the course of a trial, things change. Ongoing user interaction and feedback, paired with evolving thinking based on findings to date, may spur new questions, new theories—and a rationale for a new algorithm.
Still, not all change is intentional. Ongoing comparisons to known benchmarks can ensure the continued validity of the underlying algorithms. For instance, snapshots of the raw data at different time periods can be used to compare recent findings with the data that were used to train the algorithms, thus tracking the accuracy of well-documented signals and providing assurance that no underlying issues have developed.
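One simple way to compare a recent snapshot with the training data is sketched below in Python. It measures how far the recent mean has moved from the training mean, in units of the training data's standard deviation. The function names, the limit of 2.0, and the blood pressure values are illustrative assumptions, not figures from any trial, and this check is a rough screen rather than a formal statistical test.

```python
from statistics import mean, stdev


def mean_shift(baseline, recent):
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation to compare against")
    return (mean(recent) - mean(baseline)) / spread


def has_drifted(baseline, recent, limit=2.0):
    """Flag a snapshot whose mean moved more than `limit` baseline deviations (assumed cutoff)."""
    return abs(mean_shift(baseline, recent)) > limit


# Made-up systolic readings: the snapshot the algorithm was trained on,
# one later snapshot that resembles it, and one that has shifted upward.
training_snapshot = [118, 122, 120, 125, 119, 121, 123, 117]
stable_snapshot = [120, 119, 124, 121, 118, 122]
shifted_snapshot = [135, 138, 132, 140, 136, 137]

stable_flag = has_drifted(training_snapshot, stable_snapshot)
shifted_flag = has_drifted(training_snapshot, shifted_snapshot)
```

Here `stable_flag` is `False` and `shifted_flag` is `True`. A flag of this kind does not say why the data moved; it only tells the researcher that the algorithm's assumptions deserve a second look.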
All this is to say, researchers cannot set up monitoring for a trial and then leave it to run; the algorithms require ongoing oversight and adjustment.
4. Leverage content-specific data libraries
No clinical trial exists in a vacuum, and research benefits from the ability to compare trial-generated data with a broader universe of data organized by drug class, therapeutic area, disease state, or even chemical structure. Machine learning-based data libraries help researchers in three ways: viewing data, and verifying its accuracy, through the lens of RBM; focusing specifically on the signals generated by their algorithm; and using similar patients from similar sites as a baseline to quickly identify outliers that warrant closer investigation for potential safety or data accuracy issues. The result is smarter, faster, and safer decisions.
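Using similar sites as a baseline can be sketched as a robust outlier check. The Python example below scores each site against the median of its peers using the median absolute deviation, a common convention often called the modified z-score. The site names and adverse event rates are invented for illustration, and the 0.6745 scaling constant and 3.5 cutoff are conventional defaults that a real trial would need to justify for its own data.

```python
from statistics import median


def robust_scores(values_by_site):
    """Score each site against its peers using the median and median absolute deviation."""
    values = list(values_by_site.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        raise ValueError("peer values are too similar to score")
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores.
    return {site: 0.6745 * (v - center) / mad
            for site, v in values_by_site.items()}


def outlier_sites(values_by_site, cutoff=3.5):
    """Return the sites whose score magnitude exceeds the assumed cutoff, sorted by name."""
    scores = robust_scores(values_by_site)
    return sorted(site for site, score in scores.items() if abs(score) > cutoff)


# Invented adverse event rates per enrolled patient at comparable sites.
ae_rates = {
    "site_01": 0.11,
    "site_02": 0.09,
    "site_03": 0.12,
    "site_04": 0.10,
    "site_05": 0.01,
    "site_06": 0.13,
}

flagged = outlier_sites(ae_rates)
```

In this made-up data only `site_05` is flagged, and it is flagged for an unusually low rate. That is a reminder that an outlier may point to a data accuracy issue, such as under-reporting, as readily as to a safety issue.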
5. Ensure that the output is actionable, timely, and effective
Machine-learning algorithms are designed to enhance accuracy and speed decision-making by generating signals that help researchers track and monitor risks without needing to examine the data manually, point by point. If researchers are instead slowed by managing the system, they may still gain greater insights but miss the promised time savings. Ideally, an RBM system will actually simplify data management for the researcher through customized options that enable them to both clearly define and clearly understand the algorithm results.
First, the system should enable researchers to configure both the algorithms themselves and the schedule on which they will run, so that users are not flooded with signals they can’t act upon—especially in the early stages of a trial. This includes using thresholds based on the quality of the signals as well as controlling the frequency of signal generation through scheduling.
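The two controls described above, a quality threshold and a run schedule, might look like the following Python sketch. The `Signal` and `AlgorithmConfig` types, the 0-to-1 quality score, and the weekly interval are hypothetical choices made for illustration; an actual RBM system would define its own configuration model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Signal:
    name: str
    quality: float  # assumed score between 0 and 1; higher means more reliable


@dataclass
class AlgorithmConfig:
    min_quality: float   # signals below this score are withheld from users
    run_every: timedelta  # minimum time between scheduled runs


def is_due(config, last_run, now):
    """Return True if the algorithm has never run or its interval has elapsed."""
    return last_run is None or now - last_run >= config.run_every


def actionable(signals, config):
    """Keep only the signals that meet the configured quality threshold."""
    return [s for s in signals if s.quality >= config.min_quality]


config = AlgorithmConfig(min_quality=0.8, run_every=timedelta(days=7))

signals = [
    Signal("missing_visit_data", 0.92),
    Signal("lab_value_drift", 0.55),
    Signal("enrollment_spike", 0.81),
]
kept = actionable(signals, config)

last_run = datetime(2018, 8, 1, 9, 0)
due_after_three_days = is_due(config, last_run, datetime(2018, 8, 4, 9, 0))
due_after_a_week = is_due(config, last_run, datetime(2018, 8, 8, 9, 0))
```

With these settings the low-quality `lab_value_drift` signal is withheld, and the algorithm is not due again until a full week has passed. Early in a trial, when data are sparse, a researcher might raise `min_quality` or lengthen `run_every` to avoid a flood of signals that cannot yet be acted upon.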
The system should then present the information visually. The algorithms themselves can be complicated, and to interpret them correctly, researchers need to understand what they are measuring and in what framework. The system should contextualize data in a manner that is easy to interpret and to act on, generating signals through a combination of drill-down capabilities, statistical models, and intuitive data visualizations with built-in workflows.
Clinically Relevant Information in Real Time
Machine learning provides a crucial opportunity to shape and direct clinical trial research, helping to harmonize and analyze millions of data points in minutes, eliminate time-consuming point-by-point data review, flag anomalies, focus inquiries, inform decisions and, critically, minimize risks. Machine learning even accelerates the reporting of trial results by creating an audit trail in real time, thus eliminating the need to engage in tedious, time-consuming data cleaning at trial’s end.
The result: a safer clinical environment for patients, and a more effective, efficient clinical trial approach for sponsors.
Yet, the machines are only as effective as their algorithms and their interaction with human researchers. Researchers must put careful thought into the features of the system they select, then program it attentively and manage it actively throughout the course of the trial. By paying attention to the five-pronged approach outlined above, researchers can be certain that their risk-based monitoring system delivers on its promises. They can effectively employ machine learning algorithms to both manage the increasingly complex outputs of their clinical trials and make those trials move faster with less risk, reducing timelines and minimizing expenses. Those are outputs every sponsor can gladly embrace.
Dr. Basheer Hawwash, Principal Data Scientist, Remarque Systems, Inc.