Case Study Examines Two RBM Frameworks

This case study examines using centralized manual data review with statistical approaches to compare value and fit.

This case study examines a practical approach using an RBM framework to develop site risk profiles

Risk-based monitoring (RBM) has been widely discussed in clinical research in recent years, but there is currently little literature offering practical examples of its techniques, particularly for organizations with limited resources to implement new approaches to trial oversight. The biopharma industry currently relies on extensive feet-on-the-ground monitoring to assess site risk. Many organizations supplement this with centralized manual (by eye) data reviews to identify patterns that may indicate where additional resources and effort are best placed. However, this involves sourcing and combining multiple data sources, often held in hard copy or in traditional spreadsheets. This ad hoc, labor-intensive process is both error prone and time consuming.

This exploratory case study illustrates how RBM can be applied using a specific technology solution from Algorics, called Acuity, and examines both data-driven models and centralized manual data review. The case study was performed in collaboration with Neuroscience Trials, a not-for-profit clinical research organization located in Melbourne, Australia, that specializes in neuroscience clinical research. Since this was an exploratory exercise to compare the value and fit of different RBM approaches, data from a completed study in stroke recovery were used. The key objectives of this study were:

  • To confirm whether monitoring data with risk-based approaches can effectively characterize sites by risk.

  • To determine whether simple statistical models help detect risk at sites.

  • To perform data review with advanced software tools.



The clinical trial selected had the following profile:

  • Multicenter, Prospective, Randomized, Open label, Blinded Endpoint (PROBE), Phase II trial

  • Acute ischemic stroke patients presenting within 4.5 hours of symptom onset

  • All subjects received the standard care clot busting drug

  • Participants were then randomized (1:1) to receive either an angiogram / intervention to remove the clot or a clot busting drug

  • Primary outcomes were reperfusion and neurological recovery (reduction in NIHSS score) at 24 hours

  • Secondary outcomes included level of disability at three months

  • 14 sites across Australia/New Zealand

  • Adaptive sample size of ~ 100 subjects

  • Subjects participated for three months duration

  • Total study duration of three years


The following steps were carried out once the study was selected:

  • The clinical trial protocol was provided to the Algorics team for review.

  • The study data for subjects at different sites were provided. The data were anonymized to ensure confidentiality and also to avoid bias during interactions between Algorics and Neuroscience Trials team members.

  • The protocol was reviewed by the Algorics team and inputs pertaining to the protocol were received from the Neuroscience Trials team. Factors that could be a potential risk to conduct of the study were identified. Additional key parameters related to what is reviewed from site monitoring were also factored in and finalized. Five important data points were considered and termed as risk parameters (in practice more may be needed but this was considered sufficient for the exercise).

  • Risk scoring process: Each risk parameter was given a score based on the degree of impact it could have on the outcome of study conduct. Details of the risk scoring process are as follows:

Risk parameter                                          Observation score

Deviation - Stroke onset to clot buster                 Less than 4.5 hrs
                                                        More than 4.5 hrs

Deviation - Stroke onset to groin puncture              Less than 6 hrs
                                                        More than 6 hrs

Unreported AEs based on 90 day conmed                   No unreported issues
                                                        1 AE unreported
                                                        > 1 AE unreported

Non-congruence between NIHSS, mRS, Home time & AEs      No issues
                                                        Non-congruence issues

  • Creating data visualizations: Conventionally, Excel data reports are used for data review, which is both cumbersome and time consuming for the end reviewer. To simplify the process, the Algorics team used Acuity, the company’s clinical analytics and integrated risk-based monitoring technology solution, and mapped the risk parameters to data visualizations and thresholds for data review. Figure 1 (at the end of the article) illustrates the visualizations created for this case study.

Two approaches were used to review and investigate the data analysis:

  • Centralized manual review of data and derivation of a site risk score.

  • Data-driven modelling of specific parameters to provide insight into site performance and develop a risk profile.



Centralized manual data review

  • Data review: Data visualizations of the risk parameters for each subject were reviewed in Acuity, and the output of the review was recorded in a standardized data review sheet using the risk scoring guide. The risk score for each subject was obtained by adding up the observation scores for each risk parameter. This process ensured that observations were captured objectively and quantitatively at the subject level.

  • Deriving site risk score: The risk scores of all subjects at a site were then totaled to obtain a total risk score for the site. It has generally been observed that a higher number of enrolled subjects can affect quality outcomes at a site, broadly because sites become stretched as numbers increase. To accommodate this in the risk assessment, a subject load factor of 0.1 for each subject enrolled at the site was assigned to yield a “subject load factored site risk score”. This score was then divided by the number of subjects to yield the “average risk score per subject” for that site (as depicted in Figure 2).

  • Classifying site risks: The perceived risk was obtained by applying the following thresholds to the average risk score per subject:
  • Below the 50th percentile – Low risk

  • 51st to 89th percentile – Medium risk

  • Above the 90th percentile – High risk

  • Site risk output based on manual review: The output derived from the classification provided a view of potential risk of sites (Table 1).
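
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure implemented in Acuity. The article does not say how the 0.1 subject load factor is combined with the site total, so the sketch assumes the total is multiplied by 0.1 times the number of enrolled subjects. The percentile cut-offs are computed across sites with linear interpolation, which is also an assumption, and the site names and subject scores are hypothetical.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile of an ascending list (p in 0..100)."""
    pos = (len(sorted_values) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def site_average_scores(subject_scores, load_per_subject=0.1):
    """Average risk score per subject for each site (every site needs >= 1 subject)."""
    averages = {}
    for site, scores in subject_scores.items():
        n = len(scores)
        total = sum(scores)  # total risk score of the site
        # ASSUMPTION: the load factor (0.1 per enrolled subject) multiplies the total.
        load_factored = total * (load_per_subject * n)
        averages[site] = load_factored / n
    return averages


def classify_sites(averages):
    """Label each site Low / Medium / High from the 50th and 90th percentiles."""
    ordered = sorted(averages.values())
    p50, p90 = percentile(ordered, 50), percentile(ordered, 90)
    risk = {}
    for site, avg in averages.items():
        if avg > p90:
            risk[site] = "High"
        elif avg > p50:
            risk[site] = "Medium"
        else:
            risk[site] = "Low"  # a value exactly at the 50th percentile counts as Low
    return risk


# Hypothetical subject-level risk scores, keyed by site.
example_scores = {
    "S1": [0, 1, 0, 0],
    "S2": [2, 1],
    "S3": [1, 1, 2],
    "S4": [3, 2, 2, 1, 2],
    "S5": [0, 0],
}

if __name__ == "__main__":
    averages = site_average_scores(example_scores)
    for site, label in sorted(classify_sites(averages).items()):
        print(site, round(averages[site], 2), label)
```

Keeping the load adjustment in a single line makes it easy to substitute a different rule if the multiplication assumed here does not match the intended calculation.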


Data driven analysis

Need for a data-driven exploratory study: The authors further explored whether a data-driven model, combined with the site risk classification from manual review, could improve the risk characterization of a site. To do so, key study data points were chosen that could provide insight into the functioning of the site. In this clinical trial, the enrolled subjects had suffered ischemic stroke, and from a treatment management perspective it was important that they be treated as early as possible after stroke onset. Given the various factors that can affect the time from stroke onset to hospital admission, the clinical management team advised that the delay from admission to the Emergency Department (ED) to intervention (clot-busting agent, or groin puncture for clot retrieval) would be a good surrogate indicator of a site’s propensity to treat its patients within optimal timeframes. It was also recognized that various factors could be at play, including hospital practices, study team involvement, investigator availability and patient illness-specific situations, that can determine the time to treatment and/or intervention. Nevertheless, the overall purpose was to determine whether there were consistent delays in treating subjects at a site.


Data-driven model:

Two types of data-driven models were used to estimate subject management risk at a site.

  • Percentile model:

The Acuity Decision Factory module, which allows data-driven models to be applied to study data, was used. The times from ED admission to administration of the clot buster and from ED admission to groin puncture (if randomized) were taken for all subjects enrolled at a site and classified into three grades of risk, using the 50th and 90th percentiles across subjects at all sites as thresholds: 1 point for below the 50th percentile (ED to clot buster time of up to 0.7 hrs; ED to groin puncture time of up to 1.8 hrs), 2 points for the 51st to 89th percentile (ED to clot buster time of 0.71 to 1.39 hrs; ED to groin puncture time of 1.81 to 2.89 hrs) and 3 points for above the 90th percentile (ED to clot buster time of > 1.4 hrs; ED to groin puncture time of > 2.9 hrs).
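
The point assignment described above can be illustrated as follows. This is a hedged sketch, not the Acuity Decision Factory logic. It uses the cut-offs quoted in the text and reads the low-risk groin puncture cut-off as 1.8 hours, as implied by the medium band starting at 1.81 hours. The record layout, field names and sample values are hypothetical.

```python
from collections import Counter, defaultdict

# Cut-offs in hours as (50th percentile, 90th percentile), taken from the text.
# ASSUMPTION: 1.8 hrs is the 50th percentile for ED to groin puncture.
CUTOFFS_HRS = {
    "ed_to_clot_buster": (0.7, 1.4),
    "ed_to_groin_puncture": (1.8, 2.9),
}


def risk_points(interval, hours):
    """Return 1, 2 or 3 points for one subject, or None if the time is missing."""
    if hours is None:  # e.g. a subject not randomized to clot retrieval
        return None
    p50, p90 = CUTOFFS_HRS[interval]
    if hours <= p50:
        return 1
    if hours > p90:
        return 3
    return 2  # values between the two cut-offs, including exactly p90


def grade_counts_by_site(records, interval):
    """Count how many subjects at each site fall into each point grade."""
    counts = defaultdict(Counter)
    for rec in records:
        points = risk_points(interval, rec.get(interval))
        if points is not None:
            counts[rec["site"]][points] += 1
    return counts


# Hypothetical subject records (times in hours).
example_records = [
    {"site": "S1", "ed_to_clot_buster": 0.5, "ed_to_groin_puncture": 1.6},
    {"site": "S1", "ed_to_clot_buster": 0.9, "ed_to_groin_puncture": None},
    {"site": "S2", "ed_to_clot_buster": 1.6, "ed_to_groin_puncture": 3.2},
]

if __name__ == "__main__":
    for interval in CUTOFFS_HRS:
        print(interval, dict(grade_counts_by_site(example_records, interval)))
```

The per-site counts can then be turned into whatever site-level summary is required, for example the share of subjects scoring 3 points.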


  • k-means cluster model:

k-means clustering is a method to partition observations into clusters in which each observation belongs to the cluster with the nearest mean. The data pertinent to the aforementioned surrogate indicators was scaled in the statistical software R (R is free software designed for statistical computing) for comparison.

The model broadly categorized subjects into 2 types:

  • Subjects where treatment and/or intervention was performed in less time (average ED to clot-busting agent: 0.4 hrs; average ED to groin puncture: 1.47 hrs)

  • Subjects where treatment and/or intervention was performed in more time (average ED to clot-busting agent: 0.98 hrs; average ED to groin puncture: 2.85 hrs)

The data for each category was totaled for all subjects at a site level to derive the incidence of subjects where the procedure was delayed. The incidence of delayed treatment and/or intervention across subjects provided the risk profile of the sites.
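
The clustering step can be sketched as below. The analysis in this case study was run in R, so this pure-Python version is only a stand-in to show the mechanics, under several assumptions: two clusters, both time intervals standardized before clustering, only subjects with both times recorded, a deterministic choice of starting centroids, and the cluster whose centroid has the larger combined time treated as the delayed one. The subject records are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def standardize(rows):
    """Center each column on its mean and divide by its sample standard deviation."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) or 1.0 for col in columns]
    return [tuple((x - c) / s for x, c, s in zip(row, centers, spreads)) for row in rows]


def kmeans(points, k=2, max_iter=100):
    """Plain k-means (k >= 2); returns a cluster label per point and the centroids."""
    ordered = sorted(points)
    # ASSUMPTION: start from points spread evenly through the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(mean(dim) for dim in zip(*members)) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def delayed_incidence(records):
    """Share of subjects per site that fall in the slower of two clusters."""
    times = [(clot, groin) for _, clot, groin in records]
    labels, centroids = kmeans(standardize(times), k=2)
    delayed = max(range(2), key=lambda c: sum(centroids[c]))
    flags = defaultdict(list)
    for (site, _, _), lab in zip(records, labels):
        flags[site].append(lab == delayed)
    return {site: sum(f) / len(f) for site, f in flags.items()}


# Hypothetical records: (site, ED to clot buster in hrs, ED to groin puncture in hrs).
example_records = [
    ("S1", 0.3, 1.2), ("S1", 0.4, 1.5), ("S1", 0.5, 1.6),
    ("S2", 0.9, 2.8), ("S2", 1.1, 3.0), ("S2", 0.4, 1.4),
    ("S3", 1.0, 2.7), ("S3", 0.95, 2.9),
]

if __name__ == "__main__":
    for site, share in sorted(delayed_incidence(example_records).items()):
        print(site, round(share, 2))
```

A site-level risk label would then follow from thresholds on this incidence. The sketch does not supply such thresholds.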



  • Centralized manual data review: Results are described in Table 1.


Table 1

Overall risk

  • Data driven analysis:


  • Percentile method: Results are described in Table 2 and 3.


Table 2

Site    % Risk    Risk classification

Table 3

Site    % Risk    Risk classification



  • k-means cluster method: Results are described in Table 4.

Table 4

Sites    % High risk    Risk classification




  • The following are the key observations upon analysis of data through both approaches:
  • Consistency of risk classification (75% incidence and above) of sites across centralized manual data review and the two data-driven models: 70%

  • Consistency of risk classification of sites across centralized manual data review and at least one data-driven model was 70% (Sites C, E & J).

  • Consistency of risk classification across both data-driven models (k-means cluster and percentile method on time to groin puncture delay) was 80% (Sites A, B, C, D, E, G, H, I)

  • High risk and medium risk sites in centralized manual data review that coincided with one data-driven model (percentile method on groin puncture delay): 80% (Sites C, E, F, H & J)

  • Though all of the high risk sites identified through centralized manual data review were also detected by one of the data-driven models (using groin puncture time delay as the basis), the same degree of congruence was not seen for the low risk sites derived from centralized manual review (50% of the sites with low risk in manual review coincided with the risk profiling of the data-driven model, which were in fact deemed to be high risk). When the inverse relationship was explored, 66% of the low risk sites from the data-driven model coincided with centralized manual review.

  • Another interesting observation was that Site A, despite being the highest enrolling site and contributing more than 20% of the subjects to the study, was consistently shown as a low risk site across the models.
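
Consistency figures of the kind listed above come down to counting the sites on which two classifications agree. The helper below is a minimal sketch of that calculation. The function name is an illustrative choice, and the site labels are hypothetical rather than the study's results.

```python
def agreement(classification_a, classification_b):
    """Return the sites on which two classifications agree and the agreement percentage."""
    shared = sorted(set(classification_a) & set(classification_b))
    if not shared:
        raise ValueError("the two classifications have no sites in common")
    matching = [s for s in shared if classification_a[s] == classification_b[s]]
    return matching, 100 * len(matching) / len(shared)


# Hypothetical risk labels from two methods for five sites.
manual_review = {"S1": "Low", "S2": "Medium", "S3": "High", "S4": "Low", "S5": "Medium"}
percentile_model = {"S1": "Low", "S2": "Medium", "S3": "High", "S4": "High", "S5": "Medium"}

if __name__ == "__main__":
    sites, percent = agreement(manual_review, percentile_model)
    print(sites, percent)
```

Restricting the input dictionaries to a subset of sites, such as those rated high or medium risk in manual review, gives the conditional agreement figures in the same way.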



Validation of the exploratory analysis

Since the exploratory analysis was carried out independently of the study team members, it was important to ascertain if the output of the models aligned with assessment of site performance and risks per the site monitoring team. Qualitative inputs from the site monitoring operations team were taken based on their monitoring experiences during the course of the study. There was consensus that the observations from the exploratory analysis agreed with their opinion of the high and medium risk sites. It was also interesting to note that the monitoring team highlighted that the exercise had indeed detected the highest and lowest risk site (Sites J and A respectively).

Further, when the monitoring team was given access to the data visualizations, they reported that a complete centralized manual review of a single subject took about 25 minutes, which fell to about 15 minutes once they were used to the modified review methodology. This supported the ease of use of an RBM-agnostic technology solution in data review. In conventional centralized manual review, which relies on spreadsheets, the same review could have taken at least an hour.


This exploratory exercise helped in drawing the following conclusions:

  • RBM adoption need not be a complex process. It can be made simple and accomplished by a combination of centralized manual data review and specific data-driven models.

  • Centralized manual data review, when performed objectively and with a risk-based approach, was able to detect risks at sites. Applying this approach during source data review (SDR) can significantly help in continually assessing site risk based on monitors’ feedback.

  • Use of technology can significantly speed up centralized manual review.

  • Concerns about using data-driven models to assess risk can be resolved. This case study demonstrates that when data on specific site processes that are critical to study outcomes are run through relevant data-driven models, they provide insight into how sites are functioning. This can be an important input for determining risk at sites proactively.

  • Centralized manual data review when complemented with simple data-driven statistical models could increase the likelihood of characterizing a site risk profile.

  • Risk-based approaches to monitoring can be enabled through the use of a clinical technology solution that helps in risk planning, data visualization and data-driven statistical modeling.


Abby Abraham is a co-founder of Algorics and serves as Vice President, Clinical Solutions. The author wishes to thank Tina Soulis and the team at Neuroscience Trials, Australia for participating and providing scientific inputs that helped in building this case study.