With industry facing technological change, partnering may offer safety solutions.
Cardiac safety is, without a doubt, a very critical issue for the pharmaceutical industry. It is well known that some drugs may increase the risk of arrhythmias or other critical cardiac events. Therefore, industry spends a significant amount of effort in the early stages of drug development to ensure cardiac safety.
However, there are no perfect biomarkers for cardiac safety, and there is a high risk of killing drugs before their benefits have been clearly evaluated.
Some years ago, the U.S. FDA and European Medicines Agency published their recommendations for ECG QT/QTc studies (Thorough QT studies, or TQT) to evaluate the proarrhythmic potential of non-antiarrhythmic drugs.1 The basic assumption behind these recommendations is that the QT-interval is a sufficient biomarker for the arrhythmogenic potential of drugs.
The QT interval represents ventricular depolarization and the subsequent repolarization within the heart cycle. It is commonly measured from the beginning of the QRS complex (Q point) to the end of the T-wave. The threshold level of regulatory concern is small: around five milliseconds, as evidenced by an upper bound of the 95% confidence interval around the mean effect on QTc of 10 milliseconds.1
This requires very accurate acquisition and analysis of the 12-lead surface ECG.
Measuring small effects of drugs on cardiac depolarization and repolarization in the surface ECG with the necessary statistical relevance is not easy. All TQT studies, therefore, have to prove that the applied procedure is capable of detecting the small drug-induced effect on the QT interval.
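To make the size of this threshold concrete, here is a minimal Python sketch, not the procedure of any particular core lab. It assumes the Bazett heart-rate correction (QTcB), treats the confidence limit as a one-sided 95% bound under a normal approximation, and uses hypothetical function names and made-up input values.

```python
import math
from statistics import NormalDist, mean, stdev


def qtc_bazett(qt_ms: float, rr_s: float) -> float:
    """Bazett correction: QTcB = QT / sqrt(RR), with RR in seconds."""
    if rr_s <= 0:
        raise ValueError("RR interval must be positive")
    return qt_ms / math.sqrt(rr_s)


def upper_bound_95(effects_ms: list[float]) -> float:
    """Upper one-sided 95% bound of the mean effect (normal approximation).

    effects_ms: hypothetical placebo-corrected QTc changes per subject, in ms.
    """
    z = NormalDist().inv_cdf(0.95)  # about 1.645
    sem = stdev(effects_ms) / math.sqrt(len(effects_ms))
    return mean(effects_ms) + z * sem


def raises_concern(effects_ms: list[float], limit_ms: float = 10.0) -> bool:
    """True if the upper bound does not stay below the 10 ms limit."""
    return upper_bound_95(effects_ms) >= limit_ms
```

With only a handful of subjects, a t-distribution quantile would be more appropriate than the normal quantile assumed here; the sketch only illustrates why measurement noise of a few milliseconds matters at this scale.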
TQT studies typically run a positive control group and placebo group in combination with the group treated with the drug under evaluation. Currently, each ECG is evaluated manually by experienced staff, which requires a significant budget for QT studies. Although humans are very good at pattern recognition, a manual analysis of ECGs introduces statistical errors due to inter-reader and intra-reader variability.
This problem has been well known since the mid-1980s, when a systematic evaluation of cardiologists in comparison to computer algorithms was done in the CSE study.2 Today, most ECG core labs are automatically tracking the variability introduced by their staff to ensure sufficient quality of the ECG analysis. Together with the positive and placebo control in TQT studies, this works well and with the necessary accuracy.
There have been several attempts to do the ECG analysis automatically, especially the QT measurement.3,4 But so far, no one seems to have demonstrated the accuracy of fully automatic procedures in TQT studies convincingly enough to persuade the pharmaceutical industry to take the risk of submitting a fully automated analysis to the FDA.
The reason for this lack of proof is that before new methods can be applied in real cases, they must be shown on existing data to deliver the same accuracy as the reference, a fully manual approach.
To date, however, these data have not been available for academic or commercial research institutions. Only recently, the THEW initiative,5 a public/private partnership under the leadership of the University of Rochester and FDA, has made an attempt to fill this gap by publishing data and conducting further research on automatic analysis methods.
The THEW project is a very promising start, but to date it is not sufficient to compare fully automatic procedures against a real gold standard. A real gold standard would mean having ECG data sets, free of any commercial conflict of interest, made available to the public, while the reference (the key to the results) is nonpublic and held by an independent organization, such as the FDA.
These data sets should contain at least three TQT studies (with placebo and positive control) analyzed by different core labs and a nonprofit organization to ensure high data quality. Public and private organizations could then run the data sets with their automatic measurement methods and send their analysis to the organization holding the key (i.e., the expected results). The methods would then be proven independently and free of commercial interest.
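The scoring step performed by the key holder could look roughly like the following Python sketch. It is an illustration under stated assumptions: the function name, the dictionary layout (ECG identifier mapped to a QT value in milliseconds), and the choice of mean difference and standard deviation of differences as summary statistics are all hypothetical, not part of any existing benchmark.

```python
from statistics import mean, stdev


def score_submission(submitted: dict[str, float],
                     key: dict[str, float]) -> dict[str, float]:
    """Compare submitted QT values (ms) with the nonpublic reference key.

    Both arguments map an ECG identifier to a QT measurement in ms.
    """
    if submitted.keys() != key.keys():
        raise ValueError("submission must cover exactly the ECGs in the key")
    if len(key) < 2:
        raise ValueError("at least two ECGs are needed to estimate spread")
    diffs = [submitted[ecg_id] - key[ecg_id] for ecg_id in key]
    return {
        "mean_difference_ms": mean(diffs),   # systematic bias
        "sd_difference_ms": stdev(diffs),    # variability around the bias
    }
```

Because only these summary statistics go back to the submitter, the reference values themselves stay with the independent organization.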
This general procedure was already established for standard ECG measurement and interpretation algorithms in the 1980s (CSE project) and today is part of the required norms for ECG machines, such as IEC 60601-2-51.2,6
We have run our own ECG algorithms against several data sets and can prove that QT measurements are as accurate as manual measurements with less variability.
However, without a general, widely accepted reference against which all core labs and manufacturers have to prove their accuracy, industry will not take the potential risk of moving forward with innovative, automatic procedures, even though these offer tremendous efficiency gains and cost savings.
Besides validating automatic analysis of TQT studies, there is a fundamental question about the quality of QT as a biomarker. The association of long QT and the arrhythmogenic risk is well proven, but there is no direct, one-to-one relationship. QT is an indicator, but its predictivity is limited. This may lead to drugs being killed in the early development stage, although their benefit would easily justify the risk (to date only estimated with an imperfect biomarker).
The problem is well known, and several public and private organizations, including ours, are working on new ECG-based biomarkers—with very promising ideas.
For example, Couderc et al.,7 Khawaja et al.,8 and others9 have focused on T-wave morphology analysis and, not surprisingly, it seems there is significantly more information in the T-wave than in the QT interval alone. However, most of the work is being done on very small data sets; the majority of critical arrhythmias, such as TdP (Torsades de Pointes), occur very rarely and are difficult to collect in larger data sets.
To prove and validate new biomarkers, the whole industry—ECG core labs, pharma, and public research organizations—will have to work together and contribute meaningful data to share among the different parties, as recently stated in the THEW project.5
Besides TQT studies, many drug safety and efficacy trials apply ECGs to monitor patients. While a systematic QT evaluation is usually not part of these studies, the ECG is still used to ensure safety by applying a defined set of evaluations. These are commonly provided by the sponsor. In these trials, the number of ECGs to be evaluated is usually significantly higher than in standard TQT studies.
The intention of these trials is to detect any changes in the ECG during the trial relative to a baseline ECG. The current state-of-the-art is a central, digital ECG collection with manual over-read by experienced core lab staff. Again, this procedure is costly and consumes a lot of resources. In a Phase III trial, there can easily be 20,000 to 50,000 ECGs.
Interpreting ECGs in a central core lab typically takes between 12 and 48 hours after the recording, depending on the requirements of the sponsor. Usually, the shorter the time, the higher the charge by the ECG core lab. However, when an ECG shows a critical change against the baseline ECG, this turnaround means a long delay before the patient is reevaluated at the trial site and, if necessary, withdrawn from the study.
We have recently developed a new procedure to overcome these shortcomings. The intention is to increase cardiac safety of patients in a clinical trial by reducing the response time in the core lab. A secondary effect is that the cost of large ECG trials can be reduced significantly.
The basic idea behind this new procedure is that all incoming 12-lead resting ECGs are analyzed immediately, in real time, with our automatic measurement and interpretation algorithm, HES,10 on our core lab server.
In the first step, the algorithm classifies the ECGs into normal and abnormal. As classification criteria, we apply approximately 150 parameters of the ECG, including measurements, a detailed rhythm analysis, and a QRST evaluation. The 150 parameters were derived from the approximately 1,200 parameters that the algorithm calculates for each 12-lead, 10-second resting ECG.
The algorithm was specifically trained to sponsor-defined thresholds, such as QTcB > 450 ms, HR > 100 (tachycardia), and HR < 58 (bradycardia), among others. It is tuned so that its sensitivity for detecting abnormal ECGs is extremely high, near 100%. This sounds difficult, but it becomes achievable when any abnormality, even a noncritical one, is classified as abnormal.
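As a simplified illustration of this first step, the Python sketch below checks only the three example thresholds named above. It is not the HES algorithm, which uses roughly 150 parameters; the class, function, and field names are hypothetical, and heart rate is assumed to be in beats per minute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Sponsor-defined limits; defaults are the examples from the text."""
    qtcb_max_ms: float = 450.0
    hr_max_bpm: float = 100.0
    hr_min_bpm: float = 58.0


def absolute_findings(qtcb_ms: float, hr_bpm: float,
                      limits: Thresholds = Thresholds()) -> list[str]:
    """Return every threshold violation found in a single ECG."""
    findings = []
    if qtcb_ms > limits.qtcb_max_ms:
        findings.append("prolonged QTcB")
    if hr_bpm > limits.hr_max_bpm:
        findings.append("tachycardia")
    if hr_bpm < limits.hr_min_bpm:
        findings.append("bradycardia")
    return findings


def is_abnormal(qtcb_ms: float, hr_bpm: float,
                limits: Thresholds = Thresholds()) -> bool:
    """Any finding at all, critical or not, marks the ECG as abnormal."""
    return bool(absolute_findings(qtcb_ms, hr_bpm, limits))
```

Treating every single violation as abnormal is what pushes sensitivity up, at the cost of more false positives.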
In the second step, the ECG is compared to the baseline ECG of the patient under evaluation (if present already) and analyzed for any change, for example, in the PQ, QRS or QT interval (serial ECG comparison). If the sponsor defines a change in QTc of more than 10 milliseconds as a clinically significant change, and 11 milliseconds are measured, the ECG is immediately classified as abnormal, even if the absolute value of QTc is still in the normal range.
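The serial comparison can be sketched in Python as follows. This is an assumption-laden illustration rather than our production code: the function name and the dictionary keys are hypothetical, and a change is taken here as the absolute difference from baseline, so that shortening is flagged as well as prolongation.

```python
from typing import Optional


def serial_changes(current: dict[str, float],
                   baseline: Optional[dict[str, float]],
                   limits_ms: dict[str, float]) -> dict[str, float]:
    """Return the intervals whose change from baseline exceeds its limit.

    current, baseline: interval name -> value in ms (e.g. "qtc_ms").
    limits_ms: interval name -> sponsor-defined maximum change in ms.
    """
    if baseline is None:          # first ECG of the patient: nothing to compare
        return {}
    flagged = {}
    for name, limit in limits_ms.items():
        delta = current[name] - baseline[name]
        if abs(delta) > limit:    # strictly "more than" the defined change
            flagged[name] = delta
    return flagged
```

With a sponsor limit of 10 ms on QTc, a baseline of 400 ms, and a current value of 411 ms, the function returns the 11 ms change and the ECG would be treated as abnormal, although 411 ms is itself within the normal range.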
Any ECG classified as abnormal is then, in the third step, prioritized according to clinical significance of measured changes. We define up to five levels of clinical significance. The highest level of significance is over-read by core lab staff immediately and sent back to the trial site, say within 20 minutes. The lowest level, which means abnormal but not clinically significant, may take up to 48 hours, depending on the requirement of the sponsor.
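One way to picture this third step is a priority queue for the over-read work list, sketched below in Python. The class and method names are hypothetical, level 1 is assumed to denote the highest clinical significance and level 5 the lowest, and ECGs of equal level are served in order of arrival; turnaround targets are deliberately left out because they depend on the sponsor.

```python
import heapq
from itertools import count


class OverReadQueue:
    """Work list that always hands out the most significant ECG first."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._arrival = count()   # tie-breaker: earlier arrivals first

    def add(self, ecg_id: str, level: int) -> None:
        """Queue an abnormal ECG; level 1 = most significant, 5 = least."""
        if not 1 <= level <= 5:
            raise ValueError("significance level must be between 1 and 5")
        heapq.heappush(self._heap, (level, next(self._arrival), ecg_id))

    def next_ecg(self) -> str:
        """Remove and return the ECG that should be over-read next."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

A level 1 ECG that arrives late still jumps ahead of everything already waiting at lower significance.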
In conclusion, ECGs are no longer all analyzed within one fixed time frame regardless of clinical severity, as is common practice, but within the clinically appropriate time frame. This leads to significantly increased patient safety. A critical ECG is analyzed immediately, so that the patient and doctor do not have to wait 12 to 48 hours for feedback.
We have tested the new procedure with retrospective ECG data (more than 10,000 ECGs) from a Phase III trial analyzed in our core lab, and were able to detect all abnormal ECGs. Because the algorithm was trained for high sensitivity, it produced many false positives; that is, it classified ECGs as abnormal although the lab classified them as normal. But since all abnormal ECGs were detected, we conclude that in the future we could send only the ECGs classified by the algorithm as abnormal for manual over-read, and trust the algorithm when it classifies an ECG as normal.
In our example Phase III trial, we could have saved more than 50% of the manual over-read without any loss of patient safety. This would have saved the sponsor a lot of money. [Detailed results of our study will soon be published.]
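The quantities discussed here (sensitivity, false positives, and the share of manual over-reads that could be skipped) can be computed as in the following Python sketch. The function name and the input layout are hypothetical, and the example values in the usage note are invented for illustration; they are not results from our trial.

```python
def screening_summary(algo_abnormal: list[bool],
                      lab_abnormal: list[bool]) -> dict[str, float]:
    """Compare algorithm flags with the core lab's manual classification."""
    if len(algo_abnormal) != len(lab_abnormal):
        raise ValueError("both lists must describe the same ECGs")
    if not algo_abnormal:
        raise ValueError("at least one ECG is required")
    pairs = list(zip(algo_abnormal, lab_abnormal))
    tp = sum(1 for a, l in pairs if a and l)          # correctly flagged
    fn = sum(1 for a, l in pairs if not a and l)      # missed abnormal ECGs
    fp = sum(1 for a, l in pairs if a and not l)      # flagged but normal
    tn = sum(1 for a, l in pairs if not a and not l)  # correctly passed
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "false_positives": fp,
        # ECGs the algorithm calls normal would not be sent for over-read
        "over_read_saved": (tn + fn) / len(pairs),
    }
```

For eight made-up ECGs, of which the algorithm flags three and the lab confirms two, the function reports a sensitivity of 1.0, one false positive, and 62.5% of over-reads saved. The saving is only acceptable as long as the number of missed abnormal ECGs stays at zero.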
Although the cardiac safety industry has talked for some time about a fully automatic ECG analysis approach to TQT, to date no methods have been proven well enough to persuade the pharmaceutical industry to apply them in a real trial. We need a concerted effort, across the whole industry and within academic institutions, to generate large enough data sets to validate fully automatic procedures.
To improve cardiac safety, we also need further private/public partnerships to foster research and development of new biomarkers besides QT, which is a good but far from perfect risk-stratification marker. In non-TQT ECG trials, automatic analysis has reached a level of accuracy that should allow it, after a few more validation trials, to be applied in real trials.
Tosja K. Zywietz is head scientist, cardiac technology, Cardinal Health Research Services and general manager, BIOSIGNA, Lindwurmstrasse 109, 80337 Munich, Germany, email: firstname.lastname@example.org
1. ICH E14, EMEA, European Medicines Agency, 2005.
2. J.L. Willems, P. Arnaud, J.H. van Bemmel, P.J. Bourdillon et al., "Development of a Reference Library for Multi-lead ECG Measurement Programs," Journal of Electrocardiology, 20, 56-61 (1987).
3. R. Handzel, C. Garnett, M. Li, S. McNitt et al., "Computers in Cardiology," IEEE Computer Society Press, 35, 693-696 (2008).
4. Computers in Cardiology Conference, QT Challenge 2006, http://www.physionet.org/challenge/2006/.
5. THEW, Telemetric and Holter ECG Warehouse, http://thew-project.org/.
6. International Electrotechnical Commission, International norm: IEC 60601-2-51, Medical Electrical Equipment Part 2-51: Particular Requirements for Safety, including Essential Performance of Recording and Analysing Single Channel and Multichannel Electrocardiographs.
7. J.P. Couderc, M. Zhou, N. Sarapa, W. Zareba, "Investigating the Effect of Sotalol on the Repolarization Intervals in Healthy Young Individuals," Journal of Electrocardiology, 41, 595-602 (2008).
8. A. Khawaja, G. Butrous, O. Doessel, "Detecting Predisposition to Torsade De Points Using a PCA-Based Method," Computers in Cardiology, 33, 161-164 (2006).
9. C. Graff, J. Matz, M.P. Andersen, J.K. Kanters et al., Computers in Cardiology, IEEE Computer Society Press, 35, 319-322 (2008).
10. C. Zywietz, D. Borovsky, G. Göttsch, G. Joseph, "Methodology of ECG Interpretation in the Hannover Program," Methods of Information in Medicine, 29, 375-385 (1990).