One of the most fascinating characteristics of artificial intelligence (AI) is its ability to “see” things that are invisible to the human eye. Recognizing patterns where noise and random variation abound often seems like something of a “magic trick”. To perform these “tricks”, machine learning (ML) algorithms analyze massive amounts of data with complex statistical models and use these data to learn how to perform a specific task without explicit instructions on how to do it. In healthcare, this disruptive technology is now being used in several settings, ranging from medical image analysis to predictive analytics.
Electrocardiography (ECG) is a commonly used diagnostic tool in patients with suspected acute pulmonary embolism (PE) assessed in the emergency department (ED). Although ECG findings are not specific to acute PE, they can help in the diagnosis of this condition. ECG abnormalities associated with acute PE may include sinus tachycardia, the S1Q3T3 pattern, right bundle branch block, and an array of other nonspecific changes. Of course, while ECG abnormalities may suggest the presence of acute PE, they are not diagnostic on their own, and other diagnostic tests, such as computed tomography (CT) angiography, are needed to confirm the diagnosis.
In their complexity, ECG signals have several features that make them amenable to the use of ML. They come in a standardized format, provide objective measurements, have several clear and distinct patterns, and can be collected quickly and non-invasively. It should, therefore, be no surprise that several groups are exploring the potential of ML to improve the diagnostic accuracy of the 12-lead ECG in patients with suspected PE. Beyond improved accuracy, the potential benefits of using these algorithms may also include greater speed and better cost-effectiveness in diagnosing this life-threatening condition.
In this issue of the Portuguese Journal of Cardiology, Silva et al. present their results on using AI to improve the diagnostic accuracy of the 12-lead ECG in the setting of suspected PE.1 Using a dataset of 1014 ECGs from patients admitted to the ED who underwent pulmonary CT for this reason, the authors developed and tested ML models to predict the presence of acute PE. The performance of the ML model was compared with the guideline-recommended clinical prediction rules (the Wells and Geneva scores) and showed greater specificity (100%), with a sensitivity of 50%. Importantly, the ML model also outperformed the lesser-known Daniel ECG score, which uses the same input (12-lead ECG) for this purpose.
The results are certainly promising. However, there are important considerations that must be taken into account. First and foremost, there is a significant risk that this algorithm may not work as well as expected in populations that differ, sometimes even slightly, from the one in which it was derived. The authors made a significant effort to minimize overfitting and provide good measures of internal validation of the model. Nevertheless, before the model can be used in clinical practice, external validation and continuous reappraisal of its performance seem mandatory. The optimal use of this tool also remains to be determined. Should it be used alone or in combination with clinical scores? Will its relatively low sensitivity impact its usefulness?
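To make the sensitivity and specificity figures quoted above concrete, the following minimal Python sketch shows how both metrics are derived from the counts of a confusion matrix. The function name `sensitivity_specificity` and the counts in the usage example are hypothetical: they were chosen only to reproduce a 50% sensitivity and a 100% specificity and are not taken from the study by Silva et al.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: patients with PE flagged by the model (true positives)
    fn: patients with PE missed by the model (false negatives)
    tn: patients without PE correctly cleared (true negatives)
    fp: patients without PE wrongly flagged (false positives)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without PE")
    sensitivity = tp / (tp + fn)  # share of true PE cases that are detected
    specificity = tn / (tn + fp)  # share of non-PE cases that are cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only (NOT data from the study).
sens, spec = sensitivity_specificity(tp=10, fn=10, tn=80, fp=0)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Under these illustrative counts there are no false positives, so every positive prediction corresponds to a true PE, whereas half of the PE cases are missed. This is why a test with such a profile may help to rule in the diagnosis but cannot, on its own, rule it out.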
Finally, as with any new technology, some ethical concerns arise. For example, there is the risk of over-reliance on ML algorithms, potentially leading to a reduction in the expertise and judgment of physicians. There is also the possibility of bias in the algorithms themselves, particularly if the data used to train them are not diverse enough. To address these concerns, it is important for ML algorithms to be developed and tested in the same populations in which they are to be employed, with a focus on ensuring that they enhance, rather than replace, human expertise in the diagnosis and treatment of acute PE. This requires collaboration between ML experts and healthcare professionals, and a commitment to ongoing evaluation and refinement of the algorithms over time. So, rather than a finish line, this study is more of a starting point from which great things may hopefully be built.
Conflicts of interest
The author has no conflicts of interest to declare.