Syed Moin Hassan, MD

(He/Him/His)

Rank

Instructor

Department

Medicine

Sleep and Circadian Disorders

Authors

S. Hassan, MD*¹, R. Mylvaganam, MD², T. Didebulidze, MD³, M. Khalid, MD³, Pietro Nardelli, PhD⁵, G. Piazza, MD, PhD⁴, R. San Jose Estepar, PhD⁵, M. J. Cuttica, MD⁶, G. R. Washko, MD¹, F. N. Rahaghi, MD, PhD¹

1. Division of Pulmonary and Critical Care, Brigham and Women's Hospital, Harvard Medical School, US
2. Division of Pulmonary and Critical Care, Northwestern Memorial Hospital, US
3. Department of Medicine, Mass General Brigham at Salem Hospital, US
4. Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
5. Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
6. Division of Pulmonary and Critical Care, Northwestern University Feinberg School of Medicine, US

Principal Investigator

F. N. Rahaghi, MD, PhD

Leveraging Hybrid Natural Language Processing Techniques for Large-Scale Pulmonary Embolism Identification: Development and Validation of an Iterative and Novel Machine Learning and Rule-Based Pipeline

Abstract

Introduction: Pulmonary embolism (PE) is associated with high morbidity and mortality. Efficiently identifying PE cases from radiology reports is crucial for large-scale epidemiological research and clinical decision support.
Methods: This retrospective study developed and validated a hybrid natural language processing (NLP) pipeline. Four machine learning models (Logistic Regression, Random Forest, Support Vector Machine [SVM], and Naive Bayes) were trained and fine-tuned on 1,040 reports. The best-performing model, the fine-tuned SVM, was then refined with a rule-based NLP approach on an external dataset of 49,611 reports spanning 2010 to 2022. The rule-based NLP algorithm used regular expressions to identify specific phrases indicating the absence or presence of PE, aiming to improve the model’s specificity and sensitivity.
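The rule-based refinement step described in the Methods can be illustrated with a minimal Python sketch. It assumes the classifier has already produced a binary label for each report (1 = PE present, 0 = PE absent) and shows how regular expressions might override that label. The phrase patterns and the function name refine_label are hypothetical; the abstract does not list the actual expressions or the order in which the rules were applied.

```python
import re

# Hypothetical phrase patterns: the abstract does not publish the real rules.
NEGATIVE_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\bno (?:evidence of |definite |acute )?pulmonary embol(?:ism|us|i)\b",
        r"\bnegative for (?:acute )?pulmonary embol(?:ism|us|i)\b",
        r"\bwithout (?:evidence of )?pulmonary embol(?:ism|us|i)\b",
    )
]
POSITIVE_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\b(?:acute|bilateral|segmental|saddle) pulmonary embol(?:ism|us|i)\b",
        r"\bfilling defects? (?:in|within) the [\w ]*pulmonary arter(?:y|ies)\b",
    )
]


def refine_label(report: str, model_label: int) -> int:
    """Return a PE label (1 = present, 0 = absent) after applying phrase rules.

    model_label is the upstream classifier's prediction. It is kept
    unchanged when no rule fires.
    """
    negated = any(p.search(report) for p in NEGATIVE_PATTERNS)

    # Blank out negated mentions first, so that "no acute pulmonary embolism"
    # is not also counted as an affirmative "acute pulmonary embolism".
    remainder = report
    for p in NEGATIVE_PATTERNS:
        remainder = p.sub(" ", remainder)

    affirmed = any(p.search(remainder) for p in POSITIVE_PATTERNS)

    if affirmed:
        return 1
    if negated:
        return 0
    return model_label


if __name__ == "__main__":
    # A classifier false positive corrected by a negation rule.
    print(refine_label("No evidence of pulmonary embolism.", model_label=1))
```

In this sketch an affirmative phrase outranks a negated one when both occur in the same report, which favors sensitivity; that precedence is a design assumption rather than something stated in the abstract.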
Results: The fine-tuned SVM model achieved an accuracy of 91.3% and AUC of 0.93. When deployed on the external dataset, the model’s real-time performance was assessed using 500 randomly selected reports, yielding an accuracy of 84.5%. Subsequent rule-based NLP iterations, focusing on identifying key phrases, improved the model’s specificity from 77.8% to 93.2% and PPV from 72.5% to 93.0%, reducing false positives by 1,591 cases. The final model demonstrated high sensitivity (96.4%), specificity (93.2%), and accuracy (94.8%).
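For reference, the four metrics reported in the Results are all derived from the same confusion matrix. The sketch below computes them from paired reference and predicted labels. The function name diagnostic_metrics and the example labels are illustrative assumptions and are not study data.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV, and accuracy for binary labels.

    y_true holds the reference labels and y_pred the pipeline's labels,
    with 1 = PE present and 0 = PE absent.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    def ratio(numerator, denominator):
        # Avoid division by zero when a class is absent from the sample.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


if __name__ == "__main__":
    # Illustrative labels only, not taken from the study.
    reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    for name, value in diagnostic_metrics(reference, predicted).items():
        print(f"{name}: {value:.3f}")
```

Because specificity and PPV both depend on the false-positive count, removing false positives with rule-based corrections raises both at once, which is the pattern the Results describe.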
Discussion: The hybrid NLP approach’s adaptability to differences in radiology reporting style, cohort diversity, gender, and age enhances its potential for generalizability across datasets. Fine-tuning the model with a targeted rule-based NLP algorithm allows it to be optimized for specific dataset characteristics while requiring less training data and manual annotation than pure machine learning models. With further validation, such techniques could enable large-scale research on PE epidemiology, risk factors, and outcomes across institutions.