ClinFormer: a multi-modal clinical transformer for explainable major adverse cardiovascular event prediction from electronic health records
Keywords:
Clinical Transformer, Multi-Modal EHR, Attention Mechanism, SHAP, Cross-Modal Attention, Contrastive Learning
Abstract
Background: Major adverse cardiovascular events (MACE), including acute myocardial infarction, stroke, and cardiovascular death, account for over 8 million deaths globally each year. Conventional prediction models such as the Framingham Risk Score, SCORE2, and the Pooled Cohort Equations rely on a limited set of traditional risk factors and on linear assumptions, which restricts their ability to capture complex temporal and non-linear relationships in longitudinal electronic health records (EHRs).
Methods: We propose ClinFormer, a multi-modal clinical Transformer designed to integrate five EHR modalities: laboratory results, diagnosis codes, medication records, clinical notes, and vital signs. The model employs cross-modal attention mechanisms with 12 attention heads and a model dimension of 512. ClinFormer was pre-trained using contrastive patient similarity learning on 127,438 patients from MIMIC-IV and externally validated on 38,924 patients from the eICU database. Interpretability was provided through SHAP analysis, and the model outputs calibrated probabilities.
Results: On external validation, ClinFormer achieved an AUROC of 0.943 (95% CI: 0.937–0.949), significantly outperforming the strongest baseline model, ClinicalBERT (AUROC: 0.912; p < 0.001). Calibration was strong, with an expected calibration error (ECE) of 0.031. SHAP analysis identified BNP, troponin I, and eGFR as the most influential predictors.
Conclusions: ClinFormer provides accurate, interpretable, and well-calibrated MACE prediction directly from routinely collected EHR data, supporting its potential deployment in both resource-rich and resource-constrained clinical environments.
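For readers unfamiliar with the calibration metric reported above, the following is a minimal sketch of how an expected calibration error can be computed for binary risk predictions. It is not the authors' evaluation code: the abstract does not state the binning scheme, so 10 equal-width bins are assumed here, and the function name `expected_calibration_error` and its parameters are hypothetical.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: weighted mean gap between predicted risk and observed event rate.

    Assumption (not from the paper): equal-width bins over [0, 1], 10 by default.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    # Per-bin running sums: [count, sum of predicted probs, sum of observed events]
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Map p in [0, 1] to a bin index; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += p
        bins[idx][2] += y

    total = len(probs)
    ece = 0.0
    for count, prob_sum, event_sum in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        mean_pred = prob_sum / count    # average predicted risk in the bin
        event_rate = event_sum / count  # observed event frequency in the bin
        ece += (count / total) * abs(mean_pred - event_rate)
    return ece


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the study.
    toy_probs = [0.05, 0.20, 0.35, 0.60, 0.80, 0.95]
    toy_labels = [0, 0, 1, 1, 1, 1]
    print(expected_calibration_error(toy_probs, toy_labels))
```

Under this definition, a value near zero means that predicted probabilities closely match observed event frequencies within each bin. The result depends on the number of bins and the binning scheme, so values are comparable only when those choices match.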
Copyright (c) 2026 Dr. Ramesh Murlidhar Bhatawdekar

This work is licensed under a Creative Commons Attribution 4.0 International License.