Hybrid gradient boosting with SMOTE-augmented feature engineering for high-accuracy cardiac arrhythmia detection: a comparative supervised machine learning study

Authors

  • Dr. Vaibhav Bhushan Tyagi, ISBAT University, Kampala, Uganda.

Keywords:

Cardiac Arrhythmia Detection, ECG Classification, Supervised Machine Learning, Gradient Boosting, Feature Engineering, Hyperparameter Optimization.

Abstract

Background: Cardiac arrhythmias are a major global health problem and are the cause of around 15-20% of sudden cardiac deaths each year. Timely automated detection of arrhythmias from electrocardiogram (ECG) signals remains a major challenge in clinical practice because of signal complexity, inter-patient variation, and significant class imbalance in clinical data sets.
Objective: This study proposes and evaluates a supervised machine learning pipeline for the automated binary classification of cardiac arrhythmias from multi-dimensional features extracted from the ECG. The pipeline combines gradient boosting classification, data augmentation using SMOTE, feature selection using SelectKBest, and systematic hyperparameter optimization using grid search with 5-fold stratified cross-validation.
Methods: A total of 2,000 ECG samples (970 normal and 1,030 arrhythmic) were collected and pre-processed with Z-score normalization and mean imputation; the top 12 of 20 candidate features were then selected using chi-squared feature selection. To address class imbalance, SMOTE was applied only to the training partition. Six classifiers (Gradient Boosting, Random Forest, Support Vector Machine, Decision Tree, K-Nearest Neighbors, and Logistic Regression) were trained, tuned, and benchmarked under the same experimental conditions.
Results: The proposed Gradient Boosting model attained a classification accuracy of 95.8%, precision of 96.1%, recall of 95.4%, F1-score of 95.7%, and AUC-ROC of 0.989, an improvement of 1.6–11.6 percentage points over the other baseline classifiers. The ablation experiments showed that each pipeline stage contributed significantly to overall performance, and that combining SMOTE with hyperparameter optimization yielded a 5.3% F1 gain over the baseline configuration.
Conclusion: The proposed ECG arrhythmia detection framework performs competitively with recent state-of-the-art ECG classifiers and offers an interpretable and computationally efficient method for clinically deployable arrhythmia detection. The pipeline is generalizable to other bio-signal classification applications and is fully reproducible using open-source code.
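
The abstract states that SMOTE was applied only to the training partition. The following is a minimal, illustrative sketch of that idea in plain Python, not the authors' implementation: the function names (`smote`, `oversample_training_only`), the 80/20 random split, the neighbour count `k=5`, and the toy two-feature data are all assumptions made for the example. It splits the data first and then generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, so the held-out partition contains only original samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample_training_only(X, y, test_fraction=0.2, seed=0):
    """Split first, then balance only the training partition with SMOTE.
    The split here is a simple random split (an assumption of this sketch)."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]  # never touched by SMOTE
    y_test = [y[i] for i in test_idx]

    counts = {label: y_train.count(label) for label in set(y_train)}
    minority_label = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority_label]
    minority = [x for x, lab in zip(X_train, y_train) if lab == minority_label]

    new_points = smote(minority, deficit, seed=seed)
    X_train += new_points
    y_train += [minority_label] * len(new_points)
    return X_train, y_train, X_test, y_test


# Hypothetical toy data: 60 samples of class 0 and 40 of class 1.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(60)]
X += [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(40)]
y = [0] * 60 + [1] * 40

X_train, y_train, X_test, y_test = oversample_training_only(X, y)
```

Oversampling after the split matters because synthetic points are built from real neighbours; if they were generated before splitting, near-copies of test samples could leak into training and inflate the reported metrics.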

Published

2026-01-30

How to Cite

Dr. Vaibhav Bhushan Tyagi. (2026). Hybrid gradient boosting with SMOTE-augmented feature engineering for high-accuracy cardiac arrhythmia detection: a comparative supervised machine learning study. Journal of Artificial Intelligence, Machine Learning and Neural Network, 6(1), 22–30. Retrieved from https://journal.hmjournals.com/index.php/JAIMLNN/article/view/6358
