TY - JOUR
T1 - Generative AI-Powered Synthetic Data for Enhancing Predictive Analytics in Blood Donation Supply Management: A Comparative Study of Machine Learning Models
AU - Chee Hong, Koh
AU - Chee Ling, Thong
AU - Islam, Shayla
AU - AL Kolandaisamy, Raenu
AU - Shibghatullah, Abdul Samad
AU - Shahrol Nidzam, Nazirul Nazrin
AU - Sarsam, Samer
AU - Badioze Zaman, Halimah
N1 - International Journal on Advanced Science, Engineering and Information Technology (IJASEIT) publishes fully open access journals, which means that all articles are available on the internet to all users immediately upon publication. Non-commercial use and distribution in any medium is permitted, provided the author and the journal are properly credited. https://ijaseit.insightsociety.org/index.php/ijaseit/authorguidelines
Publisher Copyright:
© (2025), (Insight Society). All rights reserved.
PY - 2025/2/28
Y1 - 2025/2/28
N2 - Maintaining a sufficient and timely blood supply is an urgent and critical challenge in public health, where even minor miscalculations can lead to life-threatening shortages. This study evaluates the performance of machine learning models to improve blood donation forecasting. Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) generated synthetic datasets that mirror real-world donation patterns to address data scarcity and variability issues. Leveraging transactional data from the Blood Bank Information System (BBISv2), a blood tracking system used by 22 main blood collection sites under the Ministry of Health (MoH) in Malaysia, 50 synthetic datasets were created and validated to ensure consistency with real data. The synthetic data showed minimal deviations from real data across key metrics, including mean (differences under 10%), variance (1 to 2 units), and skewness and kurtosis (0.03 or less). Among the models, the Random Forest algorithm demonstrated the highest performance, achieving an accuracy of 98.7%, a precision of 0.91, and an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) score of 0.92, making it the most reliable for predicting blood donation rates. Linear Regression also performed well, with an accuracy of 98.6%, while Neural Networks and Support Vector Machines (SVM) showed lower performance. This research provides a valuable tool for optimizing blood donation strategies, particularly in scenarios where real data is limited. Integrating validated synthetic data offers a novel approach for enhancing resource management in healthcare, ensuring reliable blood supply during high-demand periods.
KW - Blood donation forecasting
KW - predictive analytics and visualization
KW - generative AI
KW - random forest
KW - public health
UR - http://www.scopus.com/inward/record.url?scp=105000197371&partnerID=8YFLogxK
U2 - 10.18517/ijaseit.15.1.20498
DO - 10.18517/ijaseit.15.1.20498
M3 - Article
SN - 2088-5334
VL - 15
SP - 9
EP - 19
JO - International Journal on Advanced Science, Engineering and Information Technology
JF - International Journal on Advanced Science, Engineering and Information Technology
IS - 1
ER -