
A Multidimensional Framework for Data Quality Assessment in Heart Failure: Integrating IEEE 2801-2022 and Fairness Metrics

  • Marina Georgoula
  • Grigorios G. Kotoulas
  • Konstantina-Helen Tsarapatsani
  • Dimitrios G. Boucharas
  • Ioannis Kyprakis
  • Dimitrios Manousos
  • Andrej Preveden
  • Lazar Velicki
  • Amy Groenewegen
  • Frans Rutten
  • Borut Flis
  • Matej Pičulin
  • Peter Vračar
  • Zoran Bosnić
  • Maria Tafelmeier
  • Lars S. Maier
  • Fausto Barlocco
  • Iacopo Olivotto
  • Marta Jimenez-Blanco
  • Jose Luis Zamorano
  • Duncan Edwards
  • Prithwish Banerjee
  • Nduka C. Okwose
  • Sarah Charman
  • Djordje G. Jakovljevic
  • Manolis Tsiknakis
  • Dimitrios I. Fotiadis
    • Biomedical Research & Training Institute
    • Institute of Computer Science - FORTH
    • University of Novi Sad
    • Institute of cardiovascular diseases of Vojvodina
    • University Medical Centre Utrecht
    • University of Ljubljana
    • Universität Regensburg
    • University of Florence
    • University Hospital Ramón y Cajal
    • University of Cambridge
    • University Hospitals Coventry and Warwickshire NHS Trust
    • Newcastle University
    • Newcastle upon Tyne NHS Hospitals Foundation Trust
    • Hellenic Mediterranean University
    • University of Ioannina

    Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding > peer-review

    Abstract

    Heart failure (HF) affects over 64 million people globally and poses complex diagnostic and therapeutic challenges. Reliable clinical research in HF hinges on high-quality data. This study presents a novel data quality assessment (DQA) framework tailored to retrospective HF datasets. It adapts the criteria of IEEE standard 2801-2022 (originally written for general medical data) to the clinical and multimodal structure of HF data, and introduces a fairness-aware dimension to assess demographic representativeness. Applied to a real-world dataset of 6,039 patients and over 110,000 records across 11 clinical domains, the framework evaluates six dimensions: Completeness, Accuracy, Consistency, Compliance, Timeliness, and Fairness. Initial completeness was low (48.82%) but improved to 61.04% after cleaning via outlier correction, imputation, and schema normalization. Accuracy and compliance reached 100%, and consistency improved to 99.61%. Fairness, measured via Jensen-Shannon similarity across age, sex, and BMI, remained at 87.35%, indicating that technical cleaning did not resolve the demographic imbalance. This is the first standards-aligned, domain-adapted, and fairness-extended DQA pipeline for HF, producing a robust dataset suitable for machine learning and clinical decision support.
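
The abstract does not spell out how the Completeness and Fairness scores are computed. As a minimal sketch only, the Python below assumes completeness is the percentage of non-missing cells and that Jensen-Shannon similarity is 1 minus the base-2 Jensen-Shannon divergence between an observed demographic distribution and a reference distribution. The function names, field names, and toy records are hypothetical and are not taken from the paper.

```python
import math


def completeness(records, fields):
    """Percentage of non-missing cells; None is treated as missing (assumption)."""
    total = len(records) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return 100.0 * filled / total


def js_similarity(p, q):
    """1 - Jensen-Shannon divergence with log base 2, so the result lies in [0, 1].

    p and q are non-negative counts or probabilities over the same bins.
    """
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    # Mixture distribution used by the Jensen-Shannon divergence.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return 1.0 - jsd


# Hypothetical toy records: 4 of the 6 cells are filled.
records = [
    {"age": 71, "sex": "F", "bmi": None},
    {"age": None, "sex": "M", "bmi": 27.4},
]
score = completeness(records, ["age", "sex", "bmi"])

# Hypothetical sex distribution in a cohort versus a balanced reference.
similarity = js_similarity([70, 30], [50, 50])
```

Under these assumptions, identical distributions give a similarity of 1 and fully disjoint ones give 0, which is one way a percentage such as the reported 87.35% could be read; the paper's exact definition may differ.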
    Original language: English
    Title of host publication: 2025 IEEE 25th International Conference on Bioinformatics and Bioengineering (BIBE)
    Publisher: IEEE
    Pages: 456-463
    Number of pages: 8
    ISBN (Electronic): 979-8-3315-5899-4
    ISBN (Print): 979-8-3315-5900-7
    Publication status: E-pub ahead of print - 11 Dec 2025
    Event: 25th International Conference on Bioinformatics and Bioengineering, China
    Duration: 11 Aug 2025 to 13 Aug 2025

    Publication series

    Name: 2025 IEEE 25th International Conference on Bioinformatics and Bioengineering (BIBE)
    Publisher: IEEE
    ISSN (Print): 2159-5410
    ISSN (Electronic): 2471-7819

    Conference

    Conference: 25th International Conference on Bioinformatics and Bioengineering
    Abbreviated title: BIBE
    Country/Territory: China
    Period: 11/08/25 to 13/08/25

    Bibliographical note

    Publisher Copyright:
    © 2025 IEEE.

    Funding

    Research supported by the STRATIFYHF project, which has received funding from the European Union's H2020 research and innovation program under grant agreement No 101080905. This article reflects only the authors' views. The European Commission is not responsible for any use that may be made of the information it contains.

    Funders (funder number)
    European Commission
    Horizon Europe (101080905)

    Keywords

    • Clinical Decision Support
    • Data Cleaning
    • Data Quality Assessment
    • Heart Failure
    • Retrospective Clinical Data

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Biomedical Engineering
    • Health Informatics
    • Radiology, Nuclear Medicine and Imaging
