Distilling knowledge from publicly available online EMR data to emerging epidemic for prognosis

Liantao Ma, Xinyu Ma, Junyi Gao, Xianfeng Jiao, Zhihao Yu, Chaohe Zhang, Wenjie Ruan, Yasha Wang, Wen Tang, Jiangtao Wang

    Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding > peer-review

    20 Citations (Scopus)
    260 Downloads (Pure)


    Due to the characteristics of COVID-19, the epidemic develops rapidly and overwhelms health service systems worldwide. Many patients suffer from life-threatening systemic problems and need to be carefully monitored in ICUs. An intelligent prognosis can help physicians intervene early, prevent adverse outcomes, and optimize medical resource allocation, which is urgently needed, especially in this ongoing global pandemic crisis. However, in the early stage of an epidemic outbreak, the data available for analysis is limited due to the lack of effective diagnostic mechanisms, the rarity of the cases, and privacy concerns. In this paper, we propose a distilled transfer learning framework, which leverages existing publicly available online Electronic Medical Records (EMR) to enhance the prognosis for inpatients with emerging infectious diseases. It learns to embed the COVID-19-related medical features based on massive existing EMR data. The transferred parameters are further trained, through distillation, to imitate the representation of a teacher model, which embeds the health status more comprehensively on the source dataset. We conduct Length-of-Stay prediction experiments for patients in ICUs on real-world COVID-19 datasets. The experimental results indicate that our proposed model consistently outperforms competitive baseline methods. To further verify the scalability of the proposed framework in dealing with different clinical tasks on different EMR datasets, we conduct an additional mortality prediction experiment on End-Stage Renal Disease datasets. The extensive experiments demonstrate that the proposed framework can benefit the prognosis for emerging pandemics and other diseases with limited EMR.
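The abstract describes a student model that is trained both on the prognosis task and to imitate a teacher model's representation. A minimal sketch of such a combined objective is given below. It is only an illustration under stated assumptions: the function names, the use of mean squared error for both terms, and the weight `alpha` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only. The names, the MSE choice for both terms, and the
# weighting scheme are assumptions, not the authors' implementation.

def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_objective(student_repr, teacher_repr,
                           predicted_los, true_los, alpha=0.5):
    """Weighted sum of a task loss and a representation-imitation loss.

    student_repr, teacher_repr: hidden representations of the same patient
        from the student and the (frozen) teacher model.
    predicted_los, true_los: predicted and observed length-of-stay values.
    alpha: hypothetical weight on the imitation term, between 0 and 1.
    """
    imitation = mse(student_repr, teacher_repr)  # pull student toward teacher
    task = mse(predicted_los, true_los)          # fit the prognosis target
    return (1 - alpha) * task + alpha * imitation


# Toy values: the student already matches the teacher, so only the
# task term contributes to the loss.
loss = distillation_objective(
    student_repr=[0.2, 0.4, 0.6],
    teacher_repr=[0.2, 0.4, 0.6],
    predicted_los=[3.0, 5.0],
    true_los=[4.0, 5.0],
)
```

In a real training loop the two terms would be computed on model outputs and minimized with a gradient-based optimizer; the plain-Python version here only shows how the imitation and task terms might be combined.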

    Original language: English
    Title of host publication: The Web Conference 2021 - Proceedings of the World Wide Web Conference, WWW 2021
    Editors: Jure Leskovec, Marko Grobelnik, Marc Najork, Jie Tang, Leila Zia
    Publisher: Association for Computing Machinery, Inc
    Number of pages: 11
    ISBN (Electronic): 9781450383127
    Publication status: Published - 19 Apr 2021
    Event: 2021 World Wide Web Conference - Ljubljana, Slovenia
    Duration: 19 Apr 2021 to 23 Apr 2021

    Publication series

    Name: Proceedings of the Web Conference 2021


    Conference: 2021 World Wide Web Conference
    Abbreviated title: WWW 2021

    Bibliographical note

    Funding Information:
    This work is supported by the National Natural Science Foundation of China (61772045), the Project 2019BD005 PKU-Baidu fund, and Peking University Medicine Seed Fund for Interdisciplinary Research (BMU2020MI010). WR is supported by ORCA PRF Project (EP/R026173/1).


    Keywords

    • Electronic Medical Record
    • Healthcare Informatics
    • Prognosis
    • Transfer Learning

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Software

