Convolutional Neural Network for CORE Sections Identification in Scientific Research Publications

Bello Aliyu Muhammad, Rahat Iqbal, Anne James, Dianabasi Nkantah

    Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding › peer-review

    1 Citation (Scopus)

    Abstract

    The volume of data generated online continues to grow at an exponential rate, and over 80% of it is unstructured. Scientific research publications constitute a significant portion of this unstructured data. Systematic literature review (SLR) is a rigorous and challenging process, and its key challenge is the automatic extraction of relevant data from the large volume of publications. The lack of a unified framework has been identified as the key problem. A canonical model, based on the structure of the papers, was proposed for data extraction in SLR. Implemented as a classification problem, the canonical model was realized using traditional machine learning models, which reported good accuracy. However, there is room for improvement. This paper presents the results of investigating the same problem using a convolutional neural network (CNN), a deeper and more sophisticated model. The results show an improvement over the traditional machine learning models, with an accuracy of 85%. Unlike previous CNN work in NLP, this work also demonstrates the application of a CNN to a larger NLP dataset, such as data from scientific research publications. The results also show that the CNN performs even better on NLP tasks with larger datasets.
    Original language: English
    Title of host publication: Intelligent Data Engineering and Automated Learning – IDEAL 2019
    Subtitle of host publication: 20th International Conference, Manchester, UK, November 14–16, 2019, Proceedings, Part I
    Editors: Hujun Yin, David Camacho, Peter Tino, Antonio J. Tallón-Ballesteros, Ronaldo Menezes, Richard Allmendinger
    Publisher: Springer International Publishing
    Pages: 265-273
    Number of pages: 9
    Edition: 1
    ISBN (Electronic): 978-3-030-33607-3
    ISBN (Print): 978-3-030-33606-6
    Publication status: Published - 14 Nov 2019
    Event: 20th International Conference on Intelligent Data Engineering and Automated Learning - Manchester, United Kingdom
    Duration: 14 Nov 2019 to 16 Nov 2019
    Conference number: 20th
    http://www.confercare.manchester.ac.uk/events/ideal2019/

    Publication series

    Name: Information Systems and Applications, incl. Internet/Web, and HCI
    Volume: 11871
    Name: Lecture Notes in Computer Science
    Volume: 11871
    ISSN (Print): 0302-9743

    Conference

    Conference: 20th International Conference on Intelligent Data Engineering and Automated Learning
    Abbreviated title: IDEAL 2019
    Country/Territory: United Kingdom
    City: Manchester
    Period: 14/11/19 to 16/11/19