Abstract
The volume of data generated online continues to grow at an exponential rate. Over 80% of such data is unstructured, and scientific research publications constitute a significant portion of it. Systematic literature review (SLR) is a rigorous and challenging process. The key challenge in SLR is the automatic extraction of relevant data from the large volume of publications, and the lack of a unified framework has been identified as the key problem. A canonical model, based on the structure of the papers, was previously proposed for data extraction in SLR. Framed as a classification problem, the canonical model was realized using traditional machine learning models, which reported good accuracy. However, there is room for improvement. This paper presents the results of investigating the same problem using a convolutional neural network (CNN), a more sophisticated (deeper) model. The results show an improvement over the traditional machine learning models, with an accuracy of 85%. Unlike previous CNN work in NLP, this work also demonstrates the application of a CNN to a larger NLP dataset, such as data from scientific research publications. The results also show that the CNN performs even better on NLP tasks with larger datasets.
Original language | English |
---|---|
Title of host publication | Intelligent Data Engineering and Automated Learning – IDEAL 2019 |
Subtitle of host publication | 20th International Conference, Manchester, UK, November 14–16, 2019, Proceedings, Part I |
Editors | Hujun Yin, David Camacho, Peter Tino, Antonio J. Tallón-Ballesteros, Ronaldo Menezes, Richard Allmendinger |
Publisher | Springer International Publishing |
Pages | 265 - 273 |
Number of pages | 9 |
Edition | 1 |
ISBN (Electronic) | 978-3-030-33607-3 |
ISBN (Print) | 978-3-030-33606-6 |
Publication status | Published - 14 Nov 2019 |
Event | 20th International Conference on Intelligent Data Engineering and Automated Learning, Manchester, United Kingdom. Duration: 14 Nov 2019 → 16 Nov 2019. Conference number: 20th. http://www.confercare.manchester.ac.uk/events/ideal2019/ |
Publication series
Name | Information Systems and Applications, incl. Internet/Web, and HCI |
---|---|
Volume | 11871 |
Name | Lecture Notes in Computer Science |
---|---|
Volume | 11871 |
ISSN (Print) | 0302-9743 |
Conference
Conference | 20th International Conference on Intelligent Data Engineering and Automated Learning |
---|---|
Abbreviated title | IDEAL 2019 |
Country/Territory | United Kingdom |
City | Manchester |
Period | 14/11/19 → 16/11/19 |