Abstract
Plagiarism has grown steadily over the last few years owing to the rapid proliferation of information through the World Wide Web (WWW). In this paper, we present an integrated approach to intrinsic plagiarism detection based on Latent Semantic Indexing (LSI) and stylometry. LSI is applied to the term-document matrix of the dataset, whereas stylometry is used to approximate an author's writing style intrinsically. We conducted a series of experiments on a small corpus to investigate the effect of the dimensionality reduction (DR) parameter, which is central to the LSI technique. We then carried out a comparative evaluation, first applying LSI and stylometry separately and then applying them together. Our results show that performance improved when LSI and stylometry were applied together as an integrated approach.
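The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the raw term-count weighting, and the two stylometric features (mean word length and type-token ratio) are assumptions chosen only for illustration, and `k` stands in for the DR parameter. For intrinsic detection, the input texts would typically be segments of a single document.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw term-count matrix (terms x documents) from whitespace tokens."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1
    return A, vocab

def lsi_document_vectors(A, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))  # k cannot exceed the number of singular values
    return (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

def style_features(text):
    """Two toy stylometric features (assumed): mean word length, type-token ratio."""
    words = text.lower().split()
    if not words:
        return np.zeros(2)
    mean_len = np.mean([len(w) for w in words])
    ttr = len(set(words)) / len(words)
    return np.array([mean_len, ttr])

def combined_vectors(docs, k):
    """Concatenate LSI coordinates with stylometric features for each text."""
    A, _ = term_document_matrix(docs)
    lsi = lsi_document_vectors(A, k)
    style = np.vstack([style_features(d) for d in docs])
    return np.hstack([lsi, style])
```

Calling `combined_vectors(segments, k)` on a list of text segments yields one row per segment; segments whose rows deviate strongly from the rest would be candidates for closer inspection. Varying `k` is one way to explore how the degree of dimensionality reduction affects the result.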
Original language | English |
---|---|
Title of host publication | Proceedings - 2013 6th International Conference on Developments in eSystems Engineering, DeSE 2013 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 145-150 |
Number of pages | 6 |
ISBN (Electronic) | 9781479952649 |
Publication status | Published - 11 Feb 2013 |
Event | 2013 6th International Conference on Developments in eSystems Engineering - Abu Dhabi, United Arab Emirates Duration: 16 Dec 2013 → 18 Dec 2013 |
Conference
Conference | 2013 6th International Conference on Developments in eSystems Engineering |
---|---|
Abbreviated title | DeSE 2013 |
Country/Territory | United Arab Emirates |
City | Abu Dhabi |
Period | 16/12/13 → 18/12/13 |
Keywords
- Extrinsic plagiarism
- Intrinsic plagiarism
- Latent semantic indexing (LSI)
- Plagiarism
- Stylometry technique
- Text misuse
ASJC Scopus subject areas
- Control and Systems Engineering
- Computer Networks and Communications
- Computer Science Applications
- Software