A fast and efficient semantic short text similarity metric

David Croft, Simon Coupland, Jethro Shell, Stephen Brown

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding

14 Citations (Scopus)

Abstract

The semantic comparison of short sections of text is an emerging aspect of Natural Language Processing (NLP). In this paper we present a novel Short Text Semantic Similarity (STSS) method, Lightweight Semantic Similarity (LSS), to address the issues that arise with sparse text representation. The proposed approach captures the semantic information contained when comparing text to process the similarity. The methodology combines semantic term similarities with a vector similarity method used within statistical analysis. A modification of the term vectors using synset similarity values addresses issues that are encountered with sparse text. LSS is shown to be comparable to current semantic similarity approaches, LSA and STASIS, whilst having a lower computational footprint.
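
The abstract names the ingredients (semantic term similarities, a vector similarity measure, and term vectors modified by synset similarity values) but not the exact formulas. The sketch below is one plausible reading, not the authors' implementation. It assumes cosine similarity as the vector measure and uses a small hand-written lookup table (`TOY_SIM`, hypothetical) in place of WordNet synset similarity. All function names are illustrative.

```python
from math import sqrt

# Hypothetical stand-in for a WordNet-style synset similarity lookup.
# The values are invented for illustration only.
TOY_SIM = {
    frozenset(("car", "automobile")): 1.0,
    frozenset(("fast", "quick")): 0.9,
}


def exact_match(a, b):
    """Baseline term similarity: 1.0 for identical terms, else 0.0."""
    return 1.0 if a == b else 0.0


def toy_semantic(a, b):
    """Assumed semantic term similarity backed by the toy table above."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def term_vector(tokens, vocabulary, term_sim):
    """Weight each vocabulary term by its best similarity to any token.

    With exact_match this is an ordinary binary term vector. With a
    semantic similarity, the zeros of the sparse vector are replaced by
    graded values for related terms.
    """
    return [
        max((term_sim(term, tok) for tok in tokens), default=0.0)
        for term in vocabulary
    ]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(text_a, text_b, term_sim):
    """Compare two short texts over their joint vocabulary."""
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split()
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    return cosine(
        term_vector(tokens_a, vocabulary, term_sim),
        term_vector(tokens_b, vocabulary, term_sim),
    )


if __name__ == "__main__":
    a = "the car is fast"
    b = "the automobile is quick"
    print(similarity(a, b, exact_match))
    print(similarity(a, b, toy_semantic))
```

With the toy table, the two example sentences share only "the" and "is", so the exact-match vectors give a cosine of 0.5, while the semantically adjusted vectors give about 0.998. This illustrates the sparse-text issue the abstract refers to: short texts with few literal overlaps can still be scored as similar once related terms contribute to the vectors.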
Original language: English
Title of host publication: 13th UK Workshop on Computational Intelligence (UKCI), 2013
Publisher: IEEE
Pages: 221-227
Number of pages: 7
ISBN (Print): 978-1-4799-1568-2
DOI: https://doi.org/10.1109/UKCI.2013.6651309
Publication status: Published - 2013
Event: 13th UK Workshop on Computational Intelligence (UKCI) 2013 - University of Surrey, Guildford, United Kingdom
Duration: 9 Sep 2013 to 11 Sep 2013
Conference number: 13
http://ukci2013.cs.surrey.ac.uk/

Workshop

Workshop: 13th UK Workshop on Computational Intelligence (UKCI) 2013
Abbreviated title: UKCI 2013
Country: United Kingdom
City: Guildford
Period: 9/09/13 to 11/09/13

Fingerprint

Semantics
Statistical methods
Processing

Keywords

  • Vectors
  • Semantics
  • Measurement
  • Natural language processing
  • Educational institutions
  • Media
  • Electronic mail

Cite this

Croft, D., Coupland, S., Shell, J., & Brown, S. (2013). A fast and efficient semantic short text similarity metric. In 13th UK Workshop on Computational Intelligence (UKCI), 2013 (pp. 221-227). IEEE. https://doi.org/10.1109/UKCI.2013.6651309

A fast and efficient semantic short text similarity metric. / Croft, David; Coupland, Simon; Shell, Jethro; Brown, Stephen.

13th UK Workshop on Computational Intelligence (UKCI), 2013. IEEE, 2013. p. 221-227.

Croft, D, Coupland, S, Shell, J & Brown, S 2013, A fast and efficient semantic short text similarity metric. in 13th UK Workshop on Computational Intelligence (UKCI), 2013. IEEE, pp. 221-227, 13th UK Workshop on Computational Intelligence (UKCI) 2013, Guildford, United Kingdom, 9/09/13. https://doi.org/10.1109/UKCI.2013.6651309
Croft D, Coupland S, Shell J, Brown S. A fast and efficient semantic short text similarity metric. In 13th UK Workshop on Computational Intelligence (UKCI), 2013. IEEE. 2013. p. 221-227. https://doi.org/10.1109/UKCI.2013.6651309
Croft, David; Coupland, Simon; Shell, Jethro; Brown, Stephen. / A fast and efficient semantic short text similarity metric. 13th UK Workshop on Computational Intelligence (UKCI), 2013. IEEE, 2013. pp. 221-227
@inproceedings{118a61c72e3f4233b10575211afa74c2,
title = "A fast and efficient semantic short text similarity metric",
abstract = "The semantic comparison of short sections of text is an emerging aspect of Natural Language Processing (NLP). In this paper we present a novel Short Text Semantic Similarity (STSS) method, Lightweight Semantic Similarity (LSS), to address the issues that arise with sparse text representation. The proposed approach captures the semantic information contained when comparing text to process the similarity. The methodology combines semantic term similarities with a vector similarity method used within statistical analysis. A modification of the term vectors using synset similarity values addresses issues that are encountered with sparse text. LSS is shown to be comparable to current semantic similarity approaches, LSA and STASIS, whilst having a lower computational footprint.",
keywords = "Vectors, Semantics, Measurement, Natural language processing, Educational institutions, Media, Electronic mail",
author = "David Croft and Simon Coupland and Jethro Shell and Stephen Brown",
year = "2013",
doi = "10.1109/UKCI.2013.6651309",
language = "English",
isbn = "978-1-4799-1568-2",
pages = "221--227",
booktitle = "13th UK Workshop on Computational Intelligence (UKCI), 2013",
publisher = "IEEE",
address = "United States",

}

TY - GEN

T1 - A fast and efficient semantic short text similarity metric

AU - Croft, David

AU - Coupland, Simon

AU - Shell, Jethro

AU - Brown, Stephen

PY - 2013

Y1 - 2013

N2 - The semantic comparison of short sections of text is an emerging aspect of Natural Language Processing (NLP). In this paper we present a novel Short Text Semantic Similarity (STSS) method, Lightweight Semantic Similarity (LSS), to address the issues that arise with sparse text representation. The proposed approach captures the semantic information contained when comparing text to process the similarity. The methodology combines semantic term similarities with a vector similarity method used within statistical analysis. A modification of the term vectors using synset similarity values addresses issues that are encountered with sparse text. LSS is shown to be comparable to current semantic similarity approaches, LSA and STASIS, whilst having a lower computational footprint.

KW - Vectors

KW - Semantics

KW - Measurement

KW - Natural language processing

KW - Educational institutions

KW - Media

KW - Electronic mail

U2 - 10.1109/UKCI.2013.6651309

DO - 10.1109/UKCI.2013.6651309

M3 - Conference proceeding

SN - 978-1-4799-1568-2

SP - 221

EP - 227

BT - 13th UK Workshop on Computational Intelligence (UKCI), 2013

PB - IEEE

ER -