A New Semantic Similarity Scheme for more Accurate Identification in Medical Data

Colin Wilcox, Soufiene Djahel, Vasilios Giagos, Kristopher Welsh, Nicholas Costen

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding > peer-review


This paper aims to design a new measure of similarity between personal textual information retrieved from historic medical records, in order to correct errors introduced by poor encoding and data omission. The key motivation underlying our proposed layered algorithm, named Semantic Similarity scheme (SSIM), is to create a consistent, complete and accurate data set that may then be used as a basis for the identification and authentication of individuals in a medical context. Such consistent data may provide a basis for use as part of an access control system without compromising medical ethics or security. The obtained evaluation results, using four sample data sets from the UK, USA, Canada and Australia, highlight promising benefits compared to other similarity measures, including the Jaccard index, Sorensen-Dice and cosine similarity, especially when nicknames, abbreviations and synonyms are used to determine similarity.
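For readers unfamiliar with the baseline measures named in the abstract, the following is a minimal sketch of the Jaccard index, Sorensen-Dice coefficient and cosine similarity applied to two name strings. It is not the SSIM algorithm, which the abstract does not specify. The function names are hypothetical, and the use of lower-cased character bigrams as the comparison unit is an assumption made for illustration only.

```python
from collections import Counter
from math import sqrt


def bigrams(text: str) -> list[str]:
    """Split a string into overlapping, lower-cased character bigrams.

    Assumption: character bigrams are the comparison unit; the abstract
    does not say how the baseline measures were tokenised.
    """
    s = text.lower().strip()
    return [s[i:i + 2] for i in range(len(s) - 1)]


def jaccard(a: str, b: str) -> float:
    """Jaccard index: |A & B| / |A | B| over the sets of bigrams."""
    set_a, set_b = set(bigrams(a)), set(bigrams(b))
    union = set_a | set_b
    # Return 0.0 when neither string yields a bigram.
    return len(set_a & set_b) / len(union) if union else 0.0


def sorensen_dice(a: str, b: str) -> float:
    """Sorensen-Dice coefficient: 2|A & B| / (|A| + |B|)."""
    set_a, set_b = set(bigrams(a)), set(bigrams(b))
    total = len(set_a) + len(set_b)
    return 2 * len(set_a & set_b) / total if total else 0.0


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the bigram count vectors."""
    count_a, count_b = Counter(bigrams(a)), Counter(bigrams(b))
    dot = sum(count_a[g] * count_b[g] for g in count_a)
    norm = (sqrt(sum(v * v for v in count_a.values()))
            * sqrt(sum(v * v for v in count_b.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # A formal name and a common nickname share few character bigrams.
    for name_a, name_b in [("William", "Bill"), ("Robert", "Bob")]:
        print(name_a, name_b,
              round(jaccard(name_a, name_b), 3),
              round(sorensen_dice(name_a, name_b), 3),
              round(cosine(name_a, name_b), 3))
```

Under these assumptions, a pair such as "William" and "Bill" shares only two of seven distinct bigrams, so all three surface-level measures score it well below 1 even though both names may refer to the same person. This is the kind of case (nicknames, abbreviations, synonyms) that the abstract singles out.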
Original language: English
Title of host publication: Proceedings of 2023 IEEE International Smart Cities Conference, ISC2 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9798350397758
ISBN (Print): 9798350397758, 9798350397741, 9798350397765
Publication status: Published - 31 Oct 2023
Externally published: Yes
Event: IEEE International Smart Cities Conference (ISC2) - Bucharest, Romania
Duration: 24 Sept 2023 to 27 Sept 2023

Publication series

Name: Proceedings of the IEEE International Smart Cities Conference
ISSN (Print): 2687-8852
ISSN (Electronic): 2687-8860


Conference: IEEE International Smart Cities Conference (ISC2)

Keywords


  • Medical Data
  • Medical Records
  • Semantic Similarity
  • Text Similarity

ASJC Scopus subject areas

  • Artificial Intelligence
  • Communication
  • Energy Engineering and Power Technology
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Renewable Energy, Sustainability and the Environment
  • Urban Studies

