A new online clustering approach for data in arbitrary shaped clusters

Richard Hyde, Plamen Angelov

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding › peer-review

15 Citations (Scopus)

Abstract

In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrary shaped clusters. CODAS is a two-stage process that uses a simple local density to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters, which yield a fast micro-cluster joining technique that is dimensionally independent. The micro-clusters divide the data space into sub-spaces, each with a core region and a non-core region. Intersecting core regions define the clusters. A threshold value is used to identify outlier micro-clusters separately from small clusters of unusual data. The cluster information is fully maintained online. We compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results while operating in a fully online and dimensionally scalable manner.
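The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names (`MicroCluster`, `StreamSketch`), the parameters `radius` and `min_count`, the choice of a core region with half the micro-cluster radius, and the running-mean centre update are all assumptions made for illustration. For brevity the sketch also rebuilds the cluster grouping on request, whereas the abstract states that cluster information is maintained online.

```python
import math


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class MicroCluster:
    """Hyper-spherical micro-cluster: only a centre and a count are kept."""

    def __init__(self, centre):
        self.centre = list(centre)
        self.count = 1


class StreamSketch:
    def __init__(self, radius, min_count):
        self.radius = radius
        self.core_radius = radius / 2.0  # assumption: core is half the radius
        self.min_count = min_count       # threshold separating outliers
        self.micro = []

    def add(self, point):
        """Stage 1: absorb one sample, then discard it (no data is stored)."""
        nearest = min(self.micro, key=lambda m: dist(m.centre, point),
                      default=None)
        d = dist(nearest.centre, point) if nearest is not None else None
        if nearest is None or d > self.radius:
            # Sample lies outside every micro-cluster: start a new one.
            self.micro.append(MicroCluster(point))
            return
        nearest.count += 1
        if d <= self.core_radius:
            # Assumption: only core samples move the centre (running mean).
            n = nearest.count
            nearest.centre = [c + (x - c) / n
                              for c, x in zip(nearest.centre, point)]

    def clusters(self):
        """Stage 2: join dense micro-clusters whose core regions intersect."""
        dense = [m for m in self.micro if m.count >= self.min_count]
        parent = list(range(len(dense)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(dense)):
            for j in range(i + 1, len(dense)):
                # Two equal spheres intersect when their centres are no
                # further apart than twice the sphere radius.
                gap = dist(dense[i].centre, dense[j].centre)
                if gap <= 2 * self.core_radius:
                    parent[find(i)] = find(j)

        groups = {}
        for i, m in enumerate(dense):
            groups.setdefault(find(i), []).append(m)
        return list(groups.values())
```

The join test uses only the distance between centres, so its cost per pair grows linearly with the number of dimensions and needs no grid or per-dimension index. Micro-clusters whose count stays below `min_count` are left out of the grouping, which is one simple way to treat them as outliers.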

Original language: English
Title of host publication: Proceedings - 2015 IEEE 2nd International Conference on Cybernetics, CYBCONF 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 228-233
Number of pages: 6
ISBN (Electronic): 9781479983223
DOIs
Publication status: Published - 6 Aug 2015
Externally published: Yes
Event: 2nd IEEE International Conference on Cybernetics, CYBCONF 2015 - Gdynia, Poland
Duration: 24 Jun 2015 → 26 Jun 2015

Conference

Conference: 2nd IEEE International Conference on Cybernetics, CYBCONF 2015
Country/Territory: Poland
City: Gdynia
Period: 24/06/15 → 26/06/15

Keywords

  • arbitrary shape clusters
  • big data
  • clustering
  • data streams
  • online clustering

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Electrical and Electronic Engineering
