Abstract
In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process that uses a simple local density to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters to achieve a fast micro-cluster joining technique that is dimensionally independent. The micro-clusters divide the data space into sub-spaces, each with a core region and a non-core region. Intersecting core regions define the clusters. A threshold value is used to distinguish outlier micro-clusters from small clusters of unusual data. The cluster information is fully maintained online. In this paper we compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results but in a fully online and dimensionally scalable manner.
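The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and parameter names (`MicroClusterSketch`, `radius`, `min_points`, `core_fraction`), the running-mean centre update, and the choice of a core radius equal to half the micro-cluster radius are all assumptions made for illustration. The sketch keeps only a centre and a count per micro-cluster, so no data samples are stored.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    centre: list      # running mean of the points assigned so far
    count: int = 1    # local density: number of points absorbed


class MicroClusterSketch:
    """Illustrative only; names and update rules are assumptions."""

    def __init__(self, radius, min_points, core_fraction=0.5):
        self.radius = radius                        # hyper-sphere radius
        self.min_points = min_points                # outlier threshold
        self.core_radius = core_fraction * radius   # assumed core size
        self.micro = []

    def add(self, point):
        # Stage 1: assign the point to the nearest micro-cluster,
        # or start a new one if none is within `radius`.
        best, best_d = None, math.inf
        for mc in self.micro:
            d = math.dist(mc.centre, point)
            if d < best_d:
                best, best_d = mc, d
        if best is None or best_d > self.radius:
            self.micro.append(MicroCluster(list(point)))
            return
        best.count += 1
        # Running-mean update (assumption); the point itself is discarded.
        best.centre = [c + (p - c) / best.count
                       for c, p in zip(best.centre, point)]

    def outliers(self):
        # Micro-clusters below the density threshold are treated as outliers.
        return [mc for mc in self.micro if mc.count < self.min_points]

    def clusters(self):
        # Stage 2: join dense micro-clusters whose core regions intersect,
        # i.e. whose centres are at most two core radii apart.
        dense = [mc for mc in self.micro if mc.count >= self.min_points]
        parent = list(range(len(dense)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(dense)):
            for j in range(i + 1, len(dense)):
                gap = math.dist(dense[i].centre, dense[j].centre)
                if gap <= 2 * self.core_radius:
                    parent[find(i)] = find(j)

        groups = {}
        for i, mc in enumerate(dense):
            groups.setdefault(find(i), []).append(mc)
        return list(groups.values())
```

Feeding points one at a time through `add` and then calling `clusters()` returns groups of joined micro-clusters, while `outliers()` returns the sparse ones. The joining test compares only centre distances against a fixed radius, so it involves a single distance computation per pair whatever the number of dimensions.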
Original language | English |
---|---|
Title of host publication | Proceedings - 2015 IEEE 2nd International Conference on Cybernetics, CYBCONF 2015 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 228-233 |
Number of pages | 6 |
ISBN (Electronic) | 9781479983223 |
Publication status | Published - 6 Aug 2015 |
Externally published | Yes |
Event | 2nd IEEE International Conference on Cybernetics, CYBCONF 2015 - Gdynia, Poland; Duration: 24 Jun 2015 → 26 Jun 2015 |
Conference
Conference | 2nd IEEE International Conference on Cybernetics, CYBCONF 2015 |
---|---|
Country/Territory | Poland |
City | Gdynia |
Period | 24/06/15 → 26/06/15 |
Keywords
- arbitrary shape clusters
- big data
- clustering
- data streams
- online clustering
ASJC Scopus subject areas
- Artificial Intelligence
- Computational Theory and Mathematics
- Electrical and Electronic Engineering