TY - GEN
T1 - Towards a faster symbolic aggregate approximation method
AU - Muhammad Fuad, Muhammad Marwan
AU - Marteau, Pierre François
PY - 2010/12/1
Y1 - 2010/12/1
N2 - The similarity search problem is one of the main problems in time series data mining. Traditionally, this problem was tackled by sequentially comparing the given query against all the time series in the database, and returning all the time series that are within a predetermined threshold of that query. But the large size and the high dimensionality of time series databases that are in use nowadays make that scenario inefficient. There are many representation techniques that aim at reducing the dimensionality of time series so that the search can be handled faster at a lower-dimensional space level. The symbolic aggregate approximation (SAX) is one of the most competitive methods in the literature. In this paper we present a new method that improves the performance of SAX by adding to it another exclusion condition that increases the exclusion power. This method is based on using two representations of the time series: one of SAX and the other is based on an optimal approximation of the time series. Pre-computed distances are calculated and stored offline to be used online to exclude a wide range of the search space using two exclusion conditions. We conduct experiments which show that the new method is faster than SAX.
KW - Fast SAX
KW - Symbolic aggregate approximation
KW - Time series information retrieval
UR - http://www.scopus.com/inward/record.url?scp=78751548118&partnerID=8YFLogxK
M3 - Conference proceeding
AN - SCOPUS:78751548118
SN - 9789898425225
T3 - ICSOFT 2010 - Proceedings of the 5th International Conference on Software and Data Technologies
SP - 305
EP - 310
BT - ICSOFT 2010 - Proceedings of the 5th International Conference on Software and Data Technologies
PB - IEEE
T2 - 5th International Conference on Software and Data Technologies, ICSOFT 2010
Y2 - 22 July 2010 through 24 July 2010
ER -