TY - GEN
T1 - Optimized word-size time series representation method using a genetic algorithm with a flexible encoding scheme
AU - Muhammad Fuad, Muhammad Marwan
PY - 2016/11/5
Y1 - 2016/11/5
N2 - Performing time series mining tasks directly on raw data is inefficient; these data therefore require representation methods that transform them into low-dimensional spaces where they can be managed more efficiently. Owing to its simplicity, the piecewise aggregate approximation is a popular time series representation method. But this method uses a uniform word-size for all the segments in the time series, which reduces the quality of the representation. Although some alternatives use representations with different word-sizes in a way that reflects the various information contents of different segments, such methods apply a complicated representation scheme, as they use a different representation for each time series in the dataset. In this paper we present two modifications of the original piecewise aggregate approximation. The novelty of these modifications is that they use different word-sizes, which allows for a flexible representation that reflects the level of activity in each segment, yet these new modifications address this problem at the dataset level, which simplifies establishing a lower bounding distance. The word-sizes are determined through an optimization process. The experiments we conducted on a variety of time series datasets validate the two new modifications.
AB - Performing time series mining tasks directly on raw data is inefficient; these data therefore require representation methods that transform them into low-dimensional spaces where they can be managed more efficiently. Owing to its simplicity, the piecewise aggregate approximation is a popular time series representation method. But this method uses a uniform word-size for all the segments in the time series, which reduces the quality of the representation. Although some alternatives use representations with different word-sizes in a way that reflects the various information contents of different segments, such methods apply a complicated representation scheme, as they use a different representation for each time series in the dataset. In this paper we present two modifications of the original piecewise aggregate approximation. The novelty of these modifications is that they use different word-sizes, which allows for a flexible representation that reflects the level of activity in each segment, yet these new modifications address this problem at the dataset level, which simplifies establishing a lower bounding distance. The word-sizes are determined through an optimization process. The experiments we conducted on a variety of time series datasets validate the two new modifications.
KW - Genetic algorithm
KW - Time series representation
KW - Word-size
UR - http://www.scopus.com/inward/record.url?scp=85006008117&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-49130-1_3
DO - 10.1007/978-3-319-49130-1_3
M3 - Conference proceeding
AN - SCOPUS:85006008117
SN - 9783319491295
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 26
EP - 34
BT - AIIA 2016
A2 - Adorni, Giovanni
A2 - Maratea, Marco
A2 - Cagnoni, Stefano
A2 - Gori, Marco
PB - Springer-Verlag Italia
T2 - 15th International Conference of the Italian Association for Artificial Intelligence, AIIA 2016
Y2 - 28 November 2016 through 1 December 2016
ER -