Advanced Seminar (Oberseminar) on Machine Learning: Time series representation

Friday, 15 January 2016, 12:00

Machine Learning: Talk by a professor from Turkey

When? Friday, 15 January 2016, 12:00 to 14:00

Where? Conference room C 202, university campus at Samelsonplatz, Hildesheim

Topic of the talk: “Time series representation and similarity based on local autopatterns”

Speaker: Prof. Mustafa Gökçe Baydoğan, Department of Industrial Engineering at Boğaziçi University, Istanbul, Turkey

Info: Mustafa Gökçe Baydoğan is an assistant professor in the Department of Industrial Engineering at Boğaziçi University, Istanbul, Turkey. Before joining Boğaziçi University, he worked as a postdoctoral research assistant at Arizona State University. He received his Ph.D. degree in Industrial Engineering in 2012.

Abstract: Time series data mining has received much greater interest along with the increase in temporal data sets from different domains such as medicine, finance, multimedia, etc. Representations are important to reduce dimensionality and generate useful similarity measures. High-level representations such as Fourier transforms, wavelets, piecewise polynomial models, etc., were considered previously. Recently, autoregressive kernels were introduced to reflect the similarity of the time series. We introduce a novel approach to model the dependency structure in time series that generalizes the concept of autoregression to local auto-patterns. Our approach generates a pattern-based representation along with a similarity measure called Learned Pattern Similarity (LPS). A tree-based ensemble-learning strategy that is fast and insensitive to parameter settings is the basis for the approach. Then, a robust similarity measure based on the learned patterns is presented. This unsupervised approach to represent and measure the similarity between time series generally applies to a number of data mining tasks (e.g., clustering, anomaly detection, classification). Furthermore, an embedded learning of the representation avoids pre-defined features and an extraction step, which is common in some feature-based approaches. The method generalizes in a straightforward manner to multivariate time series. The effectiveness of LPS is evaluated on time series classification problems from various domains. We compare LPS to eleven well-known similarity measures. Our experimental results show that LPS provides fast and competitive results on 75 benchmark datasets from several domains. We also test LPS on several multivariate time series datasets from motion capture studies and show that the proposed modeling strategy provides promising results. Furthermore, LPS provides a research direction for time series modeling that breaks from the linear dependency models to potentially foster other promising nonlinear approaches.