Method of Harmonics Parameters Identification and Anomalies of a Periodic Time Series Based on Adaptive Decomposition
DOI:
https://doi.org/10.31649/1997-9266-2023-171-6-46-56
Keywords:
time series analysis, simulation, machine learning, time series anomalies, seasonality, Fourier series harmonics, air quality, EcoCity
Abstract
Periodic time series have many applications: financial indicators, air quality indicators, water condition indicators, etc. Accordingly, simulating time series and analyzing their patterns are relevant and common tasks for understanding possible trends and changes so that correct and timely action can be taken. Important parameters of periodic time series are their trends, seasonal components, and anomalies. Numerous methods exist for determining the trend of a time series, but simultaneously identifying the parameters of several types of seasonality and of anomalies of different nature in different periods is not a trivial task, and no universal solution exists. Most solutions are tailored to a particular subject area or demonstrate insufficient adequacy and accuracy of approximation.
A new method for identifying the parameters of harmonics and anomalies of a periodic time series, based on adaptive decomposition of the series, has been developed. It is proposed to decompose a given time series with periods up to half of the total number of records in the series and to plot the ratio of the amplitudes of the seasonal component to the amplitudes of the series itself, the so-called “decomposition curve”. This curve is then smoothed and its local maxima are found; these maxima are proposed to be treated as the periods of the possible types of seasonality of the series. Drawing on many years of experience with the Facebook Prophet model, a set of relations is proposed between the seasonality period, the order of the Fourier series used to approximate it, and the degree of regularization to apply. For each type of seasonality in each period, one of the known methods should be used to find anomalous data and check their statistical significance. Statistically significant anomalies are collected in a combined set with typical parameters. Several possible variants of the structure of such time series models are proposed. The algorithm of the method has been developed, and its main components are described.
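The decomposition-curve idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a classical additive decomposition (a centered moving-average trend and per-phase means of the detrended series), a simple (1, 2, 1)/4 smoothing, and a synthetic series with a known period of 12. All function names are hypothetical, and only the Python standard library is used.

```python
import math
from statistics import fmean


def trend_at(y, i, period):
    """Centered moving average of length `period` at index i (2 x MA for even periods)."""
    half = period // 2
    if period % 2:  # odd period: plain centered window
        return fmean(y[i - half:i + half + 1])
    inner = sum(y[i - half + 1:i + half])
    return (0.5 * y[i - half] + inner + 0.5 * y[i + half]) / period


def seasonal_amplitude_ratio(y, period):
    """Range of the seasonal component divided by the range of the series."""
    span = max(y) - min(y)
    if span == 0:
        return 0.0  # a constant series has no seasonality
    half = period // 2
    by_phase = [[] for _ in range(period)]
    for i in range(half, len(y) - half):
        by_phase[i % period].append(y[i] - trend_at(y, i, period))
    seasonal = [fmean(values) for values in by_phase]
    return (max(seasonal) - min(seasonal)) / span


def decomposition_curve(y, max_period=None):
    """Ratio for every trial period from 2 up to half the series length."""
    max_period = max_period or len(y) // 2
    return {p: seasonal_amplitude_ratio(y, p) for p in range(2, max_period + 1)}


def smooth(values):
    """(1, 2, 1)/4 weighted smoothing; the two endpoints are left unchanged."""
    out = list(values)
    for k in range(1, len(values) - 1):
        out[k] = (values[k - 1] + 2 * values[k] + values[k + 1]) / 4
    return out


def candidate_periods(curve):
    """Periods at which the smoothed curve has a strict local maximum."""
    periods = sorted(curve)
    s = smooth([curve[p] for p in periods])
    return [periods[k] for k in range(1, len(s) - 1) if s[k - 1] < s[k] > s[k + 1]]


# Synthetic example (assumption): constant level plus one harmonic with period 12.
series = [10 + 3 * math.sin(2 * math.pi * t / 12) for t in range(120)]
curve = decomposition_curve(series)
peaks = candidate_periods(curve)
print(peaks)
```

For this synthetic series the true period of 12 appears among the detected peaks. Multiples of the true period, such as 24, also give a ratio close to 1, so in practice the candidate list would need further filtering before being used to configure seasonal components.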
The proposed method was tested in Python in a notebook on the Kaggle platform. The notebook applies the Facebook Prophet model to real air quality observations obtained from one of the stations of the EcoCity public monitoring network within the international program “Clean Air for Ukraine”. Tests showed that, compared with the model with default parameters and default seasonality parameters, the optimal model of the proposed method improved the accuracy of the approximation by a factor of 1.7 for the R2 metric and by a factor of 2 for the MSE metric. This confirms the effectiveness of the proposed method.
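As a rough illustration of how such improvement factors can be computed, the sketch below compares two sets of fitted values against the same observations. The numbers are made up for the example and are not the paper's data. Reading "improved by N times" as the ratio of R2 values (tuned over default) and the ratio of MSE values (default over tuned) is an assumption, and the function names are hypothetical.

```python
from statistics import fmean


def mse(actual, predicted):
    """Mean squared error."""
    return fmean((a - p) ** 2 for a, p in zip(actual, predicted))


def r2(actual, predicted):
    """Coefficient of determination."""
    mean = fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical observations and fitted values (not from the paper).
actual = [1, 2, 3, 4, 5]
default_fit = [3, 2, 3, 4, 3]  # stands in for the default model
tuned_fit = [2, 2, 3, 4, 4]    # stands in for the tuned model

r2_gain = r2(actual, tuned_fit) / r2(actual, default_fit)     # higher R2 is better
mse_gain = mse(actual, default_fit) / mse(actual, tuned_fit)  # lower MSE is better
print(round(r2_gain, 3), round(mse_gain, 3))
```

The two ratios are taken in opposite directions because R2 grows with accuracy while MSE shrinks, so both gains exceed 1 when the tuned model fits better.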
License
This work is licensed under a Creative Commons Attribution 4.0 International License.