Modeling the selection of time series characteristics for machine learning methods based on the correlation matrix
Authors
*, **, Voronezh State Technical University, VSTU, 14, Moskovsky prospect, Voronezh, 394026, Russia
*e-mail: pgusev@cchgeu.ru
**e-mail: tavolzhanskij.a@yandex.ru
Abstract
This paper addresses the problem of selecting the set of characteristics needed for time series forecasting and proposes a selection method based on the correlation matrix. A correlation matrix is built from the prepared data, and for each variable a list is formed, ordered by decreasing absolute value of the correlation coefficient. Linear regression models are then trained, and prediction quality is compared for different sets of variables taken from the sorted list. The comparison is repeated for different sampling times to determine the optimal value for each variable. The algorithm is applied to data from a real facility: an autoclave plant for the production of composite aircraft components. Autoclave processing is a crucial stage in the manufacture of modern aircraft, and predictive analysis in the management of such plants can help optimize production processes and improve product quality. The proposed method identifies the characteristics that are most significant for predicting the condition of the autoclave plant and excludes those that do not affect prediction accuracy. In addition to the theoretical justification, the paper investigates how prediction accuracy depends on the number of variables used and on the sampling time, taking into account that measurements are obtained at different frequencies.
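The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`rank_by_correlation`, `holdout_mae`, `downsample`, `evaluate`), the synthetic data, the mean absolute error metric, the 70/30 chronological split, and the block-averaging used to imitate a coarser sampling time are all assumptions made for illustration. The sketch also regresses the target on variables measured at the same time step, which simplifies a true forecasting setup.

```python
import numpy as np


def rank_by_correlation(X, y):
    """Return feature indices ordered by decreasing |Pearson correlation| with y."""
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    target_corr = corr[:-1, -1]  # correlation of each feature with the target
    return np.argsort(-np.abs(target_corr)), target_corr


def holdout_mae(X, y, train_frac=0.7):
    """Fit least-squares linear regression on the earliest samples; return MAE on the rest."""
    n_train = int(len(y) * train_frac)
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A[:n_train], y[:n_train], rcond=None)
    return float(np.mean(np.abs(A[n_train:] @ coef - y[n_train:])))


def downsample(a, step):
    """Average consecutive blocks of `step` rows (stand-in for a longer sampling time)."""
    n = (len(a) // step) * step
    return a[:n].reshape(n // step, step, *a.shape[1:]).mean(axis=1)


def evaluate(X, y, steps=(1, 5, 10)):
    """MAE for every (sampling step, number of top-ranked features) pair."""
    results = {}
    for step in steps:
        Xs, ys = downsample(X, step), downsample(y, step)
        order, _ = rank_by_correlation(Xs, ys)
        for k in range(1, X.shape[1] + 1):
            results[(step, k)] = holdout_mae(Xs[:, order[:k]], ys)
    return results


# Synthetic data (assumption): the target depends on features 2 and 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = 3.0 * X[:, 2] - 1.0 * X[:, 0] + 0.1 * rng.normal(size=600)

order, target_corr = rank_by_correlation(X, y)
scores = evaluate(X, y)
best_step, best_k = min(scores, key=scores.get)
print("ranking:", order.tolist())
print("lowest MAE at step", best_step, "with", best_k, "features")
```

For brevity the ranking here is computed on the whole series, including the hold-out part. In practice it would be safer to compute the correlation matrix on the training portion only, so that the comparison between feature sets is not influenced by the test data.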