A Statistical and Machine Learning Analysis of the Significant Features of PPPoE Sessions for Quality Monitoring
Received: 14 June 2025 | Revised: 13 July 2025 | Accepted: 16 July 2025 | Online: 9 August 2025
Corresponding author: Yelizaveta Vitulyova
Abstract
This work develops and applies a method for indirectly monitoring telecommunication network quality by analyzing Point-to-Point Protocol over Ethernet (PPPoE) session parameters with machine learning methods. The use of the K coefficient, which is based on the dynamics of PPPoE Active Discovery Termination (PADT) packets and the number of active PPPoE sessions, is justified as a key indicator of network failures. The paper describes the stages of data collection and preprocessing, including the conversion of session indicators from a “wide” format to a “long” format for ease of analysis. A statistical analysis of attribute significance (Analysis of Variance (ANOVA) test and correlation analysis) was carried out, and on its basis a limited set of informative PPPoE session parameters (e.g., connection duration, frequency of disconnections, volume of transmitted data, connection establishment time) was selected. Linear Regression, Ridge, Lasso, Random Forest, and Support Vector Regression (SVR) models were trained on these attributes and comparatively evaluated for predicting the K value. A symbolic regression experiment provided an analytical formula that confirms the correctness of the selected K value. The comparison by the Mean Squared Error (MSE) and Coefficient of Determination (R²) metrics showed the advantage of the Random Forest model (R² ≈ 0.90, MSE ≈ 0.0001), which indicates the high efficiency of the proposed approach. The significance of the study lies in demonstrating that network quality anomalies can be detected early without direct analysis of the traffic content, which increases the efficiency of monitoring the quality of telecommunication services.
Keywords:
machine learning, PPPoE, Quality of Service (QoS), statistical analysis, network monitoring, broadband networks
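Two of the steps named in the abstract, the "wide" to "long" conversion and the MSE and R² comparison metrics, can be illustrated with a minimal Python sketch. The column names (`padt_count`, `active_sessions`), the helper names, and all numeric values below are hypothetical and chosen only for illustration; they are not the authors' data, code, or definition of K.

```python
from statistics import mean


def wide_to_long(rows, id_col, value_cols):
    """Reshape one record per timestamp (many metric columns)
    into one record per (timestamp, metric) pair."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            long_rows.append(
                {id_col: row[id_col], "metric": col, "value": row[col]}
            )
    return long_rows


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical "wide" session indicators: one row per sampling interval.
wide = [
    {"timestamp": "t0", "padt_count": 12, "active_sessions": 4000},
    {"timestamp": "t1", "padt_count": 30, "active_sessions": 3950},
]
long_rows = wide_to_long(wide, "timestamp", ["padt_count", "active_sessions"])

# Hypothetical observed and predicted K values for one model.
k_true = [0.010, 0.020, 0.030, 0.040]
k_pred = [0.012, 0.018, 0.031, 0.039]
model_mse = mse(k_true, k_pred)
model_r2 = r2(k_true, k_pred)
```

Computing both metrics for each candidate model on the same held-out data gives the kind of side-by-side comparison the abstract reports; a lower MSE and an R² closer to 1 indicate a better fit.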
License
Copyright (c) 2025 Ayan Zhunussov, Alimzhan Baikenov, Tansaule Serikov, Olga Abramkina, Yelizaveta Vitulyova

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.