A Statistical and Machine Learning Analysis of the Significant Features of PPPoE Sessions for Quality Monitoring

Authors

  • Ayan Zhunussov, Department of Telecommunication Engineering, Almaty University of Power Engineering and Telecommunications, Almaty, Kazakhstan
  • Alimzhan Baikenov, Department of Telecommunication Engineering, Almaty University of Power Engineering and Telecommunications, Almaty, Kazakhstan
  • Tansaule Serikov, Department of Electronics and Telecommunication, S. Seifullin Kazakh Agro Technical Research University, Astana, Kazakhstan
  • Olga Abramkina, Department of Cybersecurity, Almaty University of Power Engineering and Telecommunications, Almaty, Kazakhstan | Department of Cybersecurity, International Information Technology University, Almaty, Kazakhstan
  • Yelizaveta Vitulyova, National Scientific Laboratory for the Collective Use of Information and Space Technologies (NSLC IST), Satbayev University, Almaty, Kazakhstan | JSC "Institute of Digital Engineering and Technology," Almaty, Kazakhstan
Volume: 15 | Issue: 5 | Pages: 26923-26934 | October 2025 | https://doi.org/10.48084/etasr.12714

Abstract

This work develops and applies a method for indirectly monitoring telecommunication network quality by analyzing Point-to-Point Protocol over Ethernet (PPPoE) session parameters with machine learning methods. The K coefficient, which is based on the dynamics of PPPoE Active Discovery Termination (PADT) packets and the number of active PPPoE sessions, is justified as a key indicator of network failures. The paper describes the stages of data collection and preprocessing, including the conversion of session indicators from a "wide" format to a "long" format for ease of analysis. A statistical analysis of attribute significance (Analysis of Variance (ANOVA) test, correlation analysis) was carried out, and on that basis a limited set of informative PPPoE session parameters (e.g., connection duration, frequency of disconnections, volume of transmitted data, connection establishment time) was selected. Linear Regression, Ridge, Lasso, Random Forest, and Support Vector Regression (SVR) models were trained on these attributes and comparatively evaluated on predicting the K value. A symbolic regression experiment yielded an analytical formula that confirms the correctness of the selected K value. The comparison by Mean Squared Error (MSE) and Coefficient of Determination (R²) showed the advantage of the Random Forest model (R² ≈ 0.90, MSE ≈ 0.0001), which indicates the high efficiency of the proposed approach. The significance of the study lies in demonstrating that network quality anomalies can be detected early without direct analysis of traffic content, which makes monitoring the quality of telecommunication services more efficient.
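
The pipeline outlined in the abstract (wide-to-long conversion, attribute scoring, model comparison by MSE and R²) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the column names, the synthetic data, the hyperparameters, and the synthetic target that merely stands in for K. It does not reproduce the authors' dataset, their definition of K, or their results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import f_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Hypothetical "wide" table: one row per session, one column per time slot.
wide_df = pd.DataFrame({
    "session_id": [1, 2, 3],
    "bytes_h00": [120, 80, 300],
    "bytes_h01": [140, 75, 310],
})
# "Long" format: one row per (session, time slot) observation.
long_df = wide_df.melt(id_vars="session_id", var_name="slot", value_name="bytes")

# Synthetic session attributes (names are illustrative, not the paper's schema).
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "duration": rng.uniform(0, 1, n),
    "disconnect_rate": rng.uniform(0, 1, n),
    "volume": rng.uniform(0, 1, n),
    "setup_time": rng.uniform(0, 1, n),
})
# Synthetic target standing in for K; NOT the definition used in the paper.
y = 0.5 * X["disconnect_rate"] ** 2 + 0.1 * X["setup_time"] + rng.normal(0, 0.01, n)

# Univariate F-test scores as a simple attribute-significance check.
f_scores, p_values = f_regression(X, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The five model families named in the abstract, with assumed hyperparameters.
models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.001),
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
    "SVR": SVR(kernel="rbf", epsilon=0.01),
}

# Fit each model and record (MSE, R^2) on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = (mean_squared_error(y_test, pred), r2_score(y_test, pred))

for name, (mse, r2) in results.items():
    print(f"{name}: MSE={mse:.5f}, R2={r2:.3f}")
```

The ranking of models on this synthetic data says nothing about their ranking on real PPPoE statistics; the sketch only shows the shape of the comparison.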

Keywords:

machine learning, PPPoE, Quality of Service (QoS), statistical analysis, network monitoring, broadband networks

How to Cite

[1]
A. Zhunussov, A. Baikenov, T. Serikov, O. Abramkina, and Y. Vitulyova, “A Statistical and Machine Learning Analysis of the Significant Features of PPPoE Sessions for Quality Monitoring”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 5, pp. 26923–26934, Oct. 2025.