Improving Early Autism Detection with Chi-Square Feature Selection, Machine Learning, and Explainable AI
Received: 27 June 2025 | Revised: 25 July 2025 | Accepted: 14 August 2025 | Online: 7 September 2025
Corresponding author: Aymen Abu-Errub
Abstract
This study presented a framework that combined Chi-square feature selection with Machine Learning (ML) classifiers to improve the early detection of Autism Spectrum Disorder (ASD) in children aged 12 to 36 months. Six classifiers were tested: Light Gradient Boosting Machine (LGBM), Extra Trees (ET), Decision Tree (DT), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). The findings revealed that combining Chi-square feature selection with SVM achieved perfect accuracy, precision, recall, and F1-score, while the other models demonstrated notable gains (up to 90%). Additionally, a SHapley Additive exPlanations (SHAP) analysis was conducted to interpret the model predictions and highlight the key behavioral features, and a comparison with recent studies in the literature showed that the proposed method outperformed them. This study demonstrated that integrating robust feature selection with explainable ML models can significantly improve the reliability of early ASD screening tools.
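As a rough illustration of the feature selection step named in the abstract, the sketch below scores each feature with Pearson's Chi-square statistic, the sum over contingency-table cells of (observed - expected)^2 / expected, and keeps the highest-scoring ones. It is a minimal sketch, not the authors' implementation: the function names `chi_square_score` and `select_top_k`, the feature names `q1` to `q3`, and the toy data are all hypothetical, and library implementations of Chi-square scoring may differ in detail.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency-table cell counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Count expected in this cell if feature and label were independent
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, feature_names, k):
    """Rank features by chi-square score and return the k best names plus all scores."""
    scores = {
        name: chi_square_score([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: three binary screening answers per child, binary label.
feature_names = ["q1", "q2", "q3"]
rows = [
    (1, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 1, 0),
    (0, 0, 0),
    (0, 1, 1),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

selected, scores = select_top_k(rows, labels, feature_names, k=1)
print(selected, scores)
```

In this toy data, `q1` matches the label exactly and receives the highest score, while `q2` and `q3` are independent of the label and score zero. In a full pipeline, the selected columns would then be passed to a classifier such as an SVM.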
Keywords:
Autism Spectrum Disorder (ASD), chi-square feature selection, machine learning, explainable AI
License
Copyright (c) 2025 Aymen Abu-Errub

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.