A Comprehensive Approach for Thyroid Cancer Prediction Using Machine Learning Models
Received: 6 June 2025 | Revised: 26 June 2025, 17 July 2025, 25 July 2025, and 2 August 2025 | Accepted: 3 August 2025 | Online: 6 October 2025
Corresponding author: S. Santhoshini
Abstract
This study aimed to predict the occurrence of thyroid cancer by applying machine learning methods to an extensive collection of clinical and demographic variables. The prediction model is built on the Random Forest (RF) algorithm and combines diverse data sources to enhance predictive accuracy. Preprocessing involved handling missing values, normalizing the data, and selecting relevant features to ensure high-quality model inputs. The RF model achieved high recall, precision, and accuracy in predicting thyroid cancer, as validated through cross-validation. The results highlight the potential of machine learning to improve the early detection and management of thyroid cancer, which could lead to better patient outcomes. A user-friendly Flask-based frontend was developed to make real-time risk predictions accessible to healthcare professionals.
Keywords:
thyroid cancer, machine learning, random forest, data preprocessing, real-time predictions
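The workflow summarized in the abstract (missing-value handling, normalization, feature selection, a Random Forest classifier, and cross-validated recall, precision, and accuracy) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the synthetic data, the median imputation, the ANOVA F-test feature selector, the number of retained features, the number of trees, and the 5-fold split are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(n_features_to_keep=5, seed=0):
    """Chain imputation, scaling, feature selection, and a Random Forest."""
    return Pipeline([
        # Assumption: median imputation stands in for "handling missing values".
        ("impute", SimpleImputer(strategy="median")),
        # Assumption: z-score scaling stands in for "normalizing data".
        ("scale", StandardScaler()),
        # Assumption: ANOVA F-test selection stands in for "selecting relevant features".
        ("select", SelectKBest(score_func=f_classif, k=n_features_to_keep)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])


def make_synthetic_data(n_samples=300, n_features=10, seed=0):
    """Create placeholder data; it is NOT the clinical dataset used in the study."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # The label depends on the first two features plus noise.
    noise = rng.normal(scale=0.5, size=n_samples)
    y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)
    # Blank out roughly 5% of the entries to mimic missing clinical values.
    X[rng.random(X.shape) < 0.05] = np.nan
    return X, y


def evaluate(X, y, seed=0):
    """Return mean accuracy, precision, and recall over stratified 5-fold CV."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    metrics = ("accuracy", "precision", "recall")
    scores = cross_validate(build_pipeline(seed=seed), X, y, cv=cv, scoring=metrics)
    return {m: float(scores[f"test_{m}"].mean()) for m in metrics}


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print(evaluate(X, y))
```

Placing the preprocessing steps inside the pipeline means they are refit on the training portion of each fold, so no statistics from the held-out fold leak into imputation, scaling, or feature selection.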
License
Copyright (c) 2025 S. Santhoshini, M. A. Goutham

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.