An Efficient Face Detection and Gender Classification Approach Integrating the Speed of YOLOv9 with the Accuracy of ResNet50

Authors

  • Aseil Nadhim Kadhim, Faculty of Artificial Intelligence, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
  • Syahid Anuar, Faculty of Artificial Intelligence, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
  • Saiful Adli Bin Ismail, Faculty of Artificial Intelligence, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
Volume: 15 | Issue: 5 | Pages: 28377-28385 | October 2025 | https://doi.org/10.48084/etasr.11830

Abstract

Artificial Intelligence (AI), particularly deep learning, plays a pivotal role in tasks such as face detection and gender classification. This study aims to improve the performance of computer vision systems by developing a hybrid model that integrates YOLOv9 with a modified version of ResNet50, addressing the trade-off between accuracy and inference speed common in traditional approaches. To this end, a custom dataset was collected under real-world conditions on a university campus and used to evaluate multiple You Only Look Once (YOLO) models and Convolutional Neural Network (CNN) architectures. Experimental results showed that YOLOv9 achieved the highest inference speed, 332 ms/image (3.00 Frames Per Second (FPS)), while ResNet50, used as a two-stage detection model, delivered superior gender classification accuracy at the cost of slower inference. To resolve this trade-off, ResNet50 was modified for both speed and accuracy and then structurally embedded into the YOLOv9 framework. Specifically, the Cross Stage Partial Network (CSPNet) and Efficient Layer Aggregation Network (ELAN) layers of YOLOv9 were replaced with modified ResNet50 feature extractors, while the Global Local Attention Network (GLAN) layer was retained to preserve effective feature fusion. This integration significantly improved both face and object detection performance. The proposed hybrid model outperformed the individual models, achieving a peak mean Average Precision (mAP) of 97.2%, with 97% precision, 93.4% recall, and an inference speed of 103.89 ms/image (9.62 FPS). These results indicate that the proposed model balances accuracy and speed effectively, making it well suited to real-time applications such as smart surveillance and security systems.
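
The timing figures in the abstract can be cross-checked with the relation FPS = 1000 / latency (ms per image). The short Python sketch below applies that conversion to the two reported latencies and computes the resulting speed-up of the hybrid model over standalone YOLOv9. The function names are illustrative assumptions and do not come from the paper; only the latency values are taken from the abstract.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def speedup(baseline_ms: float, improved_ms: float) -> float:
    """Ratio of baseline latency to improved latency (values > 1 mean faster)."""
    if baseline_ms <= 0 or improved_ms <= 0:
        raise ValueError("latencies must be positive")
    return baseline_ms / improved_ms


# Latencies as reported in the abstract (ms per image).
YOLOV9_MS = 332.0
HYBRID_MS = 103.89

if __name__ == "__main__":
    print(f"YOLOv9: {fps_from_latency_ms(YOLOV9_MS):.2f} FPS")
    print(f"Hybrid: {fps_from_latency_ms(HYBRID_MS):.2f} FPS")
    print(f"Speed-up: {speedup(YOLOV9_MS, HYBRID_MS):.2f}x")
```

The converted values agree with the reported 3.00 FPS and 9.62 FPS to within about 0.01 FPS, a gap that is consistent with rounding or truncation in the reported figures.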

Keywords:

deep learning, face detection, gender classification, You Only Look Once (YOLO), ResNet50

How to Cite

A. N. Kadhim, S. Anuar, and S. A. B. Ismail, “An Efficient Face Detection and Gender Classification Approach Integrating the Speed of YOLOv9 with the Accuracy of ResNet50”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 5, pp. 28377–28385, Oct. 2025.
