Scalable Federated Learning for Massive Medical Image Classification: Tackling Noisy and Imbalanced Data

Authors

  • Hadjir Zemmouri, MISC Laboratory, Abdelhamid Mehri University, Constantine, Algeria
  • Akram Kout, Setif Ferhat Abbas University, Setif, Algeria | MISC Laboratory, Abdelhamid Mehri University, Constantine, Algeria
  • Said Labed, MISC Laboratory, Abdelhamid Mehri University, Constantine, Algeria
Volume: 15 | Issue: 5 | Pages: 26978-26984 | October 2025 | https://doi.org/10.48084/etasr.12040

Abstract

The proliferation of medical imaging data, coupled with stringent privacy regulations, necessitates scalable and reliable classification methods to address the challenges of big data. Motivated by the critical need for accurate pneumonia detection from Chest X-rays (CXR) across diverse clinical settings, this research confronts the five fundamental challenges of big data: volume, velocity, variety, value, and veracity. In such scenarios, traditional centralized methods are often constrained by data heterogeneity, computational bottlenecks, and privacy risks. To overcome these constraints, we propose a Federated Learning (FL) system that distributes model training across multiple clients, thereby ensuring data privacy while efficiently managing large-scale datasets. Our method leverages transfer learning by fine-tuning a pre-trained VGG11 model and employs FedProx regularization to mitigate client drift arising from non-Independent and Identically Distributed (non-IID) data. Furthermore, we introduce an innovative data partitioning technique that simulates real-world conditions by generating imbalanced label distributions with a Dirichlet process and injecting Gaussian noise to mimic image quality variations. By enabling distributed local training and dynamic learning rate adjustments, our approach effectively manages high-volume, high-velocity data while preserving data privacy. Experimental results demonstrate that our proposed method efficiently aggregates diverse and noisy client updates while achieving competitive performance in pneumonia classification.
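
The partitioning and regularization ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic labels, and the values of alpha, sigma, and mu are all hypothetical, and the "Dirichlet process" is assumed here to mean drawing per-class client shares from a Dirichlet distribution, which is the common way to simulate label imbalance in FL experiments. The FedProx term is shown in its standard form, a task loss plus (mu / 2) times the squared distance between local and global weights.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, rng):
    """Split sample indices across clients with Dirichlet-skewed label shares."""
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Share of this class given to each client; smaller alpha means more skew.
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

def add_gaussian_noise(images, sigma, rng):
    """Add zero-mean Gaussian noise to images scaled to [0, 1], then clip."""
    noisy = images + rng.normal(0.0, sigma, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

def fedprox_loss(task_loss, local_weights, global_weights, mu):
    """Local objective: task loss plus (mu / 2) * ||w - w_global||^2."""
    sq_dist = sum(np.sum((w - g) ** 2)
                  for w, g in zip(local_weights, global_weights))
    return task_loss + 0.5 * mu * sq_dist

# Hypothetical usage on synthetic data (not the CXR dataset).
rng = np.random.default_rng(0)
labels = np.array([0] * 300 + [1] * 700)      # assumed binary labels
parts = dirichlet_partition(labels, num_clients=5, alpha=0.5, rng=rng)
images = rng.random((4, 8, 8))                # assumed toy "images" in [0, 1]
noisy = add_gaussian_noise(images, sigma=0.1, rng=rng)
penalized = fedprox_loss(1.0, [np.ones(3)], [np.zeros(3)], mu=0.1)
```

In this sketch every sample is assigned to exactly one client, so the client datasets are disjoint and together cover the full set. Lowering alpha concentrates each class on fewer clients, which is one way to control how strongly non-IID the simulated federation is.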

Keywords:

federated learning, big data classification, distributed training, class imbalance, pneumonia

How to Cite

[1]
H. Zemmouri, A. Kout, and S. Labed, “Scalable Federated Learning for Massive Medical Image Classification: Tackling Noisy and Imbalanced Data”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 5, pp. 26978–26984, Oct. 2025.
