DeepCAMS: A Deep Learning Approach for Real-Time Crowd Monitoring and Suspicious Behavior Detection Using Spatial-Temporal Analysis
Received: 15 March 2025 | Revised: 3 May 2025 and 17 May 2025 | Accepted: 19 May 2025 | Online: 22 July 2025
Corresponding author: Ayman A. Alharbi
Abstract
The growing need for robust, intelligent crowd monitoring has driven advances in deep learning-based solutions. However, existing methods often struggle to capture complex crowd dynamics and to detect suspicious behaviors in real time. This study introduces DeepCAMS (Deep Learning-based Crowd Analysis and Monitoring System), a novel architecture that integrates a Fully Convolutional Network (FCN) for spatial feature extraction with a Long Short-Term Memory (LSTM) network for temporal analysis. Unlike traditional methods, DeepCAMS addresses the limitations of static and shallow models by combining spatial and temporal insights, enabling accurate classification of crowd behaviors as Normal or Suspicious. DeepCAMS demonstrated superior performance across multiple metrics, marking a substantial improvement over traditional approaches. Its ability to adapt to diverse crowd densities and identify subtle behavioral anomalies highlights its scalability and practical applicability in real-world surveillance. DeepCAMS thus sets a new benchmark in crowd behavior analysis by offering a unified spatial-temporal framework that ensures high accuracy, adaptability, and efficiency in dynamic environments. This study not only advances the field of smart surveillance but also paves the way for future research on scalable and interpretable crowd monitoring systems.
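The spatial-then-temporal pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the DeepCAMS implementation: the layer sizes, random untrained weights, single-channel frames, the 0.5 decision threshold, and every function name below are assumptions chosen only to show how per-frame convolutional features might feed an LSTM that labels a clip as Normal or Suspicious.

```python
import math
import random


def conv2d_relu(frame, kernel):
    """Single-channel 'valid' 2D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(frame) - kh + 1):
        row = []
        for j in range(len(frame[0]) - kw + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def spatial_features(frame, kernels):
    """Stand-in for the FCN stage: conv + ReLU per kernel, then global
    average pooling, so no fully connected layer depends on frame size."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_relu(frame, kernel)
        feats.append(sum(sum(r) for r in fmap) / (len(fmap) * len(fmap[0])))
    return feats


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W):
    """One standard LSTM cell step over the concatenated input [x; h].
    W[g] holds one (weight_row, bias) pair per hidden unit for gate g."""
    z = x + h  # list concatenation

    def gate(name, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in W[name]]

    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def init_params(n_kernels, hidden, seed=0):
    """Random, untrained parameters (hypothetical sizes, illustration only)."""
    rng = random.Random(seed)
    kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
               for _ in range(n_kernels)]
    W = {name: [([rng.uniform(-0.5, 0.5) for _ in range(n_kernels + hidden)], 0.0)
                for _ in range(hidden)]
         for name in "ifog"}
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return kernels, W, w_out


def classify_clip(frames, params, threshold=0.5):
    """Run spatial features through the LSTM frame by frame, then map the
    final hidden state to a probability (threshold is an assumption)."""
    kernels, W, w_out = params
    h, c = [0.0] * len(w_out), [0.0] * len(w_out)
    for frame in frames:
        h, c = lstm_step(spatial_features(frame, kernels), h, c, W)
    p = sigmoid(sum(w * v for w, v in zip(w_out, h)))
    return ("Suspicious" if p >= threshold else "Normal"), p


# Toy usage: a 5-frame clip of random 8x8 grayscale frames.
rng = random.Random(1)
clip = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]
params = init_params(n_kernels=4, hidden=6)
label, p = classify_clip(clip, params)
print(label, round(p, 3))
```

Because the weights are untrained, the printed label carries no meaning; the sketch only shows the data flow. Global average pooling is used here so the spatial stage stays fully convolutional and accepts frames of any size, which is one common reason to prefer an FCN front end.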
Keywords:
crowd monitoring, deep learning, Fully Convolutional Network (FCN), Long Short-Term Memory (LSTM), public safety, JHU-CROWD dataset, smart surveillance
License
Copyright (c) 2025 Ayman A. Alharbi

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
