Abnormal Human Behavior Detection Improvement with an Efficient Attention Block

Anh Dung Ho; Huong Giang Doan; Ngoc Trung Nguyen

doi:10.48084/etasr.11463

Authors

Anh Dung Ho, Department of Information Technology, East Asia University of Technology, Hanoi, Vietnam
Huong Giang Doan, Faculty of Control and Automation, Electric Power University, Hanoi, Vietnam
Ngoc Trung Nguyen, Department of Personnel Organization and Administration, Electric Power University, Hanoi, Vietnam

Volume: 15 | Issue: 4 | Pages: 25048-25054 | August 2025 | https://doi.org/10.48084/etasr.11463

Received: 13 April 2025 | Revised: 30 April 2025 and 6 May 2025 | Accepted: 10 May 2025 | Online: 9 July 2025

Corresponding author: Ngoc Trung Nguyen

Abstract

Convolutional Neural Networks (CNNs) have become an attractive method for detecting anomalous behaviors. However, designing a CNN model that achieves high classification accuracy remains a challenging problem. Furthermore, the existing datasets for abnormal behavior detection are limited, and each focuses on a particular context. As a result, a CNN model trained on one dataset is adapted to that particular context and is not suitable for other contexts. This study proposes a CNN framework with an efficient attention mechanism that captures key information from multiple inputs, namely RGB, optical flow, and heatmap. Experiments were carried out on several benchmark datasets and a self-collected dataset, and the evaluation involved both single- and cross-dataset strategies. The results show that the proposed framework outperforms other state-of-the-art (SOTA) methods in detection accuracy.
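
The idea of weighting several input streams by attention can be illustrated with a minimal sketch. It is not the attention block proposed in the paper, whose design is not described in this abstract. It assumes each stream (RGB, optical flow, heatmap) has already been reduced to a feature vector by a CNN, and it uses plain scaled dot-product scoring against a hypothetical `query` vector that a real model would learn. The function names and toy values are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse per-modality feature vectors with scaled dot-product attention.

    features: dict mapping a modality name to a feature vector (list of floats)
    query:    hypothetical scoring vector (learned in a real model, fixed here)
    Returns (fused_vector, weights_by_modality).
    """
    names = list(features)
    dim = len(query)
    # One relevance score per modality: dot(feature, query) / sqrt(dim).
    scores = [
        sum(f * q for f, q in zip(features[name], query)) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality vectors, element by element.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy vectors standing in for CNN features of each input stream (assumed values).
features = {
    "rgb": [0.2, 0.1, 0.4, 0.3],
    "optical_flow": [0.9, 0.8, 0.1, 0.7],
    "heatmap": [0.3, 0.3, 0.3, 0.3],
}
query = [1.0, 1.0, 0.0, 1.0]

fused, weights = attention_fuse(features, query)
print(weights)
print(fused)
```

With these toy values the optical-flow stream aligns best with the query, so it receives the largest weight and contributes most to the fused vector.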

Keywords:

knowledge distillation, convolutional neural network, transfer learning, deep learning, student-teacher model
