A Systematic Literature Review of Deep Learning Methods for Handwritten Text Recognition in Historical Arabic Manuscripts

Authors

  • Bilal Abdulrahman Tuama, Department of Emergent Computing, Faculty of Computing, University of Technology Malaysia, Malaysia | Department of Pathological Analysis, College of Applied Science, University of Samarra, Iraq
  • Farhan Mohamed, Department of Emergent Computing, Faculty of Computing, University of Technology Malaysia, Malaysia
Volume: 15 | Issue: 4 | Pages: 25772-25782 | August 2025 | https://doi.org/10.48084/etasr.12123

Abstract

Arabic Handwritten Text Recognition (AHTR) in historical manuscripts poses significant challenges due to the script’s cursive nature, variability in calligraphic styles, and document degradation over time. This paper presents a Systematic Literature Review (SLR) of recent Deep Learning (DL)-based approaches to AHTR, focusing on methods developed between 2020 and 2025. It analyzes key DL architectures and provides a comparative analysis of the most commonly used datasets and segmentation strategies. The review also highlights essential preprocessing and postprocessing techniques that enhance recognition performance and discusses the common evaluation metrics for AHTR. Finally, it identifies current challenges and proposes future research directions to improve recognition accuracy and model generalization. This review aims to guide researchers in building more robust and effective systems for the preservation and digitization of Arabic cultural heritage.

Keywords:

Arabic handwritten text recognition, historical manuscripts, deep learning, segmentation

How to Cite

[1]
B. A. Tuama and F. Mohamed, “A Systematic Literature Review of Deep Learning Methods for Handwritten Text Recognition in Historical Arabic Manuscripts”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 4, pp. 25772–25782, Aug. 2025.
