MSAFF: Multi-Script Adaptive Feature Fusion for Script Identification

Suneel C. Shinde; V. S. Malemath; Joshi B. Vinayak; Anilkumar C. Korishetti; R. Gogulan

doi:10.48084/etasr.11729

Authors

Suneel C. Shinde Department of CSE, KLE Dr. M. S. Sheshgiri College of Engineering and Technology (Affiliated to VTU, Belagavi), Belagavi, India
V. S. Malemath Department of CSE, KLE Dr. M. S. Sheshgiri College of Engineering and Technology (Affiliated to VTU, Belagavi), Belagavi, India
Joshi B. Vinayak Department of School of Engineering and Technology, Sapthagiri National Public School University, Bengaluru, India
Anilkumar C. Korishetti Department of Electronics and Communication, S. G. Balekundri Institute of Technology (Affiliated to VTU, Belagavi), Belagavi, India
R. Gogulan School of Computational Science and IT, Garden City University, Bengaluru, India

Volume: 15 | Issue: 4 | Pages: 25341-25346 | August 2025 | https://doi.org/10.48084/etasr.11729

Received: 26 April 2025 | Revised: 15 May 2025, 28 May 2025, and 2 June 2025 | Accepted: 6 June 2025 | Online: 2 August 2025

Corresponding author: Suneel C. Shinde

Abstract

Script identification in multilingual documents remains a critical challenge in document analysis, especially in the context of Indian languages, where multiple scripts often coexist within the same document. This paper proposes MSAFF (Multi-Script Adaptive Feature Fusion), a novel framework designed to tackle this complexity by dynamically integrating multiple discriminative feature sets, namely Local Binary Patterns (LBP), Horizontal Projection Profiles (HPP), and Histogram of Oriented Gradients (HOG). MSAFF employs an adaptive fusion mechanism that intelligently adjusts feature weights according to the granularity of the input, enabling robust script recognition across various textual levels, including blocks, lines, words, numerals, and alphanumeric strings. To effectively classify scripts and manage transitions between them in mixed-script environments, MSAFF utilizes a hybrid classification strategy that combines Support Vector Machines (SVMs) for initial script identification with Hidden Markov Models (HMMs) to model sequential script transitions. Extensive evaluations on the MDIW-13 dataset demonstrate the effectiveness of MSAFF, which achieved an overall accuracy of 92%, with outstanding results in text block-level identification (96%) and mixed-script transition detection (>85%). The method also shows strong resilience to document degradations, maintaining high accuracy under noise (90%) and skew (88%) conditions. Additionally, MSAFF exhibits notable computational efficiency, outperforming state-of-the-art techniques in processing speed across varying input sizes.
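As a rough illustration of the two ideas named in the abstract, the following Python sketch shows (a) a granularity-dependent weighted fusion of LBP, HPP, and HOG descriptors and (b) Viterbi smoothing of per-unit script scores with a simple HMM. It is a minimal sketch under stated assumptions, not the authors' implementation: the weight table `FUSION_WEIGHTS`, the function names, the `stay_prob` transition parameter, and the use of classifier posteriors as emission scores are all hypothetical, since the abstract does not report these details. The LBP and HOG vectors are assumed to be computed elsewhere.

```python
import numpy as np

# Hypothetical per-granularity weights for (LBP, HPP, HOG). The abstract does
# not report the actual values, so these are placeholders for illustration.
FUSION_WEIGHTS = {
    "block": (0.5, 0.3, 0.2),
    "line": (0.3, 0.5, 0.2),
    "word": (0.3, 0.2, 0.5),
}


def horizontal_projection_profile(binary_image, bins=16):
    """Sum ink pixels per row, then pool into a fixed-length normalized profile."""
    row_sums = np.asarray(binary_image, dtype=float).sum(axis=1)
    chunks = np.array_split(row_sums, bins)
    profile = np.array([chunk.sum() for chunk in chunks])
    total = profile.sum()
    return profile / total if total > 0 else profile


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    vector = np.asarray(vector, dtype=float)
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector


def fuse_features(lbp, hpp, hog, level):
    """Weighted concatenation of L2-normalized descriptors for one text unit."""
    weights = FUSION_WEIGHTS[level]
    parts = [w * l2_normalize(f) for w, f in zip(weights, (lbp, hpp, hog))]
    return np.concatenate(parts)


def viterbi_smooth(posteriors, stay_prob=0.9):
    """Most likely script sequence given per-unit class scores.

    posteriors: array of shape (T, K), e.g. calibrated SVM outputs (assumed).
    stay_prob: assumed probability that consecutive units share a script.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    steps, n_scripts = posteriors.shape
    switch_prob = (1.0 - stay_prob) / max(n_scripts - 1, 1)
    transition = np.full((n_scripts, n_scripts), switch_prob)
    np.fill_diagonal(transition, stay_prob)
    log_trans = np.log(transition)
    log_emit = np.log(np.clip(posteriors, 1e-12, None))

    score = log_emit[0] - np.log(n_scripts)  # uniform prior over scripts
    backpointers = np.zeros((steps, n_scripts), dtype=int)
    for t in range(1, steps):
        candidates = score[:, None] + log_trans  # rows: previous, cols: current
        backpointers[t] = candidates.argmax(axis=0)
        score = candidates.max(axis=0) + log_emit[t]

    # Trace the best path backwards from the highest-scoring final state.
    path = [int(score.argmax())]
    for t in range(steps - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    return path[::-1]
```

In this sketch, a fused vector from `fuse_features` would be passed to a classifier such as an SVM, and the resulting per-word scores would be passed to `viterbi_smooth` so that an isolated low-confidence label flip is suppressed while a sustained change of script is kept as a transition.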

Keywords:

script identification, feature fusion, document analysis, multilingual documents, Indian scripts, SVM, HMM
