From Local Features to Global Context: Comparing CNN and Transformer for Sundanese Script Classification

Authors

  • Yoga Agustiansyah, Institut Teknologi Garut
  • Dhika Restu Fauzi, Institut Teknologi Garut

DOI:

https://doi.org/10.64878/jistics.v1i2.38

Keywords:

Aksara Sunda, Convolutional Neural Network, Deep Learning, Transfer Learning, Vision Transformer

Abstract

The digital preservation of historical writing systems such as Aksara Sunda is critical for cultural heritage, yet automated recognition is hindered by high inter-character similarity and handwriting variability. This study systematically compares two dominant deep learning paradigms, Convolutional Neural Networks (CNNs) and Transformers, to evaluate the crucial trade-off between model accuracy and real-world robustness. Using a transfer learning approach, we trained five models (ResNet50, MobileNetV2, EfficientNetB0, ViT, and DeiT) on a balanced 30-class dataset of Sundanese script. Performance was assessed on a standard in-distribution test set and on a challenging, independently collected Out-of-Distribution (OOD) dataset designed to simulate varied real-world conditions. The results reveal a significant performance inversion: while EfficientNetB0 achieved the highest in-distribution accuracy of 96.9%, its performance plummeted on the OOD set. Conversely, ResNet50, despite its lower in-distribution accuracy, proved to be the most robust model, achieving the highest OOD accuracy of 92.5%. This study concludes that for practical applications requiring reliable performance, the generalization capability demonstrated by ResNet50 is more valuable than the specialized accuracy of EfficientNetB0, offering a crucial insight for developing robust digital preservation tools for historical scripts.
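The model-selection logic the abstract argues for — preferring the model with the best Out-of-Distribution accuracy over the one with the best in-distribution accuracy — can be sketched as follows. This is an illustrative sketch, not the authors' code; it uses only the two accuracy figures reported in the abstract, with unreported figures left as None:

```python
# Sketch of the study's deployment-oriented selection criterion:
# OOD (out-of-distribution) accuracy outranks ID (in-distribution) accuracy.
# Only the two figures reported in the abstract are filled in.
results = {
    # model: (id_accuracy, ood_accuracy); None = not reported in the abstract
    "EfficientNetB0": (0.969, None),   # highest ID accuracy (96.9%)
    "ResNet50":       (None, 0.925),   # highest OOD accuracy (92.5%)
}

def most_robust(results):
    """Return the model with the highest reported OOD accuracy."""
    reported = {m: ood for m, (_id, ood) in results.items() if ood is not None}
    return max(reported, key=reported.get)

print(most_robust(results))  # ResNet50
```

Under this criterion, ResNet50 is selected for deployment even though EfficientNetB0 tops the in-distribution leaderboard.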


References

M. I. Ilham and Y. Asriningtias, “Aplikasi Mobile Augmented Reality untuk Mendukung Pengenalan Aksara Sunda,” Edumatic: Jurnal Pendidikan Informatika, vol. 7, no. 2, pp. 426–434, Dec. 2023, doi: 10.29408/edumatic.v7i2.23602.

E. Yuliyanti, “Teknologi, Budaya, dan Digitalisasi: Menghidupkan Aksara Nusantara dalam Ruang Digital - Dinas Komunikasi, Informatika dan Statistik Kota Cirebon,” dkis.cirebonkota.go.id. Accessed: Jun. 28, 2025. [Online]. Available: https://dkis.cirebonkota.go.id/teknologi-budaya-dan-digitalisasi-menghidupkan-aksara-nusantara-dalam-ruang-digital/

Chrismonica, “Belajar Aksara Sunda: Sejarah, Jenis dan Contohnya! | Orami,” www.orami.co.id. Accessed: Jun. 28, 2025. [Online]. Available: https://www.orami.co.id/magazine/aksara-sunda

J. Li et al., “A comprehensive survey of oracle character recognition: challenges, benchmarks, and beyond,” Nov. 2024, [Online]. Available: http://arxiv.org/abs/2411.11354

E. M. Puckett, “Optical character recognition helps unlock history | Virginia Tech News | Virginia Tech,” news.vt.edu. Accessed: Jun. 28, 2025. [Online]. Available: https://news.vt.edu/articles/2024/03/univlib-ocr.html

S. N. Rahmawati, E. W. Hidayat, and H. Mubarok, “Implementasi Deep Learning pada Pengenalan Aksara Sunda Menggunakan Metode Convolutional Neural Network,” INSERT: Information System and Emerging Technology Journal, vol. 2, no. 1, pp. 46–58, Jun. 2021, doi: 10.23887/insert.v2i1.37405.

M. F. Naufal, J. Siswantoro, and J. T. Soebroto, “Transliterating Javanese Script Images to Roman Script using Convolutional Neural Network with Transfer Learning,” JOIV: International Journal on Informatics Visualization, vol. 8, no. 3, p. 1460, Sep. 2024, doi: 10.62527/joiv.8.3.2566.

A. A. Pratama, M. D. Sulistiyo, and A. F. Ihsan, “Balinese Script Handwriting Recognition Using Faster R-CNN,” Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), vol. 7, no. 6, pp. 1268–1275, Nov. 2023, doi: 10.29207/resti.v7i6.5176.

Y. Agustiansyah and D. Kurniadi, “Indonesian Sign Language Alphabet Image Classification using Vision Transformer,” Journal of Intelligent Systems Technology and Informatics, vol. 1, no. 1, pp. 1–9, Jun. 2025, Accessed: Jun. 29, 2025. [Online]. Available: https://journal.aptika.org/index.php/jistics/article/view/5

C. Boufenar, M. A. Rabiai, B. N. Zahaf, and K. R. Ouaras, “Bridging the Gap: Fusing CNNs and Transformers to Decode the Elegance of Handwritten Arabic Script,” Mar. 2025.

H. Alaeddine and M. Jihene, “Deep Residual Network in Network,” Computational Intelligence and Neuroscience, vol. 2021, no. 1, Jan. 2021, doi: 10.1155/2021/6659083.

A. R. Hermanto, A. Aziz, and S. Sudianto, “Perbandingan Arsitektur MobileNetV2 dan RestNet50 untuk Klasifikasi Jenis Buah Kurma,” JUSTIN (Jurnal Sistem dan Teknologi Informasi), vol. 12, no. 4, pp. 630–637, Nov. 2024, doi: 10.26418/JUSTIN.V12I4.80358.

M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” Sep. 2020, Accessed: Jul. 17, 2025. [Online]. Available: https://arxiv.org/abs/1905.11946

A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” ICLR 2021 - 9th International Conference on Learning Representations, Oct. 2020, Accessed: May 21, 2025. [Online]. Available: https://arxiv.org/pdf/2010.11929

R. Grainger, T. Paniagua, X. Song, N. Cuntoor, M. W. Lee, and T. Wu, “PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers,” Apr. 2023.

K. Islam, “Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work,” Oct. 2023.

H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” Jan. 2021, Accessed: Jul. 17, 2025. [Online]. Available: https://arxiv.org/abs/2012.12877

D. R. Fauzi and G. A. H. D, “Comparison of CNN Models Using EfficientNetB0, MobileNetV2, and ResNet50 for Traffic Density with Transfer Learning,” Journal of Intelligent Systems Technology and Informatics, vol. 1, no. 1, pp. 22–30, Jun. 2025, Accessed: Jun. 29, 2025. [Online]. Available: https://journal.aptika.org/index.php/jistics/article/view/6

I. D. Id, MACHINE LEARNING: Teori, Studi Kasus dan Implementasi Menggunakan Python, 1st ed. UR PRESS, 2021. doi: 10.5281/zenodo.5113507.

A. D. Ramdani, “Aksara Sunda,” www.kaggle.com. Accessed: Jun. 29, 2025. [Online]. Available: https://www.kaggle.com/datasets/abdidwiramdani/aksara-sunda

Aksara Sunda Dataset, “Aksara Sunda Computer Vision Project,” universe.roboflow.com. Accessed: Jun. 29, 2025. [Online]. Available: https://universe.roboflow.com/aksarasunda/aksara-sunda-eayhq

L. Abdiansah, S. Sumarno, A. Eviyanti, and N. L. Azizah, “Penerapan Algoritma Convolutional Neural Networks untuk Pengenalan Tulisan Tangan Aksara Jawa,” MALCOM: Indonesian Journal of Machine Learning and Computer Science, vol. 5, no. 2, pp. 496–504, Mar. 2025, doi: 10.57152/malcom.v5i2.1814.

F. Bougourzi, F. Dornaika, and C. Zhang, “Extremely Fine-Grained Visual Classification over Resembling Glyphs in the Wild,” Aug. 2024, Accessed: Jul. 17, 2025. [Online]. Available: https://arxiv.org/abs/2408.13774

D. Pant, D. Talukder, D. Kumar, R. Pandey, A. Seth, and C. Arora, “Use of Metric Learning for the Recognition of Handwritten Digits, and its Application to Increase the Outreach of Voice-based Communication Platforms,” Apr. 2025, Accessed: Jul. 17, 2025. [Online]. Available: https://arxiv.org/abs/2504.18948


Published

2025-09-14

How to Cite

[1]
Y. Agustiansyah and D. R. Fauzi, “From Local Features to Global Context: Comparing CNN and Transformer for Sundanese Script Classification”, J. Intell. Syst. Technol. Inform., vol. 1, no. 2, pp. 53–61, Sep. 2025.
