Indonesian Sign Language Alphabet Image Classification using Vision Transformer

Authors

  • Yoga Agustiansyah, Institut Teknologi Garut
  • Dede Kurniadi, Institut Teknologi Garut

DOI:

https://doi.org/10.64878/jistics.v1i1.5

Keywords:

Indonesian Sign Language, Vision Transformer, Image Classification, Deep Learning, Sign Language Recognition, Transfer Learning

Abstract

Effective communication is fundamental to social interaction, yet people with hearing impairments often face significant barriers. Indonesian Sign Language (BISINDO) is a vital communication tool for the deaf community in Indonesia, but limited public understanding of BISINDO hinders communication between deaf and hearing people and motivates an accurate automatic recognition system. This research investigates how well the Vision Transformer (ViT), a state-of-the-art deep learning architecture, classifies static BISINDO alphabet images, and whether its feature extraction can overcome the limitations of previous approaches. The methodology used a dataset of 26 BISINDO alphabet classes, preprocessed with class balancing via augmentation and image normalization. The google/vit-base-patch16-224-in21k ViT model was adapted with a custom classification head and trained in two phases: initial feature extraction with a frozen backbone, followed by fine-tuning of the full network. On the unseen test set, the fine-tuned model achieved an accuracy of 99.77% (95% CI: 99.55%–99.99%), a precision of 99.77%, a recall of 99.72%, and a weighted F1-score of 0.9977, exceeding the results reported for many earlier methods. These findings indicate that ViT is a highly effective and robust solution for BISINDO alphabet image classification and underscore the potential of Transformer-based architectures for building accurate assistive communication technologies for the Indonesian deaf and hard-of-hearing community.
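
The abstract reports a 95% confidence interval for accuracy but does not say how it was derived. The reported bounds are symmetric around 99.77% (0.22 percentage points on each side), which is consistent with the common normal-approximation interval for a proportion. The Python sketch below shows that calculation under this assumption. The function name `accuracy_confidence_interval` and the test-set size `N_TEST_HYPOTHETICAL` are illustrative only; the abstract does not state the actual test-set size.

```python
import math


def accuracy_confidence_interval(accuracy, n_samples, z=1.96):
    """Normal-approximation interval for a classification accuracy.

    accuracy  : observed accuracy as a fraction in [0, 1]
    n_samples : number of test samples the accuracy was measured on
    z         : critical value (1.96 corresponds to a 95% interval)
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")

    # Standard error of a binomial proportion, scaled by the critical value.
    half_width = z * math.sqrt(accuracy * (1.0 - accuracy) / n_samples)

    # Clip to [0, 1], since an accuracy cannot fall outside that range.
    lower = max(0.0, accuracy - half_width)
    upper = min(1.0, accuracy + half_width)
    return lower, upper


# ASSUMPTION: hypothetical test-set size, not taken from the paper.
N_TEST_HYPOTHETICAL = 2000

low, high = accuracy_confidence_interval(0.9977, N_TEST_HYPOTHETICAL)
print(f"95% CI: {low:.2%} to {high:.2%}")
```

The interval narrows as the test set grows, so a reported accuracy is easier to interpret when the test-set size is given alongside it. For accuracies this close to 100%, the normal approximation can be loose, and an interval such as Wilson's may be a reasonable alternative.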

Published

2025-06-15 — Updated on 2025-06-17

How to Cite

[1]
Y. Agustiansyah and D. Kurniadi, “Indonesian Sign Language Alphabet Image Classification using Vision Transformer”, J. Intell. Syst. Technol. Inform., vol. 1, no. 1, pp. 1–9, Jun. 2025.
