Implementation of OCR (Optical Character Recognition) Using Tesseract in Detecting Character in Quotes Text Images

Authors

  • Ikha Novie Tri Lestari, STIKOM CKI Jakarta
  • Dadang Iskandar Mulyana, STIKOM Cipta Karya Informatika

DOI:

https://doi.org/10.37385/jaets.v4i1.905

Keywords:

Optical Character Recognition, Tesseract, Quotes

Abstract

Technology in Indonesia continues to advance and has become an unavoidable part of everyday life, and Artificial Intelligence is increasingly used to help people solve problems through computers and smartphones. One such application is Optical Character Recognition (OCR), which is widely used to extract characters from digital images. This research is motivated by an existing system in which the text of quote images is still entered manually and which therefore needs a technological upgrade to detect the characters automatically. The performance of OCR methods depends heavily on normalization, the initial step that precedes later stages such as segmentation and identification. Normalization aims to produce a better input image so that segmentation and identification can reach optimal accuracy, which is why several pre-processing stages are applied to the image before recognition. Recognition itself is carried out with Tesseract-OCR. The resulting OCR program successfully scanned quote text images, which makes it possible to recover the text if the original document is lost or damaged and saves time in creating, processing, and typing documents.
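The abstract does not say which pre-processing stages were applied, so the following is only a minimal sketch of what a normalization step could look like. It assumes grayscale conversion followed by Otsu binarization, a common choice before OCR, and it is not the authors' pipeline. The function names `to_grayscale`, `otsu_threshold`, and `binarize` are hypothetical, and images are represented as plain nested lists so the example runs with the standard library alone.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray levels.

    Uses the common luminance weights 0.299, 0.587, 0.114 (an assumption;
    the source does not specify a conversion).
    """
    return [
        [min(255, round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in rgb_image
    ]


def otsu_threshold(pixels):
    """Return the gray level that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]            # pixels at or below t (background class)
        if w_bg == 0:
            continue
        w_fg = total - w_bg        # pixels above t (foreground class)
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray_image):
    """Map each pixel to 0 (at or below threshold) or 255 (above threshold)."""
    flat = [p for row in gray_image for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in gray_image]


# Hypothetical 2x2 image: dark "text" pixels on a light background.
sample = [[20, 200], [220, 30]]
binary = binarize(sample)
```

In a real workflow the binarized image would then be handed to Tesseract, for example through the `pytesseract` wrapper's `image_to_string` function. That step is left out here because it requires the Tesseract binary to be installed.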

Published

2022-09-02

How to Cite

Lestari, I. N. T., & Mulyana, D. I. (2022). Implementation of OCR (Optical Character Recognition) Using Tesseract in Detecting Character in Quotes Text Images. Journal of Applied Engineering and Technological Science (JAETS), 4(1), 58–63. https://doi.org/10.37385/jaets.v4i1.905