Implementation of OCR (Optical Character Recognition) Using Tesseract in Detecting Character in Quotes Text Images
DOI:
https://doi.org/10.37385/jaets.v4i1.905
Keywords:
Optical Character Recognition, Tesseract, Quotes
Abstract
Technology in Indonesia continues to advance and has become an unavoidable part of daily life. Artificial Intelligence is increasingly used to help people solve problems, and computers and smartphones make these tools widely accessible. One such application is Optical Character Recognition (OCR), which is widely used to extract characters from digital images. This research is motivated by an existing system that still relies on manual input and needs technological development to detect characters in quote text images. The performance of OCR methods depends heavily on normalization, the initial process that precedes later stages such as segmentation and identification. Image normalization aims to produce a better input image so that segmentation and identification can achieve optimal accuracy, and obtaining the best results requires several pre-processing stages on the image. In this work, OCR is performed using Tesseract-OCR. The resulting OCR program was successfully used to scan quote text images, which is useful if the original document is lost or damaged, and it saves time in creating, processing, and typing documents.
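The abstract does not name the specific pre-processing stages, so the following is only a minimal sketch of one common normalization step, under stated assumptions: the image is converted to grayscale and binarized with Otsu thresholding before being handed to Tesseract. The choice of Otsu, the function names `otsu_threshold`, `binarize`, and `ocr_quote_image`, and the use of the Pillow and `pytesseract` packages are all illustrative assumptions, not the authors' confirmed pipeline.

```python
def otsu_threshold(histogram):
    """Return the gray level t that maximizes between-class variance.

    `histogram` is a list of pixel counts indexed by gray level.
    Pixels with value <= t form one class, the rest form the other.
    """
    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t, count in enumerate(histogram):
        w0 += count
        sum0 += t * count
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map a grayscale image (list of rows) to pure black (0) and white (255)."""
    return [[255 if p > t else 0 for p in row] for row in pixels]


def ocr_quote_image(path, lang="eng"):
    """Hypothetical end-to-end helper: normalize an image file, then run Tesseract.

    Assumes Pillow, pytesseract, and the Tesseract binary are installed.
    """
    from PIL import Image
    import pytesseract

    gray = Image.open(path).convert("L")      # grayscale
    t = otsu_threshold(gray.histogram())      # 256-bin histogram for mode "L"
    binary = gray.point(lambda p: 255 if p > t else 0)
    return pytesseract.image_to_string(binary, lang=lang)
```

Calling `ocr_quote_image("quote.png")` (a hypothetical file name) would return the recognized text as a string. The thresholding functions are kept free of third-party dependencies so that the normalization step can be inspected on its own.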