Classification of Multiple Emotions in Indonesian Text Using The K-Nearest Neighbor Method


  • Ahmad Zamsuri Universitas Lancang Kuning
  • Sarjon Defit Universitas Putra Indonesia “YPTK” Padang
  • Gunadi Widi Nurcahyo Universitas Putra Indonesia “YPTK” Padang



Keywords: Emotions, TF-IDF, BoW, KNN, Data Splitting


Emotions are expressions that individuals manifest in response to what they see or experience. This study examined emotions in tweets about the 2024 election in Indonesia. The collected tweets were labeled using the emotion wheel, which consisted of six categories: joy, love, surprise, anger, fear, and sadness. After labeling, the terms were weighted using two techniques: TF-IDF (Term Frequency-Inverse Document Frequency) and Bag-of-Words (BoW). Classification was then performed with the K-Nearest Neighbor (KNN) algorithm and evaluated under three data splitting ratios: 80:20, 70:30, and 60:40. Accuracy was first calculated for the six-label model. The labels were then merged into positive and negative categories, and the same modeling process was repeated on the two-label data. The results showed that TF-IDF outperformed BoW. The highest accuracy was achieved with the 80:20 data splitting ratio: 58% for the six-label classification and 79% for the two-label classification.
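
The pipeline described in the abstract (term weighting, KNN classification, hold-out splitting, and merging six labels into two) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes whitespace tokenization, a smoothed IDF formula, cosine similarity as the KNN proximity measure, k = 3, and a hypothetical polarity mapping; the abstract specifies none of these. The toy tweets are invented English examples, not data from the study. In practice a library such as scikit-learn would normally be used; only the Python standard library appears here so the sketch stays self-contained.

```python
import math
import random
from collections import Counter

# ASSUMPTION: illustrative mapping only. The abstract does not state which
# emotions were merged into "positive" and which into "negative".
POLARITY = {"joy": "positive", "love": "positive", "surprise": "positive",
            "anger": "negative", "fear": "negative", "sadness": "negative"}

def tokenize(text):
    # ASSUMPTION: lowercase + whitespace split; real preprocessing is richer.
    return text.lower().split()

def bow_vector(text):
    """Bag-of-Words: raw term counts for one document."""
    return dict(Counter(tokenize(text)))

def fit_idf(train_texts):
    """Smoothed inverse document frequency learned from the training texts."""
    n_docs = len(train_texts)
    doc_freq = Counter()
    for text in train_texts:
        doc_freq.update(set(tokenize(text)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0
            for t, df in doc_freq.items()}

def tfidf_vector(text, idf):
    """TF-IDF: term count times IDF; terms unseen in training are dropped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query_vec, train_vecs[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def split_data(items, test_ratio, seed=0):
    """Shuffled hold-out split, e.g. test_ratio=0.2 for an 80:20 split."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_ratio)
    return items[n_test:], items[:n_test]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical toy tweets (invented for illustration, not from the study).
TOY_TWEETS = [
    ("happy proud vote today", "joy"),
    ("happy celebrate result", "joy"),
    ("angry fraud vote", "anger"),
    ("angry cheat result fraud", "anger"),
    ("afraid riot tonight", "fear"),
    ("afraid violence riot", "fear"),
]
texts = [text for text, _ in TOY_TWEETS]
labels = [label for _, label in TOY_TWEETS]
merged_labels = [POLARITY[label] for label in labels]

idf = fit_idf(texts)
tfidf_train = [tfidf_vector(t, idf) for t in texts]
bow_train = [bow_vector(t) for t in texts]

query = "angry about fraud"
six_label = knn_predict(tfidf_vector(query, idf), tfidf_train, labels)
two_label = knn_predict(tfidf_vector(query, idf), tfidf_train, merged_labels)
bow_label = knn_predict(bow_vector(query), bow_train, labels)
```

To compare the two weighting schemes as the study does, one would call `split_data` with test ratios 0.2, 0.3, and 0.4, fit the IDF on the training portion only, predict each held-out tweet with both the TF-IDF and the BoW vectors, and report `accuracy` for each combination, first with the six labels and then with the merged labels.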








How to Cite

Zamsuri, A., Defit, S., & Nurcahyo, G. W. (2023). Classification of Multiple Emotions in Indonesian Text Using The K-Nearest Neighbor Method. Journal of Applied Engineering and Technological Science (JAETS), 4(2), 1012–1021.