Literature Journal Research: Use of Stemming Methods for the Ambonese Malay Regional Language

DOI: https://doi.org/10.33650/jeecom.v6i1.8517

Author(s)

(1) * Vinnesa Patricia Carolina (Universitas AMIKOM Yogyakarta), Indonesia
(2) Ema Utami (Universitas AMIKOM Yogyakarta), Indonesia
(3) Ainul Yaqin (Universitas AMIKOM Yogyakarta), Indonesia
(*) Corresponding Author

Abstract


Stemming in the Ambonese language is a significant challenge because of the breadth of its lexicon, which covers about 127,000 base words as recorded in the Kamus Besar Bahasa Indonesia. The difficulty comes from the task itself: extracting base words from affixed words requires removing prefixes, infixes, suffixes, and their combinations. This process matters a great deal because it strongly affects the quality of the analysis results.
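The affix-removal task described above can be illustrated with a minimal Python sketch. It is not an implementation of any of the algorithms named in this abstract. The tiny lexicon, the affix lists, and the function names are all assumptions chosen for illustration, and the examples use standard Indonesian words rather than Ambonese-specific forms.

```python
# Toy lexicon of base words (assumption); a real stemmer loads a full dictionary.
BASE_WORDS = {"makan", "main", "baca", "getar", "ajar"}

# Simplified affix inventories (assumption), far smaller than real rule sets.
PREFIXES = ("meng", "mem", "men", "me", "ber", "per", "pel", "di", "ke", "ter")
SUFFIXES = ("kan", "an", "i")
INFIXES = ("em", "el", "er")


def strip_infix(word):
    """Yield candidates with an infix removed right after the first letter."""
    for infix in INFIXES:
        if word[1:1 + len(infix)] == infix:
            yield word[0] + word[1 + len(infix):]


def stem(word):
    """Return a base word if stripping affixes leads into the lexicon."""
    word = word.lower()
    if word in BASE_WORDS:
        return word
    # Candidates: the word itself plus each variant with one suffix removed.
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            candidates.append(word[:-len(suffix)])
    for cand in candidates:
        if cand in BASE_WORDS:
            return cand
        # Try removing one prefix, then check the lexicon.
        for prefix in PREFIXES:
            if cand.startswith(prefix) and cand[len(prefix):] in BASE_WORDS:
                return cand[len(prefix):]
        # Try removing one infix, then check the lexicon.
        for without_infix in strip_infix(cand):
            if without_infix in BASE_WORDS:
                return without_infix
    # Unknown word: return it unchanged rather than guess.
    return word


for w in ["makanan", "bermain", "permainan", "gemetar", "rumah"]:
    print(w, "->", stem(w))
```

Checking every candidate against a dictionary is what keeps the sketch from over-stemming words such as "rumah", which it returns unchanged. It is also why the size and coverage of the lexicon matter so much for the result.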

To address this linguistic complexity, several stemming algorithms have been developed. They include Nazief & Adriani, Enhanced Confix Stripping, Sastrawi, and Tala, each offering its own technique for handling the complexity of stemming in Indonesian. Choosing the right algorithm is essential to ensure the accuracy and reliability of the stemming process within an analysis framework.

The stemming studies reviewed vary in the methods they use. The most frequently used algorithm is Nazief & Adriani, with 17 recorded cases. Enhanced Confix Stripping is also fairly popular, with 12 cases. Sastrawi appears less often, in 4 cases, and Tala, though rarely used, still appears in 1 case. This reflects the range of options available when choosing a stemming method suited to a study's needs. The distribution may also be related to factors such as ongoing research projects, the availability of funding, or other external conditions that affected research output in the period covered. Research on stemming therefore remains an interesting and relevant topic, with the potential to keep developing and to contribute meaningfully to text processing and linguistic research in the future.
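The reported counts can be turned into shares with a few lines of Python. This is only an arithmetic sketch: it assumes the four counts are disjoint and that their sum (34) is the relevant total, which the abstract does not state explicitly.

```python
# Case counts as reported in the abstract.
counts = {
    "Nazief & Adriani": 17,
    "Enhanced Confix Stripping": 12,
    "Sastrawi": 4,
    "Tala": 1,
}

# Assumption: the total is simply the sum of the four reported counts.
total = sum(counts.values())

# Share of each algorithm as a percentage, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} of {total} ({pct}%)")
```

Under that assumption, Nazief & Adriani accounts for half of the recorded cases, and together with Enhanced Confix Stripping it covers roughly 85% of them.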


Keywords

Ambonese stemming; Nazief & Adriani; Enhanced Confix Stripping; Sastrawi; Tala









Copyright (c) 2024 Vinnesa Patricia Carolina, Ema Utami, Ainul Yaqin

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Journal of Electrical Engineering and Computer (JEECOM)
Published by LP3M Nurul Jadid University, Indonesia, Probolinggo, East Java, Indonesia.