ANALYSIS OF HOTEL REVIEWS ON THE TRIPADVISOR SITE USING THE TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY AND K-NEAREST NEIGHBOR METHODS

HUDA, Khairul and Widodo, Catur Edi and Gunawan S.K., Vincencius (2022) ANALISIS ULASAN HOTEL DI SITUS TRIPADVISOR MENGGUNAKAN METODE TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY DAN K-NEAREST NEIGHBOR. Masters thesis, School of Postgraduate Studies.

Abstract

Products and services are conventionally evaluated through interviews, surveys, and questionnaires, which can make the resulting analysis inaccurate and inconsistent. To address this problem, this study builds a text mining system that analyzes hotel customer reviews on the TripAdvisor site and classifies them as positive, negative, or neutral. The study aims to apply the Term Frequency-Inverse Document Frequency (TF-IDF) and K-Nearest Neighbor algorithms and to evaluate the resulting system at its best-performing level of accuracy. K-Nearest Neighbor was chosen as the classification algorithm because of its high computational performance, its robustness to the varied characteristics of large data, and its relatively low algorithmic complexity. The results show that the system can classify hotel reviews on the TripAdvisor site as positive, negative, or neutral, with the best performance at K = 31 and an accuracy of 76% on the training data. Applying random over-sampling to rebalance the data increased accuracy to as much as 84%.
Keywords: text mining, term frequency-inverse document frequency, k-nearest neighbor, random over-sampling
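
The pipeline named in the abstract (TF-IDF weighting, random over-sampling, then K-Nearest Neighbor voting) can be illustrated with a minimal, standard-library Python sketch. This is not the thesis implementation: the toy reviews, the whitespace tokenizer, the smoothed IDF formula, the cosine similarity measure, and k=3 are all assumptions made for illustration (the abstract reports K = 31 on the real data, which is far larger than this toy set).

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def random_oversample(docs, labels, seed=0):
    """Duplicate random minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for doc, label in zip(docs, labels):
        by_class.setdefault(label, []).append(doc)
    target = max(len(items) for items in by_class.values())
    out_docs, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_docs.extend(items + extra)
        out_labels.extend([label] * target)
    return out_docs, out_labels


def knn_predict(query, train_vecs, train_labels, idf, k=3):
    """Majority vote among the k training reviews most similar to the query."""
    q = tfidf(query, idf)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(q, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy reviews, invented for illustration only (not thesis data).
reviews = [
    "clean room friendly staff great breakfast",
    "great location friendly staff",
    "lovely clean room great view",
    "wonderful stay great service",
    "dirty room rude staff",
    "average hotel nothing special",
]
labels = ["positive", "positive", "positive", "positive", "negative", "neutral"]

# Rebalance the classes, then fit IDF weights and vectorise the training set.
bal_docs, bal_labels = random_oversample(reviews, labels)
idf = fit_idf(bal_docs)
train_vecs = [tfidf(d, idf) for d in bal_docs]

prediction = knn_predict("rude staff and dirty room", train_vecs, bal_labels, idf, k=3)
print(prediction)
```

Without the over-sampling step, the single negative review in this toy set would be outvoted by positive neighbors at k=3, which is the kind of class-imbalance effect that rebalancing is meant to reduce. In a real evaluation, over-sampling would normally be applied only to the training split so that duplicated reviews do not leak into the test data.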

Item Type:	Thesis (Masters)
Uncontrolled Keywords:	text mining, term frequency-inverse document frequency, k-nearest neighbor, random over-sampling
Subjects:	Sciences and Mathematics
Divisions:	Postgraduate Program > Master Program in Information System
Depositing User:	ekana listianawati
Date Deposited:	16 Nov 2022 08:35
Last Modified:	16 Nov 2022 08:35
URI:	https://eprints2.undip.ac.id/id/eprint/9723
