
ANALISIS DATA ULASAN PENGUNJUNG MENGGUNAKAN LEXICON BASED, SUPPORT VECTOR MACHINE, RANDOM FOREST DALAM MENENTUKAN SKALA PRIORITAS PEMBANGUNAN OBJEK WISATA LABUAN BAJO

DAHUR, Arnoldus Janssen and Syafei, Wahyul Amien and Prahasto, Toni (2024) ANALISIS DATA ULASAN PENGUNJUNG MENGGUNAKAN LEXICON BASED, SUPPORT VECTOR MACHINE, RANDOM FOREST DALAM MENENTUKAN SKALA PRIORITAS PEMBANGUNAN OBJEK WISATA LABUAN BAJO. Masters thesis, UNIVERSITAS DIPONEGORO.

Abstract

Labuan Bajo is one of Indonesia's super-priority tourist destinations. Obtaining and analyzing visitor reviews is important for understanding visitors' preferences and their views of the existing facilities and services. This research therefore collects and analyzes visitor review data from TripAdvisor and Google Maps. A Lexicon-Based method is used for labeling, and Support Vector Machine (SVM) and Random Forest are used for classification. Lexicon-Based labeling yielded 4187 positive reviews, 1796 negative reviews, and 1774 neutral reviews. Because the data are imbalanced, classification was performed both with and without undersampling. With undersampling, SVM achieved an accuracy of 0.89, precision of 0.95, recall of 0.85, f1-measure of 0.90, and ROC AUC of 0.94; without undersampling, it achieved an accuracy of 0.79, precision of 0.80, recall of 0.94, f1-measure of 0.86, and ROC AUC of 0.83. With undersampling, Random Forest achieved an accuracy of 0.87, precision of 0.91, recall of 0.86, f1-measure of 0.88, and ROC AUC of 0.93; without undersampling, it achieved an accuracy of 0.77, precision of 0.78, recall of 0.94, f1-measure of 0.85, and ROC AUC of 0.81. The priority scale was determined by extracting the top 10 words related to development and the sentiment count of each word. The frequently occurring positive words were 'beautiful' (indah), 'natural' (alami), 'exotic' (eksotik), 'scenic' (pandang), 'clean' (bersih), 'ancient' (purba), 'amazed' (takjub), and 'historical' (sejarah), which indicates that natural and historical assets must continue to be preserved. The frequently occurring negative words were 'expensive' (mahal), 'cost' (biaya), 'guide' (pandu), 'road' (jalan), 'garbage' (sampah), and 'hot' (panas). Based on these words, the development of transportation and infrastructure is needed to enhance the attractiveness of Labuan Bajo as a tourist destination.
Keywords: Review Analysis, Labuan Bajo, Lexicon Based, Support Vector Machine, Random Forest
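
The labeling and balancing steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the lexicon entries, the whitespace tokenizer, the function names, and the choice to balance all three classes down to the smallest one are assumptions made only for illustration, since the abstract does not specify them.

```python
import random

# Hypothetical toy lexicon (assumption): word -> polarity score.
# The lexicon actually used in the thesis is not given in the abstract.
LEXICON = {
    "beautiful": 1, "natural": 1, "clean": 1,
    "expensive": -1, "garbage": -1, "hot": -1,
}

def label_review(text):
    """Label a review by the sign of its summed word polarities."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        balanced.extend((s, label) for s in rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced

def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

Applied to the class counts reported above (4187, 1796, and 1774), this kind of undersampling would leave 1774 reviews per class. The `f1` helper also gives a quick consistency check on the reported metrics: a precision of 0.95 and recall of 0.85 give an f1-measure of about 0.90, matching the SVM result with undersampling.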

Item Type: Thesis (Masters)
Uncontrolled Keywords: Review Analysis, Labuan Bajo, Lexicon Based, Support Vector Machine, Random Forest
Subjects: Sciences and Mathematics
Divisions: Postgraduate Program > Master Program in Information System
Depositing User: ekana listianawati
Date Deposited: 06 May 2024 04:22
Last Modified: 06 May 2024 04:22
URI: https://eprints2.undip.ac.id/id/eprint/22802
