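ANALISIS SENTIMEN BERBASIS ASPEK UNTUK PEMANFAATAN DATA SARAN DAN MASUKAN PADA APLIKASI E-SKM JAWA TENGAH MENGGUNAKAN MODEL BERT
(Aspect-Based Sentiment Analysis for Utilizing Suggestion and Feedback Data in the Central Java E-SKM Application Using the BERT Model)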

MUSTOFA, Refo Labib and Tarno, Tarno and Widodo, Catur Edi (2026) ANALISIS SENTIMEN BERBASIS ASPEK UNTUK PEMANFAATAN DATA SARAN DAN MASUKAN PADA APLIKASI E-SKM JAWA TENGAH MENGGUNAKAN MODEL BERT. Masters thesis, UNIVERSITAS DIPONEGORO.

Files:
1. cover awal.pdf (173kB)
2. cover lengkap.pdf (742kB), restricted to repository staff only
3. BAB I.pdf (295kB)
4. BAB II.pdf (672kB)
5. BAB III.pdf (528kB), restricted to repository staff only
6. BAB IV.pdf (833kB), restricted to repository staff only
7. BAB V.pdf (256kB), restricted to repository staff only
8. Daftar Pustaka.pdf (260kB)
9. Lampiran.pdf (325kB), restricted to repository staff only

Abstract

The Central Java Provincial Government collects a large volume of public feedback through the Electronic Community Satisfaction Survey (E-SKM) application. However, evaluation tends to focus on the quantitative Community Satisfaction Index (IKM) score, leaving the qualitative data, in the form of suggestions and feedback, underutilized. This research aims to uncover deeper insights by classifying the sentiments and topics in this qualitative data. The methodology compares three approaches to sentiment classification: first, a lexicon-based approach; second, fine-tuning the IndoBERT model on a ground-truth dataset of 3,000 samples manually annotated by a linguistic expert; and third, using an IndoRoBERTa model that had already been through a pre-finetuning stage for sentiment classification. For aspect analysis, a rule-based approach was compared against topic modeling with BERTopic. The results show that the lexicon-based method is unreliable: it yields sentiment labels heavily biased towards the positive category (64.88%) and fails to capture sentence context such as negation. In contrast, the Transformer-based models performed far better. Specifically, the IndoRoBERTa model achieved the best performance, with F1-Scores of 0.8448 for the positive class, 0.9076 for the neutral class, and 0.8472 for the negative class, slightly outperforming the fine-tuned IndoBERT model. Qualitative analysis further confirmed the Transformer models' ability to interpret context accurately. In addition, BERTopic identified richer and more coherent thematic clusters than the rule-based approach.
Keywords: Aspect-based sentiment analysis, E-SKM, IndoBERT, IndoRoBERTa, natural language processing
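
The abstract reports that the lexicon-based method fails on sentence context such as negation. The following is a minimal sketch of why a naive word-score lexicon tends to behave that way. It is not the thesis's implementation: the `LEXICON` entries, the `lexicon_sentiment` function, and the example sentences are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy polarity lexicon (Indonesian words); the lexicon actually
# used in the thesis is not given in this record.
LEXICON = {
    "bagus": 1,    # good
    "cepat": 1,    # fast
    "ramah": 1,    # friendly
    "lambat": -1,  # slow
    "buruk": -1,   # bad
    "kecewa": -1,  # disappointed
}


def lexicon_sentiment(text: str) -> str:
    """Sum per-word polarity scores, ignoring word order and context."""
    score = sum(LEXICON.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


examples = [
    "pelayanan cepat dan ramah",  # "the service is fast and friendly"
    "pelayanan tidak ramah",      # "the service is not friendly"
]
for text in examples:
    print(text, "->", lexicon_sentiment(text))
```

In this sketch the negator "tidak" ("not") has no lexicon entry, so the second sentence is scored only on "ramah" and comes out positive. Adding hand-written negation rules can patch individual cases, whereas a fine-tuned Transformer classifier reads the whole sentence in context, which is consistent with the comparison the abstract describes.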

Item Type: Thesis (Masters)
Uncontrolled Keywords: Aspect-based sentiment analysis, E-SKM, IndoBERT, IndoRoBERTa, natural language processing
Subjects: Sciences and Mathematics
Divisions: Postgraduate Program > Master Program in Information System
Depositing User: ekana listianawati
Date Deposited: 10 Mar 2026 07:16
Last Modified: 10 Mar 2026 07:16
URI: https://eprints2.undip.ac.id/id/eprint/47143
