
INTEGRASI RANDOM FOREST, ADASYN, DAN SHAP UNTUK PREDIKSI DIABETES DAN INTERPRETABILITAS [Integration of Random Forest, ADASYN, and SHAP for Diabetes Prediction and Interpretability]

AULIA, Hozana and Wibowo, Adi and Sutrisno, Sutrisno (2025) INTEGRASI RANDOM FOREST, ADASYN, DAN SHAP UNTUK PREDIKSI DIABETES DAN INTERPRETABILITAS. Masters thesis, UNIVERSITAS DIPONEGORO.

Cover-1.pdf (118kB)
Cover.pdf (1MB), restricted to repository staff only
Bab I.pdf (355kB)
Bab II.pdf (694kB)
Bab III.pdf (480kB), restricted to repository staff only
Bab IV.pdf (989kB), restricted to repository staff only
Bab V.pdf (343kB), restricted to repository staff only
Daftar Pustaka.pdf (325kB)
Lampiran.pdf (409kB), restricted to repository staff only

Abstract

Diabetes is a chronic disease with high global prevalence that can lead to serious complications if it is not detected early. Two challenges in developing diabetes prediction models are class imbalance in the data and the limited interpretability of predictions produced by machine learning models. This study aims to integrate random forest, ADASYN, and SHAP into a unified approach for building a diabetes classification model that is both accurate and more transparently explainable. The dataset used is PIMA Indians, which is imbalanced. The method comprises data preprocessing, class balancing with ADASYN, model training with random forest, and interpretation of the predictions with SHAP. The model achieved an accuracy of 82%, precision of 70%, recall of 83%, and an F1-score of 76%. The features Glucose, BMI, and Age were found to be the main contributors to the prediction. Further SHAP analysis identified individuals aged 30-50 years with obesity as a high-risk group. The resulting model contributes to machine learning-based predictive information systems that are not only effective but also interpretable and accountable in support of clinical decisions.
Keywords: Diabetes Prediction, Random Forest, ADASYN, SHAP, Interpretability, Machine Learning
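
The class-balancing step named in the abstract, ADASYN, can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation, which is not shown on this page. The function name `adasyn`, the parameters `k`, `beta`, and `seed`, and the toy data are all illustrative assumptions. The idea is that minority samples surrounded by more majority-class neighbours receive more synthetic points, each created by interpolating towards a nearby minority sample.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples generated in the style of ADASYN."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    # Number of synthetic samples needed (beta=1.0 aims for full balance)
    total = round((n_maj - n_min) * beta)
    if total <= 0 or n_min < 2:
        return []

    def nearest(i, candidates, count):
        # Indices of the `count` candidates closest to X[i], excluding i itself
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:count]

    # Difficulty ratio: share of majority points among the k nearest neighbours
    ratios = []
    for i in minority_idx:
        neigh = nearest(i, range(len(X)), k)
        ratios.append(sum(y[j] != minority for j in neigh) / len(neigh))
    norm = sum(ratios)
    # Fall back to uniform weights when no minority point has majority neighbours
    weights = [r / norm for r in ratios] if norm else [1 / n_min] * n_min

    synthetic = []
    for i, w in zip(minority_idx, weights):
        neigh_min = nearest(i, minority_idx, k)
        # Per-sample rounding means the final count can differ slightly from `total`
        for _ in range(round(w * total)):
            j = rng.choice(neigh_min)
            lam = rng.random()
            # Interpolate between X[i] and a neighbouring minority sample
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (hypothetical, not the PIMA Indians dataset): 8 majority, 3 minority
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 0], [0, 2], [2, 2], [1, 2],
     [5, 5], [6, 5], [5, 6]]
y = [0] * 8 + [1] * 3
new_points = adasyn(X, y, minority=1, k=3)
print(len(new_points), "synthetic minority samples")
```

In practice a library implementation such as the `ADASYN` class in imbalanced-learn's `imblearn.over_sampling` module would typically be used, applied to the training split only so that synthetic points do not leak into evaluation.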

Diabetes is a chronic disease with high prevalence worldwide and can cause serious complications if not detected early. Two challenges in developing a diabetes prediction model are imbalanced data and the lack of interpretability of the predictions generated by machine learning models. This research develops an integration of the random forest, ADASYN, and SHAP methods as a single approach to building a diabetes classification model that is not only accurate but also more transparently explainable. The dataset used in this study is PIMA Indians, which is imbalanced. The research method includes data preprocessing, class balancing using ADASYN, model training using random forest, and interpretation of the prediction results with SHAP. The results show that the model achieved an accuracy of 82%, a precision of 70%, a recall of 81%, and an F1-score of 75%. The features Glucose, BMI, and Age were found to be the major contributors to the prediction. Further SHAP analysis identified individuals aged 30-50 years with obesity as a high-risk group. The resulting model contributes to machine learning-based predictive information systems that are not only effective but also interpretable and accountable in supporting clinical decisions.
Keywords: Diabetes Prediction, Random Forest, ADASYN, SHAP, Interpretability, Machine Learning
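
The two abstracts on this record report slightly different figures: recall 83% and F1-score 76% in the first, recall 81% and F1-score 75% in the second, both with precision 70%. As a small worked check, assuming the standard definition of F1 as the harmonic mean of precision and recall, the sketch below shows that each pair is internally consistent. The helper name `f1_from` is hypothetical, and the check cannot tell which pair matches the thesis itself.

```python
def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the first abstract on this record
f1_first = f1_from(0.70, 0.83)
# Precision and recall as reported in the second abstract on this record
f1_second = f1_from(0.70, 0.81)

print(f"first abstract:  F1 = {f1_first:.3f}")
print(f"second abstract: F1 = {f1_second:.3f}")
```

Rounded to whole percentages, the results agree with the F1-scores quoted alongside each recall value.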

Item Type: Thesis (Masters)
Uncontrolled Keywords: Diabetes Prediction, Random Forest, ADASYN, SHAP, Interpretability, Machine Learning
Subjects: Sciences and Mathematics
Divisions: Postgraduate Program > Master Program in Information System
Depositing User: ekana listianawati
Date Deposited: 10 Dec 2025 08:21
Last Modified: 10 Dec 2025 08:21
URI: https://eprints2.undip.ac.id/id/eprint/42034
