PERFORMANCE COMPARISON OF RANDOM FOREST AND GRADIENT BOOSTING FOR PREDICTION ON THE CUSTOMER SHOPPING TRENDS DATASET

Ferdiana Putri
Dede Brahma Arianto

Abstract

This study compares the performance of two machine learning algorithms, Random Forest and Gradient Boosting, in predicting product categories on the Customer Shopping Trends dataset. Because the dataset has an imbalanced class distribution, oversampling is applied to improve the representation of minority classes. The analysis consists of data exploration, preprocessing, and model evaluation using accuracy, precision, recall, and F1-score. The results indicate that Random Forest performs more consistently and better than Gradient Boosting, particularly on minority classes: it achieved higher accuracy and more balanced evaluation metrics across all classes. The study offers insight into how effectively ensemble algorithms handle data imbalance and into their relevance for practical applications such as e-commerce and customer data analysis.
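The two evaluation ingredients named in the abstract, oversampling of minority classes and per-class precision, recall, and F1-score, can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not say which oversampling technique was used, so plain random duplication stands in for it; the function names `random_oversample` and `per_class_metrics`, the category labels, and all counts are hypothetical toy values rather than data from the study; and the predictions are hand-written placeholders for the output of a trained Random Forest or Gradient Boosting model.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def per_class_metrics(y_true, y_pred):
    """Return ({class: (precision, recall, f1)}, accuracy)."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[cls] = (precision, recall, f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return report, accuracy


# Hypothetical imbalanced toy labels (not taken from the study).
labels = ["Clothing"] * 6 + ["Accessories"] * 3 + ["Outerwear"] * 1
rows = [[i] for i in range(len(labels))]
bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))

# Hand-written placeholder predictions standing in for a model's output.
y_true = ["Clothing", "Clothing", "Accessories", "Outerwear"]
y_pred = ["Clothing", "Accessories", "Accessories", "Clothing"]
report, accuracy = per_class_metrics(y_true, y_pred)
for cls, (p, r, f1) in report.items():
    print(f"{cls}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

In practice, oversampling is normally applied to the training split only, so that duplicated rows do not leak into the test set. Reporting metrics per class rather than accuracy alone is what exposes weak minority-class performance, which is the comparison the abstract emphasizes.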


 


Article Details

How to Cite
Putri, F., & Arianto, D. B. (2024). PERBANDINGAN PERFORMA RANDOM FOREST DAN GRADIENT BOOSTING DALAM PREDIKSI PADA DATASET CUSTOMER SHOPPING TRENDS. Kohesi: Jurnal Sains Dan Teknologi, 5(11), 1–10. https://doi.org/10.3785/kohesi.v5i11.9030
Section
Articles
Author Biographies

Ferdiana Putri, Universitas Muhammadiyah Surakarta

Management, Faculty of Economics, Universitas Muhammadiyah Surakarta

Dede Brahma Arianto, Universitas Faletehan

Informatics, Faculty of Science and Engineering, Universitas Faletehan
