Predicting Customer Satisfaction from Modem Data Using Machine Learning Algorithms
Abstract
The KNIME Analytics Platform was used for every stage of the work, including data
transfer, parsing, and algorithm testing. Modem data was analyzed on a weekly basis, and
download and upload data was categorized and evaluated across six time slots. For
classification, AutoML was used to assess algorithms such as Naive Bayes,
Logistic Regression, Neural Networks, Gradient Boosted Trees, Decision Trees, Random
Forest, and XGBoost. The libraries and platforms used include H2O for
Generalized Linear Models, the Keras library for Deep Learning, and H2O AutoML for
various other algorithms.
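The weekly, six-slot aggregation described above can be sketched in plain Python. This is only an illustration, not the KNIME workflow used in the study: the abstract does not give the slot boundaries, so six equal four-hour slots are assumed here, and the record layout and the names `SLOT_HOURS`, `slot_index`, and `weekly_slot_totals` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: the six daily time slots are equal 4-hour windows
# (00-04, 04-08, ..., 20-24). The study's actual boundaries are not stated.
SLOT_HOURS = 4


def slot_index(ts: datetime) -> int:
    """Return the slot number (0..5) that a timestamp falls into."""
    return ts.hour // SLOT_HOURS


def weekly_slot_totals(records):
    """Sum download/upload volume per modem, ISO week, and time slot.

    records: iterable of (modem_id, timestamp, download, upload) tuples
             (a hypothetical layout chosen for this sketch).
    Returns a dict keyed by (modem_id, iso_year, iso_week, slot)
    with (total_download, total_upload) values.
    """
    totals = defaultdict(lambda: [0, 0])
    for modem_id, ts, download, upload in records:
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        key = (modem_id, iso[0], iso[1], slot_index(ts))
        totals[key][0] += download
        totals[key][1] += upload
    return {key: tuple(value) for key, value in totals.items()}


if __name__ == "__main__":
    sample = [
        ("m1", datetime(2024, 1, 1, 3, 0), 100, 10),
        ("m1", datetime(2024, 1, 2, 2, 30), 50, 5),
        ("m1", datetime(2024, 1, 1, 5, 0), 70, 7),
    ]
    for key, value in sorted(weekly_slot_totals(sample).items()):
        print(key, value)
```

Each resulting row (one per modem, week, and slot) could then serve as a feature vector for the classifiers listed above.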
The aim of this study is to identify dissatisfied customers. Because the dataset is
imbalanced, several sampling methods were used. Data from modems with faulty
signal information and data from subscribers who filed a service complaint were used for
labeling. The model was improved by reducing the data to four principal components
using Principal Component Analysis (PCA) and then enriching it with the SMOTE
(Synthetic Minority Over-sampling Technique) method. Tree-based algorithms yielded
better results on the imbalanced classification problem. Algorithms were
evaluated based on the geometric mean of Sensitivity (TPR) and Specificity (TNR),
the weighted average (WPN), and Bookmaker Informedness (BM). Because the
results were close, the False Positive (FP) rate was chosen as the final criterion, to
minimize the cost of investing in customers incorrectly flagged as dissatisfied. XGBoost
provided the best results among the ten algorithms applied.
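The evaluation criteria named above can be computed from a binary confusion matrix, as in the minimal sketch below. It uses the standard definitions: the geometric mean is sqrt(TPR x TNR), Bookmaker Informedness is TPR + TNR - 1, and the False Positive rate is FP / (FP + TN). The weighted average (WPN) is left out because the abstract does not define its weights. The function names and the choice of label 1 for "dissatisfied" are assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary problem.

    Assumption: `positive` marks the dissatisfied class.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def imbalance_metrics(tp, fp, tn, fn):
    """Return TPR, TNR, G-mean, Bookmaker Informedness, and FP rate."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false positive rate
    return {
        "tpr": tpr,
        "tnr": tnr,
        "g_mean": math.sqrt(tpr * tnr),
        "bm": tpr + tnr - 1.0,
        "fpr": fpr,
    }


if __name__ == "__main__":
    # Toy labels: 2 dissatisfied customers out of 10.
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(imbalance_metrics(*confusion_counts(y_true, y_pred)))
```

Unlike plain accuracy, both the geometric mean and Bookmaker Informedness drop to their chance-level values when a model simply predicts the majority class, which is why they suit imbalanced data; the FP rate then serves as a tie-breaker between models with similar scores.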