Classification Methods in the Presence of Outliers (Aykırı Değerler Varlığında Sınıflandırma Yöntemleri)

Aşlar Kırmızı, Cemile

Master's Thesis (1.478 MB)

Date

2023

Open access

Abstract

In data mining, classification techniques, which fall under supervised learning, are growing in importance as data continually change, diversify and multiply. This variety in the data creates a need for classification techniques to change and advance. Robust counterparts of the basic classification techniques are being developed to reduce misclassification rates. When outliers arise from these developments and changes in the data, choosing the right classification method becomes more important by the day. This study examines several classification techniques from machine learning. Analyses were carried out and interpreted on simulated and real data sets using the algorithms that are most widely used in the literature and reported as successful in the sources. Threshold metrics were used to quantify the prediction errors of the classification algorithms so that they could be interpreted. On this basis, the performance of each algorithm was evaluated by calculating sensitivity, specificity, overall accuracy and F1-score. To diversify the evaluation, the analyses were run on four types of simulated data sets and two real data sets. The results were tabulated and interpreted, and the F1-scores were also presented graphically. In the analyses on the simulated data sets, the methods that stood out were logistic regression; robust logistic regression, which has similar features; the tangentboost and gudermannianboost algorithms; robust linear discriminant analysis (RLDA); robust quadratic discriminant analysis (RQDA); and robust linear discriminant analysis with the OGK estimator (RLDA-OGK). In the analyses on the real data sets, logistic regression, robust logistic regression, tangentboost and gudermannianboost were found to be successful by a significant margin in terms of sensitivity, specificity, overall accuracy and F1-score.
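The four threshold metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of that calculation using the standard textbook definitions. It is not taken from the thesis: the function name `threshold_metrics`, the `positive` parameter and the toy labels are illustrative assumptions.

```python
def threshold_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity, accuracy and F1 for binary labels.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts relative to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 when a class is absent, to avoid division by zero.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),        # true positive rate
        "specificity": ratio(tn, tn + fp),        # true negative rate
        "accuracy": ratio(tp + tn, len(pairs)),   # overall accuracy
        "f1": ratio(2 * tp, 2 * tp + fp + fn),    # harmonic mean of precision and recall
    }


# Toy labels (made up for this example, not data from the thesis).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = threshold_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside overall accuracy matters when outliers or class imbalance are present, because a classifier can keep a high accuracy while misclassifying most of one class.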

URI

https://hdl.handle.net/11655/33392

Collections

Department of Statistics Thesis Collection [130]