Investigation of Imbalance Problem Effects
on Text Categorization

Naderalvojoud, Behzad

View/Open

c3db2b40-95c8-47dc-a02a-a2e8dc10e83e.pdf (2.827Mb)

Date

2015

Author

Naderalvojoud, Behzad

Abstract

Text classification is the task of assigning a document to one or more predefined categories based on an inductive model. In general, machine learning algorithms assume that datasets have a nearly uniform class distribution. However, when trained on imbalanced datasets, learning methods tend to produce classifiers that perform poorly on the minor categories. In multi-class classification, major categories are the classes with the largest number of documents, and minor categories are those with the smallest number. As a result, text classification can be strongly affected by the class imbalance problem. In this study, we tackle this problem using a category-based term weighting approach in combination with an adaptive framework and machine learning algorithms.

Link

http://hdl.handle.net/11655/2582

Collections

Ağaç İşleri Endüstri Mühendisliği Bölümü Tez Koleksiyonu (Department of Wood Products Industrial Engineering Thesis Collection) [23]