New Approach to Unsupervised Based Classification on Microarray Data

250d42b8-022a-47c7-bd5d-0f2b58fbad64.pdf (4.340Mb)
Date
2013
Author
Coşgun, Erdal
Abstract
Genetic studies have become an important part of medical research in recent years. They are essential for developing personalized treatment options and discovering new drugs. Most of this research has focused on obtaining gene expression data, and various methods have been developed to analyze such data. The main problem in the analysis is high dimensionality: expression levels are measured for thousands of genes in only a small number of individuals. Classical statistical methods cannot be applied to such data because the data do not satisfy their assumptions. For this reason, data mining methods are used instead. In the classical data mining approach, the dimension of the data is first reduced using Principal Component Analysis, Independent Component Analysis, or Factor Analysis, and then a classification, estimation, or clustering method is applied. This thesis proposes a solution to one of the shortcomings of this approach: the factors of the reduced data may be similar to one another. In this context, the dimension of the gene expression data was first reduced and factors were obtained, and these structures were then analyzed with Random Forest, one of the most widely used tree-based classification methods. The results were compared with those of the approach proposed in the thesis, in which the factor loadings obtained by dimension reduction are first clustered with the Kohonen Self-Organizing Map method and the clustered factors are then used in the Random Forest algorithm. One of the major advantages of the proposed approach is that 1000 sub-samples of the clustered factors, drawn by bootstrap sampling (with replacement), are sent to the Random Forest algorithm.
In this way, data that could not be factorized well were made more homogeneous by the cluster analysis, and the random selection criteria of the Random Forest method were further strengthened. The performance measures used to compare the approaches are the true classification rate, F-score, precision, and recall. The methods were applied to two types of data: 15 publicly available data sets from the Gene Expression Omnibus database and 18 artificial data sets created for specific scenarios. The proposed method improved the true classification rate, the most essential measure of comparison, by an average of 17.8% and 11.68% in data with 2 and 3 classes, respectively, and by an average of 14.5% in artificial data sets with 3 dimensions, 3 classes, and 50 individuals. Based on these findings, the proposed method improved classification performance especially in data with fewer subjects and classes. R-based software that makes all of these analyses easier to run was also developed within this thesis, so that researchers can carry out their own analyses.
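
The four performance measures named in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R software developed in the thesis. The function name `classification_report` is hypothetical, and macro-averaging across classes is an assumption, since the abstract does not say how the per-class values were combined.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return the true classification rate (accuracy) together with
    macro-averaged precision, recall and F-score.

    Macro-averaging is an assumption made for this sketch; the abstract
    does not state which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly assigned to its class
        else:
            fp[pred] += 1    # wrongly assigned to class `pred`
            fn[truth] += 1   # missed for class `truth`

    def ratio(num, den):
        # Convention for this sketch: an undefined ratio counts as 0.
        return num / den if den else 0.0

    precisions = [ratio(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [ratio(tp[c], tp[c] + fn[c]) for c in labels]
    f_scores = [ratio(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_score": sum(f_scores) / n,
    }


if __name__ == "__main__":
    # Toy labels, not data from the thesis.
    print(classification_report(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Computing all four measures from the same per-class counts keeps them mutually consistent, which matters when two classifiers are compared across many data sets.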
URI
http://hdl.handle.net/11655/1001
Collections
  • Temel Tıp Bilimleri Bölümü Tez Koleksiyonu [199]