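The Effect of Generalization Methods on Regression Models in Microarray Gene Expression Studies (original title: Mikrodizilim Gen İfade Çalışmalarında Genelleştirme Yöntemlerinin Regresyon Modelleri Üzerine Etkisi)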

Yılmaz Işıkhan, Selen

dc.contributor.advisor	Reha Alpar, Celal	tr_TR
dc.contributor.author	Yılmaz Işıkhan, Selen	tr_TR
dc.date.accessioned	2015-10-14T10:37:02Z
dc.date.available	2015-10-14T10:37:02Z
dc.date.issued	2014	tr_TR
dc.identifier.uri	http://hdl.handle.net/11655/999
dc.description.abstract	In genetic research, having thousands of gene measurements for only a small number of patients causes problems for classical statistical methods (linear regression analysis, etc.). However, the simultaneous analysis of a large number of genes in microarray gene expression studies has recently become possible with data mining methods such as support vector machines, decision trees, and boosted trees. In this study, the prediction performance of these methods, which require no assumptions about the data structure and can model a large number of predictors, was examined on gene data. One of the basic steps in analyses of gene expression data is the generalization of the analysis models. If independent test data are not available, approaches such as resampling the original data should be used to estimate prediction accuracy. Another purpose of the study is to compare the effect of the bootstrap and cross-validation generalization methods on the model performance of support vector regression and regression trees. Two different Monte Carlo simulations were carried out to compare the performance of the regression models under the generalization methods. Overall, bootstrap gave more optimistic performance estimates than cross-validation. Model aggregating (ensemble) methods are tools used to improve the prediction performance of a given model-building technique. In this study, the performance of the bagging and boosting methods was also compared on the examined regression methods. Bagging improved the regression tree for datasets with at least 25 observations, i.e., n≥25. The application to real gene data showed results consistent with the simulation study.	tr_TR
dc.language.iso	tur	tr_TR
dc.publisher	Sağlık Bilimleri Enstitüsü	tr_TR
dc.subject	Support vector regression	tr_TR
dc.subject	Regression tree
dc.subject	Generalization methods
dc.subject	Microarray gene data
dc.subject	Prediction performance
dc.title	Mikrodizilim Gen İfade Çalışmalarında Genelleştirme Yöntemlerinin Regresyon Modelleri Üzerine Etkisi.	tr_TR
dc.type	info:eu-repo/semantics/doctoralThesis	tr_TR
dc.callno	2014/1210	tr_TR
dc.contributor.departmentold	Biyoistatistik Anabilim Dalı	tr_TR
dc.description.ozet	In genetic research, having thousands of gene measurements for a small number of patients causes problems in the use of classical statistical methods (linear regression, etc.). Recently, however, the simultaneous analysis of very large numbers of genes in microarray gene expression studies has become possible with the use of data mining methods such as support vector machines (SVM), decision trees, and boosted trees. In this study, the prediction performance of these methods, which require no assumptions about the data structure and can model a large number of predictors, was examined with gene data. One of the most fundamental steps in analyses of gene expression data is the generalization of the analysis models. If no independent test data are available, approaches such as resampling the original data should be used to determine prediction accuracy. The other aim of the study is to compare the effect of the generalization methods bootstrap, cross-validation, and leave-one-out on the model performance of support vector regression (SVR) and regression trees (RT). Two different Monte Carlo simulation studies were carried out to compare the performance of the regression models under the generalization methods. In general, bootstrap gave better performance than cross-validation. Model aggregating (ensemble) methods are the tools used to improve the prediction performance of a given model-building technique. The study also compared the performance of the bagging and boosting methods on the examined regression methods. Bagging improved RT for datasets with n≥25 observations. The application to real gene data showed results consistent with the simulation study.	tr_TR

Files in this item:

Name:: 9afcda0b-6781-4b23-b3f3-761005 ...
Size:: 2.675Mb
Format:: PDF

This item appears in the following collection(s).

Temel Tıp Bilimleri Bölümü Tez Koleksiyonu [324]
