Novel Statistical Approaches for Survival Analysis of RNA-Sequencing Data

Date
2024-04-17
Author
Cephe, Ahu
Embargo Period
Open access
Abstract
The number of people with cancer is increasing daily, and cancer mortality continues to rise because the biomarkers of many cancer types remain unknown. Moreover, cancer does not progress in the same way in every individual, and patients respond differently to the same treatment because of genetic differences. It is therefore very important to apply more effective treatments by making more accurate prognosis predictions with personalized medicine strategies. Estimating the survival of cancer patients from survival time data provides essential information. With the development of omics technologies, the relationship between survival time and patients’ gene expression profiles can now be modeled. In recent years, RNA-sequencing technology has been used for omics-based survival analysis because of its advantages. Despite these advantages, RNA-sequencing data differ from classical survival data in their high dimensionality, heterogeneity, and highly correlated genes. Because of these problems, regularized Cox methods and machine learning algorithms adapted to survival data are used instead of classical survival algorithms. However, regularized Cox methods are built on the Cox algorithm and therefore require its assumptions to be met. Machine learning algorithms that were originally created for classification problems and later adapted to survival data require additional time and effort. This study aims to develop new approaches for the survival analysis of RNA-sequencing data by combining the voom transformation, the stacking algorithm, and lasso methods with block structure. For this purpose, survival data are converted into binary classification data with the stacking algorithm. By using the sample weights obtained after the voom transformation in the priority-Lasso and IPF-Lasso algorithms, two new approaches are presented: voomStackPrio and voomStackIPF. The approaches were applied to 12 real RNA-sequencing datasets from the TCGA database.
Performance was compared with that of other survival algorithms in the literature using Harrell’s concordance index. The results showed that the two new approaches performed similarly to or better than the other survival algorithms.
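The conversion of survival data into binary classification data mentioned in the abstract can be illustrated with a small sketch. It assumes that the "stacking algorithm" follows the usual survival stacking idea: for every distinct event time, each subject still at risk contributes one row, labelled 1 if that subject's event occurs at that time and 0 otherwise. The function name `stack_survival`, the one-hot risk-set indicators, and the toy data are illustrative assumptions and are not taken from the thesis; the voom weighting and the lasso steps are not shown.

```python
from typing import List, Sequence, Tuple


def stack_survival(
    times: Sequence[float],
    events: Sequence[int],
    features: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Expand (time, event, features) records into a stacked binary dataset.

    Hypothetical helper: one row per (event time, subject at risk) pair.
    Each row holds the subject's features followed by a one-hot indicator
    of the risk set (event time) the row belongs to.
    """
    # Distinct times at which at least one event (not censoring) was observed.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})

    rows: List[List[float]] = []
    labels: List[int] = []
    for k, t in enumerate(event_times):
        indicator = [1.0 if j == k else 0.0 for j in range(len(event_times))]
        for ti, ei, xi in zip(times, events, features):
            if ti >= t:  # subject is still at risk at time t
                rows.append(list(xi) + indicator)
                # Label 1 only if this subject's event happens exactly at t.
                labels.append(1 if (ei == 1 and ti == t) else 0)
    return rows, labels


# Toy data: four subjects, one feature each; the second subject is censored.
times = [2.0, 3.0, 3.0, 5.0]
events = [1, 0, 1, 1]
features = [[0.1], [0.4], [0.9], [0.3]]

X, y = stack_survival(times, events, features)
```

Under these assumptions, any binary classifier that accepts observation weights could then be fitted to `X` and `y`. The number of positive labels always equals the number of observed events, which gives a quick sanity check on the expansion.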