MIWGAN-GP: Missing Data Imputation using Wasserstein Generative Adversarial Nets with Gradient Penalty

Uçgun Ergün, Ebru

dc.contributor.advisor	Özdemir, Suat
dc.contributor.author	Uçgun Ergün, Ebru
dc.date.accessioned	2022-10-20T10:52:37Z
dc.date.issued	2022
dc.date.submitted	2022-06-03
dc.identifier.citation	UÇGUN ERGÜN, E. (2022). MIWGAN-GP: Missing Data Imputation using Wasserstein Generative Adversarial Nets with Gradient Penalty, Hacettepe University	tr_TR
dc.identifier.uri	http://hdl.handle.net/11655/26969
dc.description.abstract	The success and dependability of IoT applications depend heavily on data quality. Because of hardware problems, synchronization challenges, inconsistent network connectivity, and manual system shutdowns, the data produced may be missing, erroneous, or noisy. Such missing or erroneous values also occur in health, military, and surveillance data and can cause significant errors in the mission systems that use these data. If a mission-critical system is used in the medical domain, such missing-data problems may affect human life. Hence, missing values should be imputed appropriately to avoid erroneous judgments in IoT healthcare systems and other critical systems. In this study, the Naive Bayes, K-Nearest Neighbors, Decision Tree, and XGBoost algorithms are applied to IoT health data to show in detail the effect of missing data on the outputs of machine learning algorithms. Following that, we compare different strategies for imputing missing data. The classification methods are compared both across missing-data percentages and across imputation methods. In this thesis, a new GAN-based approach is proposed to impute the missing data. The success of the proposed method is compared with that of classical imputation methods. Errors are measured with four different error metrics. In addition, the success of the proposed GAN-based model is demonstrated by applying different classification methods to the data set imputed with this method.	tr_TR
dc.language.iso	en	tr_TR
dc.publisher	Fen Bilimleri Enstitüsü	tr_TR
dc.rights	info:eu-repo/semantics/openAccess	tr_TR
dc.subject	Machine Learning	tr_TR
dc.subject	IoT	tr_TR
dc.subject	Generative Adversarial Networks	tr_TR
dc.subject	Missing Data	tr_TR
dc.subject	Missing Data Imputation	tr_TR
dc.subject	Deep Learning	tr_TR
dc.subject	GAN	tr_TR
dc.subject	Wasserstein GAN	tr_TR
dc.subject.lcsh	Computer engineering	tr_TR
dc.title	MIWGAN-GP: Missing Data Imputation using Wasserstein Generative Adversarial Nets with Gradient Penalty	tr_TR
dc.title.alternative	MIWGAN-GP: Eksik Verilerin Gradyan Cezalandırmalı Wasserstein Çekişmeli Sinir Ağları ile tamamlanması	tr_TR
dc.type	info:eu-repo/semantics/masterThesis	tr_TR
dc.description.ozet	The success and reliability of IoT applications depend largely on data quality. Because of hardware problems, synchronization difficulties, inconsistent network connectivity, and manual system shutdowns, the data produced may be missing, erroneous, and noisy. These missing or erroneous values can also occur in health, military, and surveillance data sets and can cause significant errors in the mission systems that use these data. If a mission-critical system is used in the medical domain, such missing-data problems may affect human life. Therefore, missing data should be imputed appropriately to avoid erroneous judgments in IoT healthcare systems and other critical systems. In this study, the Naive Bayes, K-Nearest Neighbors, Decision Tree, and XGBoost algorithms were applied to IoT health data to show the effects of missing data on machine learning algorithms. Following this, different strategies were applied to impute the missing data. The classification methods used were compared across both different missingness percentages and different imputation methods. In this thesis, a new GAN-based approach is proposed to complete the missing data. The success of the proposed method was compared with classical imputation methods. Error values were measured with four different error metrics. In addition, the success of the proposed GAN-based model is demonstrated by applying different classification methods to the data set filled in with this method.	tr_TR
dc.contributor.department	Computer Engineering	tr_TR
dc.embargo.terms	Open access	tr_TR
dc.embargo.lift	2022-10-20T10:52:37Z
dc.funding	None	tr_TR
dc.subtype	software	tr_TR

Files in this item:

Name:: 10472528.pdf
Size:: 1.443Mb
Format:: PDF
Description:: Ebru Uçgun Ergün Master Thesis

This item appears in the following collection(s).

Computer Engineering Department Thesis Collection [212]