Fake Detection and Analysis in Tweets With Machine Learning Algorithms

Koca, Şehrinaz

dc.contributor.advisor	Çiçekli, İlyas
dc.contributor.author	Koca, Şehrinaz
dc.date.accessioned	2025-03-03T11:05:43Z
dc.date.issued	2025
dc.date.submitted	2025-01-17
dc.identifier.citation	Ş. Koca and İ. Çiçekli, Fake Detection and Analysis in Tweets with Machine Learning Algorithms	tr_TR
dc.identifier.uri	https://hdl.handle.net/11655/36634
dc.description.abstract	The widespread use of social media has raised concerns about whether shared content is accurate or misleading. Fake detection applications are used to identify fake Twitter (X) posts and fake news texts related to politics, disasters, and other adverse events. At its core, fake detection is a binary text classification task: a model trained on a training set assigns each text to one of two classes. In this thesis, we first apply six machine learning algorithms to six datasets and compare their performance on the fake detection task. The results are examined in detail by dataset and by algorithm. Some of the datasets are in English and one is in Turkish. The datasets cover a variety of fields and topics, including COVID-19, politics, the economy, earthquakes, and hurricanes. To evaluate the impact of text length, datasets containing both short tweets and longer news articles are chosen. Second, we analyze the outcomes of using different datasets for training and testing and identify which datasets perform well when combined. We then examine how the similarity or dissimilarity of the datasets affects the results by training on a combination of several datasets and testing on a single, different dataset. Finally, a Long Short-Term Memory (LSTM) model is developed based on studies of the layers and hyperparameters used in the LSTM algorithm.	tr_TR
dc.language.iso	en	tr_TR
dc.publisher	Fen Bilimleri Enstitüsü	tr_TR
dc.rights	info:eu-repo/semantics/openAccess	tr_TR
dc.subject	Natural Language Processing	tr_TR
dc.subject	Machine Learning	tr_TR
dc.subject	Classification	tr_TR
dc.subject	Fake Detection	tr_TR
dc.subject.lcsh	Computer engineering	tr_TR
dc.title	Fake Detection and Analysis in Tweets With Machine Learning Algorithms	tr_TR
dc.type	info:eu-repo/semantics/masterThesis	tr_TR
dc.description.ozet	The spread of social media applications and their use has increased concerns about whether the data in shared posts contains true or misleading information. Fake detection applications are used to detect fake Twitter posts and fake news texts about politics, disasters, or adverse events. The basis of fake detection algorithms is separating text into two classes using a model trained on a training set. In this thesis, we applied different machine learning algorithms to six different datasets within the fake detection classification task and compared their performance. The performance results of these algorithms are examined in detail in terms of datasets and algorithms. Some of the datasets are in English and one is in Turkish. The datasets cover a variety of fields and topics such as COVID-19, politics, the economy, earthquakes, and hurricanes. To evaluate the effect of text length, datasets consisting of both short tweets and longer news texts were chosen. Second, the results of using different datasets as training and test sets were examined, and the datasets that give good results when used together were identified. Next, the effect of the similarity or difference of the datasets on the results was evaluated by analyzing the results obtained when the training set is a combination of different datasets and the test set is a single, different dataset. Finally, the developed model is based on the Long Short-Term Memory (LSTM) algorithm and was built on studies of the layers and hyperparameters used in the LSTM algorithm.	tr_TR
dc.contributor.department	Computer Engineering	tr_TR
dc.embargo.terms	Open access	tr_TR
dc.embargo.lift	2025-03-03T11:05:43Z
dc.funding	None	tr_TR
dc.subtype	software	tr_TR
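
The abstract describes fake detection as binary text classification and evaluates models trained on one dataset and tested on another. A minimal sketch of that cross-dataset setup follows. It is not the thesis's implementation: the record does not name the six algorithms, so a small multinomial Naive Bayes classifier is assumed here for illustration, and the two toy datasets, the labels "fake" and "real", and all function names are invented for this example.

```python
import math
from collections import Counter

LABELS = ("fake", "real")  # assumed label names, not taken from the thesis


def tokenize(text):
    # Simplistic whitespace tokenizer; real pipelines would do more cleaning.
    return text.lower().split()


def train_naive_bayes(samples):
    """Fit word and document counts from a list of (text, label) pairs.

    Assumes both labels occur at least once in the training samples.
    """
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # words unseen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    correct = sum(predict(model, text) == label for text, label in samples)
    return correct / len(samples)


# Invented toy data standing in for two different datasets.
dataset_a = [
    ("miracle cure stops virus overnight", "fake"),
    ("health ministry reports new vaccine schedule", "real"),
    ("secret cure hidden by doctors", "fake"),
    ("ministry publishes official case numbers", "real"),
]
dataset_b = [
    ("miracle cure hidden from public", "fake"),
    ("official ministry reports updated numbers", "real"),
]

# Train on dataset A, test on the different dataset B.
model = train_naive_bayes(dataset_a)
print(f"train A -> test B accuracy: {accuracy(model, dataset_b):.2f}")
```

Training on a combination of datasets, as in the third analysis the abstract mentions, would amount to concatenating the sample lists before calling `train_naive_bayes`.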

Files in this item:

Name:: SehrinazKoca_YüksekLisans_Tezi ...
Size:: 2.440Mb
Format:: PDF
Description:: Master's Thesis File


This item appears in the following collection(s).

Computer Engineering Department Thesis Collection [267]
