A Comprehensive Analysis of Adversarial Attacks on Spam Filters
Date: 2024
Author: Hotoğlu, Esra
Abstract
Email spam filters help detect malware before it reaches the mailbox and are a vital part of cyber security. Machine learning-based spam detectors have proven to be useful and highly successful. With the advancement of artificial intelligence (AI), machine learning algorithms have become increasingly important, yet their robustness remains largely untested. Adversarial learning addresses this gap: it investigates the vulnerabilities of security systems that rely on machine learning and attempts to defeat their models with malicious input. In the context of machine learning, including Natural Language Processing (NLP), an adversarial attack is the deliberate manipulation of input data to cause errors or incorrect outputs from a model. This study investigates the feasibility of adversarial attacks against deep learning-based spam detectors. First, six prominent deep learning models are implemented, and attacks at three levels, namely character, word, and sentence, are analyzed in black-box settings. These attacks are evaluated on three real-world spam datasets. Moreover, novel scoring functions, including spam weights and attention weights, are introduced to improve attack effectiveness. Lastly, the impact of AI-generated spam emails on the deep learning spam detection models is investigated. This comprehensive analysis sheds light on the vulnerabilities of spam filters and contributes to efforts to improve their security against evolving adversarial threats.
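To illustrate the kind of character-level black-box attack the abstract describes, the sketch below perturbs the characters of tokens flagged as influential for a spam classifier. It is a minimal, hypothetical example, not the thesis's actual method: the `important_words` set stands in for a scoring function (such as the spam-weight or attention-weight scores mentioned above), and the swap operation is one of several common character-level perturbations.

```python
import random


def char_swap(word, rng):
    """Swap two adjacent interior characters of a word (length >= 3)."""
    if len(word) < 3:
        return word
    i = rng.randrange(1, len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def perturb_email(text, important_words, rng=None):
    """Apply a character swap only to words deemed important.

    `important_words` is a placeholder for a black-box scoring step that
    ranks which tokens most influence the spam classifier's decision.
    """
    rng = rng or random.Random(0)
    return " ".join(
        char_swap(w, rng) if w.lower() in important_words else w
        for w in text.split()
    )
```

Such perturbations leave the message readable to a human while shifting the token sequence the detector sees, which is why scoring functions that pick the most influential words matter for attack effectiveness.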