Automatic Antibiotic Susceptibility and Antibiogram Prediction With Machine Learning Methods
Abstract
Antimicrobial resistance (AMR) is a global health problem that poses a threat today and an even greater threat in the future. Since the discovery of the first antibiotics, pathogens have developed a variety of resistance mechanisms against them. Today, the technology for studying the mechanisms of AMR and their genomic basis is more capable than ever. This thesis surveys the use of genomic data combined with machine learning for predicting antibiotic resistance and proposes a multi-model approach, ASAP (Antibiotic Susceptibility and Antibiogram Prediction), for creating antibiograms. In this work, ten machine learning models (Convolutional Neural Network, Nearest Neighbor, Random Forest, XGBoost, CatBoost, Naive Bayes, Support Vector Machines, Light Gradient Boosting Machine, Gradient Boosting, and Logistic Regression) were tested, evaluated, and compared for their predictive capabilities. For data preprocessing, several feature extraction methods based on n-gram encoding were tested. The models were evaluated using accuracy, recall, precision, and F1 score. Experiments show that the models can predict the antibiotic resistance of a given pathogen sequence with up to 0.99 accuracy and a macro-average recall of 0.90 or higher. The best-performing model in this work was XGBoost, with 0.99 accuracy, and the least predictive model was Naive Bayes, with 0.89 accuracy. The proposed method aims to improve the current manual process of creating and maintaining antibiograms and to provide healthcare professionals with data that supports less empirical and less broad-spectrum antibiotic prescribing, reducing the treatment time and costs associated with AMR pathogens. This thesis shows that machine learning combined with gene sequencing can serve both as a supporting tool for healthcare practice and as a surveillance tool.
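As a rough illustration of two ideas mentioned above, the following minimal sketch shows one way n-gram encoding of a sequence and macro-average recall could be computed. It is not the implementation used in this thesis: the function names `ngram_counts` and `macro_recall`, the nucleotide alphabet `ACGT`, the n-gram length, and the toy labels `R` (resistant) and `S` (susceptible) are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import product


def ngram_counts(sequence, n=3, alphabet="ACGT"):
    """Count overlapping n-grams and return them as a fixed-length vector.

    The alphabet and n are illustrative assumptions, not values from the thesis.
    """
    # Fixed vocabulary: every possible n-gram over the alphabet, in a stable order.
    vocab = ["".join(p) for p in product(alphabet, repeat=n)]
    # Slide a window of width n over the sequence, one position at a time.
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    return [counts.get(gram, 0) for gram in vocab]


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


# Toy data (hypothetical): a short sequence and an imbalanced label set.
features = ngram_counts("ACGT", n=2)
y_true = ["R", "R", "S", "S", "S", "S"]
y_pred = ["R", "S", "S", "S", "S", "S"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = macro_recall(y_true, y_pred)
```

On this toy data the accuracy is 5/6 while the macro-average recall is 0.75, because the minority class is only half recovered. This is why macro-average recall is a useful companion to accuracy when resistant and susceptible samples are imbalanced.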