Plankton Classification with Deep Learning
Abstract
Plankton, which sit at the bottom of the food chain in aquatic ecosystems and produce approximately half of the oxygen in the atmosphere, are among the most important components of life on Earth. Plankton distributions are regarded as an important early indicator of climate change such as global warming, so it is critical to monitor and analyze their distribution and to obtain information about water quality. With the development of plankton imaging technologies, large numbers of plankton images are now collected from aquatic ecosystems. Traditional manual classification requires specialist knowledge and is very time consuming, and it cannot keep up with the classification demands of ever-growing plankton datasets. Both the volume of data collected and the importance of knowing plankton distributions therefore steadily increase the need for automated classification systems. This thesis uses the dataset published for the National Data Science Bowl, a competition organized on the Kaggle platform with the aim of automatically detecting plankton. The dataset consists of 30,336 plankton images belonging to 121 classes; two versions of it were used, one with 118 classes and one with 38 classes. Deep convolutional neural networks, namely Inceptionv3, InceptionResNetv2, DenseNet, ResNet-50, and VGG-16, were trained to classify the plankton images using transfer learning, fine-tuning, and training of all layers, and the contribution of data augmentation and preprocessing to classification was examined. Among the models trained, the highest performance was 78% on the first version with 118 classes and 92% on the second version with 38 classes. Using the average ensemble method increased performance by 2% on the first version and by 1% on the second version of the dataset.
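
As an illustration of the average ensemble method mentioned above, the following is a minimal sketch in plain Python. It assumes that each trained network outputs a per-class probability vector for every image and that the ensemble prediction is the class with the highest mean probability across networks. The function names (`average_ensemble`, `predict_labels`) and the toy probabilities are hypothetical and are not taken from the thesis.

```python
def average_ensemble(model_probs):
    """Average per-class probabilities across models for each image.

    model_probs: list over models; each entry is a list over images of
    per-class probability lists (assumed layout, for illustration only).
    """
    n_models = len(model_probs)
    n_images = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    averaged = []
    for i in range(n_images):
        # Mean probability of each class over all models for image i.
        row = [
            sum(model[i][c] for model in model_probs) / n_models
            for c in range(n_classes)
        ]
        averaged.append(row)
    return averaged


def predict_labels(probs):
    """Return the index of the highest-probability class for each image."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Hypothetical outputs of three networks for two images and three classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]
model_c = [[0.5, 0.2, 0.3], [0.2, 0.2, 0.6]]

ensemble_probs = average_ensemble([model_a, model_b, model_c])
ensemble_labels = predict_labels(ensemble_probs)
print(ensemble_labels)
```

In this toy example, individual networks disagree on both images, and averaging the probabilities resolves each disagreement in favor of the class that has the highest mean probability across the networks.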