Object-Based Classification of Agricultural Summer Crops from Sentinel-2 Satellite Images Using the Random Forest Algorithm
Abstract
In this thesis, the summer crops of the Gediz Plain were detected from multi-date Sentinel-2 satellite images acquired in 2017, using object-based classification with the random forest (RF) algorithm. The detected crops are pepper, corn, eggplant, wheat, tomato, cotton, grapes, clover and olive, all of which are widely grown in the Gediz Plain. The ground truth data were collected through fieldwork and from the farmer registration system (FRS). The FRS data were checked parcel by parcel and confirmed with the fieldwork. The RF classification was carried out on a per-segment basis.
Eight satellite images covering the study area were selected (April 10, May 3, June 2, July 2, August 1, September 7, October 10 and November 16). A Normalized Difference Vegetation Index (NDVI) band was generated from each of the selected images. Before classification, multiresolution image segmentation was carried out using the original bands (Blue, Green, Red and Near Infrared) and the NDVI bands. Then, for each image segment, the spectral features (mean and standard deviation) and the texture measures (homogeneity, difference, and entropy) were generated. Object-based RF classification was carried out using the original bands, NDVI bands and feature bands. Furthermore, the NDVI images of May, July, September, and October were identified as the four most important features using the variable importance function of the RF algorithm, and further classifications were carried out with different combinations of these four features.
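As an illustration of the NDVI and per-segment feature steps described above, the following is a minimal sketch in Python. It assumes the standard NDVI definition, (NIR - Red) / (NIR + Red), and a population standard deviation; the abstract does not state which variant was used. The function names, reflectance values and segment ids are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red).

    Returns 0.0 when both reflectances are zero to avoid division by zero.
    """
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def segment_stats(ndvi_values, segment_ids):
    """Group per-pixel NDVI by segment id and return {id: (mean, std)}.

    Assumption: population standard deviation (pstdev) is used here.
    """
    groups = {}
    for value, seg in zip(ndvi_values, segment_ids):
        groups.setdefault(seg, []).append(value)
    return {seg: (mean(vals), pstdev(vals)) for seg, vals in groups.items()}


# Hypothetical reflectances for four pixels in two segments (made-up values).
nir = [0.5, 0.6, 0.3, 0.3]
red = [0.1, 0.2, 0.3, 0.1]
segments = [1, 1, 2, 2]

values = [ndvi(n, r) for n, r in zip(nir, red)]
stats = segment_stats(values, segments)
for seg, (seg_mean, seg_std) in sorted(stats.items()):
    print(seg, round(seg_mean, 4), round(seg_std, 4))
```

In a real workflow the segment ids would come from the multiresolution segmentation, and the same grouping would be repeated for every band and date.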
According to the results, the texture and standard deviation bands did not increase the classification accuracy; on the contrary, they decreased it slightly. The feature combination with the highest K̂ (Khat, Kappa) value, 0.9365, was the original bands + NDVI bands. Classification with the four most important features yielded a K̂ value of 0.9156. The three most important features, the May, July, and September NDVI bands, were used in single-date classifications. For these dates, the K̂ values were 0.5865, 0.6349 and 0.5738, respectively. When the three most important features were classified in pairwise combinations, the K̂ values were 0.7678 for May-NDVI and July-NDVI, 0.8628 for May-NDVI and September-NDVI, and 0.8452 for July-NDVI and September-NDVI. Since the out-of-bag (OOB) data, which are used for verification in the RF algorithm, are selected randomly, the accuracy changes from one classification run to the next even if all other parameters are the same. Furthermore, the number of trees (ntree) and the number of random features (mtry), which are the parameters of the RF algorithm, affect the classification accuracy. To obtain the highest accuracy across all these variables, an application (Random Forest with the Highest Accuracy, RFHA) was written in R to increase the accuracy of the RF algorithm. Depending on the band combination, the increase in accuracy was up to 3%.
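For readers unfamiliar with the K̂ statistic reported above, the following minimal sketch shows how a Kappa estimate is commonly derived from a confusion matrix. It assumes the usual Cohen's Kappa formula, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the two-class matrix are hypothetical; the thesis results are not reproduced here.

```python
def kappa_hat(matrix):
    """Cohen's Kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    # Note: undefined (division by zero) when expected agreement equals 1.
    return (observed - expected) / (1 - expected)


# Made-up two-class confusion matrix, for illustration only.
confusion = [
    [45, 5],
    [10, 40],
]
print(round(kappa_hat(confusion), 4))  # 0.7
```

Because K̂ discounts agreement expected by chance, it is lower than the overall accuracy for the same matrix (0.85 in this made-up example).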