Comparison of the Classification Performance of Artificial Neural Network, Decision Tree, and Discriminant Analysis Methods on PISA 2012 Mathematics Achievement
Abstract
This study compares the performance of artificial neural network, decision tree, and discriminant analysis methods in classifying student achievement. The performance of each method is examined at different sample sizes and for different numbers of sub-groups. The participants are all students worldwide who took the PISA 2012 mathematics test; their mathematics test scores and questionnaire data are used to compare student achievement. The artificial neural network is built as a multilayer perceptron, the decision tree method is applied with the CHAID algorithm, and discriminant analysis is carried out as linear discriminant analysis.
For the six-sub-group classification, each PISA performance level forms its own sub-group: Level1, Level2, Level3, Level4, Level5, and Level6. For the three-sub-group classification, the sub-groups are Below the Average (Level1 and Level2), Average (Level3 and Level4), and Above the Average (Level5 and Level6). For the two-sub-group classification, students from Level1, Level2, and Level3 form the Below the Average group, and students from Level4, Level5, and Level6 form the Above the Average group.
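The sub-group definitions above amount to a small lookup from performance level to group label. The following is a minimal sketch of that mapping; the function name `subgroup` and the dictionary names are hypothetical and do not come from the study.

```python
# Hypothetical helper (not from the study): collapse PISA performance
# levels 1-6 into the six-, three-, and two-sub-group schemes.

THREE_GROUPS = {
    1: "Below the Average", 2: "Below the Average",
    3: "Average", 4: "Average",
    5: "Above the Average", 6: "Above the Average",
}

TWO_GROUPS = {
    1: "Below the Average", 2: "Below the Average", 3: "Below the Average",
    4: "Above the Average", 5: "Above the Average", 6: "Above the Average",
}


def subgroup(level: int, n_groups: int) -> str:
    """Return the sub-group label for a level under the chosen scheme."""
    if level not in TWO_GROUPS:
        raise ValueError(f"level must be 1-6, got {level!r}")
    if n_groups == 6:
        return f"Level{level}"  # each level is its own sub-group
    if n_groups == 3:
        return THREE_GROUPS[level]
    if n_groups == 2:
        return TWO_GROUPS[level]
    raise ValueError(f"n_groups must be 2, 3, or 6, got {n_groups!r}")
```

Note that Level3 falls in the Average group under the three-group scheme but in the Below the Average group under the two-group scheme, so the two schemes are not nested.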
The study shows that the artificial neural network performs best in large, medium, and small samples when classifying into six, three, and two sub-groups. In the very small sample with homogeneous variance-covariance matrices, discriminant analysis performs best. In the very small sample without homogeneous variance-covariance matrices, discriminant analysis performs best when classifying into six sub-groups, and the artificial neural network performs best when classifying into two and three sub-groups. The methods perform better in the very small sample with homogeneous variance-covariance matrices than in the very small sample without them. With respect to sample size, the performance of the decision tree method worsens as the sample gets smaller, whereas the performance of discriminant analysis improves. No such relationship is found for the artificial neural network method. Because the artificial neural network method divides the data into two groups, a training set and a test set, in each application, its performance differs from one application to the next under the same conditions. The findings suggest that, in artificial neural network applications, it is advisable to carry out as many trials as possible to achieve the highest performance. Large samples also yield better performance when the decision tree method is used. Finally, in discriminant analysis applications, working with a small sample and homogeneous variance-covariance matrices improves the method's performance twofold.
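The point about run-to-run variation can be illustrated with a toy experiment. The sketch below assumes the training/test split is drawn at random on each run; it uses synthetic data and a simple nearest-mean classifier as a stand-in for the neural network, so it shows only the effect of re-splitting, not the study's actual models or results. All names (`make_toy_data`, `split_and_score`) are hypothetical.

```python
import random
import statistics


def make_toy_data(n, rng):
    """Synthetic two-class data with one noisy feature (not PISA data)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label == 1 else 0.0, 1.0)
        data.append((x, label))
    return data


def split_and_score(data, rng, train_fraction=0.7):
    """Split at random, fit class means on the training set,
    and return the accuracy on the held-out test set."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    correct = 0
    for x, y in test:
        predicted = 1 if abs(x - mean1) < abs(x - mean0) else 0
        correct += predicted == y
    return correct / len(test)


# Same data, same method, 20 different random splits.
data = make_toy_data(200, random.Random(0))
scores = [split_and_score(data, random.Random(seed)) for seed in range(20)]
print("lowest:", min(scores))
print("highest:", max(scores))
print("mean:", statistics.mean(scores))
```

Reporting the spread across runs alongside the best run makes it clear how much of a reported score depends on the particular split.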