Comparison of Parametric and Nonparametric Item Response Theory Models across Different Samples and Test Lengths
Abstract
This study examined, for polytomous data, how sample size, sample distribution, the number of items in the test, and the number of response categories per item affect estimates obtained with the Graded Response Model (GRM) under Parametric Item Response Theory (PIRT) and with the Monotone Homogeneity Model (MHM) under Nonparametric Item Response Theory (NPIRT). To this end, the research was conducted as a basic study in which 7200 simulation conditions were designed from these four variables. For each combination of sample size (N = 100, 250, 500, 1000), sample distribution (normal, -0.5 skewed, -1.0 skewed), number of items (10, 20, 40, 80), and number of response categories (3, 5, 7), the GRM and MHM estimates were examined by calculating, in turn, model-data fit, reliability values, item parameters, parameter errors, and bias values.
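As an illustration of the design described above, the following is a minimal sketch, not the study's actual code, that enumerates the crossed factor levels and draws graded responses under the GRM for normally distributed abilities. The function names, the distribution labels, and the item parameters in the usage note are hypothetical. Crossing the four listed factors gives 144 cells; the abstract does not state how this relates to the 7200 figure, so no assumption about that is built into the sketch. Skewed ability distributions are not generated here.

```python
import itertools
import math
import random

# Factor levels as listed in the abstract.
SAMPLE_SIZES = [100, 250, 500, 1000]
DISTRIBUTIONS = ["normal", "skew_-0.5", "skew_-1.0"]  # labels are illustrative
TEST_LENGTHS = [10, 20, 40, 80]
CATEGORY_COUNTS = [3, 5, 7]


def design_cells():
    """Return every combination of the four design factors."""
    return list(itertools.product(
        SAMPLE_SIZES, DISTRIBUTIONS, TEST_LENGTHS, CATEGORY_COUNTS))


def grm_response(theta, a, thresholds, rng):
    """Draw one graded response in 0..K-1 for ability theta.

    thresholds (b_1 < ... < b_{K-1}) must be sorted ascending.
    Under the GRM, P(X >= k) = 1 / (1 + exp(-a * (theta - b_k))).
    """
    u = rng.random()
    # The response equals the number of cumulative boundaries
    # whose probability exceeds the uniform draw.
    return sum(
        1 for b in thresholds
        if u < 1.0 / (1.0 + math.exp(-a * (theta - b)))
    )


def simulate_grm(n_persons, items, rng):
    """Simulate an n_persons x len(items) response matrix.

    items is a list of (a, thresholds) pairs; abilities are
    drawn from a standard normal distribution.
    """
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        data.append([grm_response(theta, a, bs, rng) for a, bs in items])
    return data


cells = design_cells()
print(len(cells))  # 4 * 3 * 4 * 3 = 144
```

For example, `simulate_grm(100, [(1.2, [-1.0, 0.0, 1.0])] * 10, random.Random(0))` would produce a 100 by 10 matrix of four-category responses for ten identical, made-up items.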
The results showed that the GRM model-data fit values were affected by increases in the variables and could not be interpreted on their own, which made comparing and generalizing those values difficult. Under MHM, model-data fit was practical to calculate and could be interpreted without reference to another value, which gave MHM an advantage over GRM. Because MHM gave results in small samples and short tests similar to those obtained in larger samples and longer tests, it had a wider range of application.
Reliability values were similar for both models. Increasing the sample size had little effect on reliability. In both models, reliability increased with both the number of items and the number of response categories. The reliability of the estimates decreased as the skewness of the distribution increased.
Regarding the item parameters, the correlation of the GRM parameter estimates with the true parameters increased as the number of items, the number of response categories, and sample size increased. The discrimination (a) parameters tended to decrease as the number of items, the number of response categories, and sample size increased. This pattern weakened as the skewness of the distribution increased. The variation in the threshold (b) parameters did not follow a clear pattern. The standard error, RMSE, and bias values of the parameters were higher in small samples and in conditions with fewer items. Error and bias increased as the skewness of the distribution increased. In MHM, the scalability (H) coefficient and difficulty (P) values did not change to a statistically significant degree as the number of items, the number of response categories, and sample size increased. In general, the values were close to one another. The standard errors of the parameters calculated for MHM were much lower than those for GRM in small-sample and short-test conditions and were close to one another across all conditions.
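The bias and RMSE values mentioned above compare estimated parameters with the true (generating) parameters. The following is a minimal sketch under the usual definitions, mean signed error for bias and root mean squared error for RMSE; the abstract does not give the exact formulas the study used, and the function names are hypothetical.

```python
import math


def _check(estimates, true_values):
    """Require two non-empty sequences of equal length."""
    if len(estimates) != len(true_values) or len(estimates) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def bias(estimates, true_values):
    """Mean signed difference between estimates and true values."""
    _check(estimates, true_values)
    diffs = [e - t for e, t in zip(estimates, true_values)]
    return sum(diffs) / len(diffs)


def rmse(estimates, true_values):
    """Root mean squared difference between estimates and true values."""
    _check(estimates, true_values)
    squared = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(sum(squared) / len(squared))
```

Bias near zero with a large RMSE would indicate estimates that scatter widely around the true values without leaning systematically in one direction.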
In conclusion, GRM parameter estimates were highly correlated with the true values, more reliable, and less error-prone when the distribution was normal and the sample size was at least 500. A test length of at least 20 items and at least 5 response categories were also found to be effective in obtaining good parameter estimates. Thus, when research conditions do not allow changes to the sample size, the number of items, or the number of response categories, MHM, which provides less error-prone and more stable estimates across all conditions, can be preferred.