Equating Multidimensional Tests Using Unidimensional and Multidimensional Methods (Çok Boyutlu Testlerin Tek Boyutlu ve Çok Boyutlu Yöntemlere Göre Eşitlenmesi)
Date
2023
Author
Zor, Yaşar Mehmet
Embargo Period
Open access
Abstract
In this study, multidimensional and unidimensional scale transformation procedures were applied to multidimensional test forms under various conditions, with the aim of comparing the equating errors obtained from item and ability parameters. Equating error, measured as root mean square error (RMSE), served as the evaluation criterion. Two-dimensional, simple-structure test forms were generated in R. IRTPRO was used to estimate item and ability parameters, LinkMIRT for multidimensional scale transformation, and IRTEQ for unidimensional scale transformation. The study conditions were sample size (1000 and 2000), correlation between dimensions (0.1, 0.5, and 0.9), common item ratio (20% and 40%), difference in ability distribution between groups (0.05 and 0.5), and parameter estimation model (2PLM and 3PLM). Both types of scale transformation were carried out under the non-equivalent groups anchor test design, and the resulting RMSE values were compared. The statistical significance of differences in RMSE across conditions and their interactions was examined with multi-way analysis of variance and t-tests. RMSE values were lower when the sample size was 2000 and the common item ratio was 40%. Multidimensional scale transformation yielded lower RMSE values when the correlation between dimensions was 0.1, whereas unidimensional scale transformation yielded lower values when the correlation was 0.9. RMSE values were also lower when the difference in ability distribution between groups was small. Among the linking methods, the mean-mean method gave lower RMSE values in multidimensional scale transformation, and the Stocking-Lord method gave lower values in unidimensional scale transformation.
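The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the study's procedure: the study used R, IRTPRO, LinkMIRT, and IRTEQ, and the function names, the fully crossed design, and the item parameters below are assumptions made for illustration only. The sketch enumerates the listed conditions, computes RMSE, and derives unidimensional mean-mean linking coefficients A and B from common-item parameters, using the standard relations a_base = a_new / A and b_base = A * b_new + B.

```python
import itertools
import math
from statistics import mean

# Study conditions as listed in the abstract.
# Assumption: the conditions are fully crossed; the abstract does not say so.
CONDITIONS = {
    "sample_size": [1000, 2000],
    "dimension_correlation": [0.1, 0.5, 0.9],
    "common_item_ratio": [0.20, 0.40],
    "ability_difference": [0.05, 0.5],
    "model": ["2PLM", "3PLM"],
}


def design_cells(conditions):
    """Return one dict per combination of condition levels."""
    names = list(conditions)
    return [dict(zip(names, combo))
            for combo in itertools.product(*conditions.values())]


def rmse(estimates, true_values):
    """Root mean square error between estimated and true values."""
    squared_errors = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(mean(squared_errors))


def mean_mean_coefficients(a_new, b_new, a_base, b_base):
    """Mean-mean linking coefficients from common-item parameters.

    a_* are discriminations and b_* are difficulties of the common items,
    estimated separately on the new-form and base-form scales.
    """
    A = mean(a_new) / mean(a_base)
    B = mean(b_base) - A * mean(b_new)
    return A, B


def to_base_scale(a_new, b_new, A, B):
    """Place new-form item parameters on the base scale."""
    return [a / A for a in a_new], [A * b + B for b in b_new]


# Hypothetical common-item parameters (illustrative values, not study data).
a_base = [1.0, 1.2, 0.8]
b_base = [-0.5, 0.0, 0.5]
a_new = [0.8, 0.96, 0.64]
b_new = [-1.0, -0.375, 0.25]

A, B = mean_mean_coefficients(a_new, b_new, a_base, b_base)
a_linked, b_linked = to_base_scale(a_new, b_new, A, B)

print(len(design_cells(CONDITIONS)), "design cells")
print("A =", A, "B =", B)
print("RMSE of linked difficulties:", rmse(b_linked, b_base))
```

The illustrative new-form parameters were constructed from the base-form values with a slope of 0.8 and an intercept of 0.3, so the sketch should recover those coefficients up to floating-point rounding. With estimated rather than constructed parameters the linked values would differ from the base values, and the RMSE would quantify that equating error.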
Link
https://hdl.handle.net/11655/34068
Related items
Showing items related by title, author, creator, and subject.
- Examination of Multidimensional Structures According to the Unidimensional and Multidimensional Graded Response Model (Tek Boyutlu ve Çok Boyutlu Aşamalı Tepki Modeline Göre Çok Boyutlu Yapıların İncelenmesi)
  Demir, Elif Kübra (Institute of Educational Sciences, 2019)
  The aim of this study is to compare estimated item and ability parameters and the model-data fit indexes of multidimensional and polytomous item data based on unidimensional and multidimensional Graded Response Model (GRM). ...
- The Effect of Multidimensional Test Design and Calibration Methods on Multidimensional Computerized Adaptive Test Applications (Çok Boyutlu Test Deseninin ve Kalibrasyon Yöntemlerinin Çok Boyutlu Bireyselleştirilmiş Bilgisayar Uygulamalarına Etkisi)
  Özberk, Eren Halil (Institute of Educational Sciences, 2016)
  A test can be designed for many purposes, including the ranking of people along a continuum or providing diagnostic value about examinees. However, a very common problem that often arises is the reporting of diagnostic subscores ...
- Examination of Parameters Estimated When Semi-Mixed Multidimensional Structures Are Treated as Unidimensional (Yarı Karışık Yapılı Çok Boyutlu Yapıların Tek Boyutlu Olarak Ele Alınması Durumunda Kestirilen Parametrelerin İncelenmesi)
  Göçer Şahin, Sakine (Institute of Educational Sciences, 2016)
  This study investigates the errors resulting from unidimensional estimation of multidimensional semi-mixed structured tests. To this end, a research design was prepared to include various conditions such as test structure, ...