Data Simulation Using Unidimensional Item Response Theory

Yeşim Beril Soğuksu; Hatice Gürdil

doi:10.37609/akya.4116

Authors

Yeşim Beril Soğuksu

https://orcid.org/0009-0004-0870-4974

Hatice Gürdil

https://orcid.org/0000-0002-0079-3202

DOI: https://doi.org/10.37609/akya.4116.c6402

Abstract

This chapter covers data simulation within the framework of unidimensional Item Response Theory (IRT), combining theory with application. It presents the mathematical foundations of IRT models for dichotomously and polytomously scored items, and it explains the core assumptions of IRT, parameter estimation techniques, and the steps to follow in a Monte Carlo simulation study. The application section takes an example research question and shows, stage by stage, how an IRT-based Monte Carlo simulation study can be carried out in the R programming language. The stages are: building the simulation design for the research question, generating datasets under different test lengths and sample sizes, running simulation validity checks on the generated datasets, estimating item and ability parameters, computing bias and root mean square error (RMSE) for the parameter estimates, and visualizing the findings. The relevant R code accompanies each stage. The chapter thus gives researchers an applied methodological guide to IRT-based Monte Carlo simulation studies.
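The workflow summarized in the abstract (generate data, estimate, compute bias and RMSE across conditions) can be illustrated with a small sketch. It is not the chapter's code, which is written in R. It uses Python's standard library only, and everything in it is an assumption made for illustration: the two-parameter logistic (2PL) model, the sample sizes (100, 300), the test lengths (10, 30), the three replications, the parameter distributions, and the function names. To stay short, it treats item parameters as known and estimates only ability, using a grid-based expected a posteriori (EAP) estimator with a standard normal prior.

```python
import math
import random


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, rng):
    """Generate a persons-by-items matrix of 0/1 responses."""
    return [[1 if rng.random() < p_2pl(t, a, b) else 0 for (a, b) in items]
            for t in thetas]


def eap(responses, items, grid):
    """EAP ability estimate on a fixed grid with a N(0, 1) prior."""
    num = den = 0.0
    for q in grid:
        w = math.exp(-0.5 * q * q)  # unnormalised prior weight at grid point q
        for x, (a, b) in zip(responses, items):
            p = p_2pl(q, a, b)
            w *= p if x == 1 else 1.0 - p  # multiply in the item likelihood
        num += q * w
        den += w
    return num / den


def bias(estimates, truths):
    """Mean signed difference between estimates and true values."""
    return sum(e - t for e, t in zip(estimates, truths)) / len(truths)


def rmse(estimates, truths):
    """Root mean square error of the estimates."""
    return math.sqrt(
        sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))


def run_condition(n_persons, n_items, n_reps, seed):
    """Average bias and RMSE of EAP estimates over n_reps replications."""
    rng = random.Random(seed)
    grid = [-4 + 0.2 * k for k in range(41)]  # ability grid from -4 to 4
    biases, rmses = [], []
    for _ in range(n_reps):
        # Illustrative generating distributions (assumed, not from the chapter)
        items = [(rng.uniform(0.8, 2.0), rng.gauss(0, 1)) for _ in range(n_items)]
        thetas = [rng.gauss(0, 1) for _ in range(n_persons)]
        data = simulate_responses(thetas, items, rng)
        estimates = [eap(row, items, grid) for row in data]
        biases.append(bias(estimates, thetas))
        rmses.append(rmse(estimates, thetas))
    return sum(biases) / n_reps, sum(rmses) / n_reps


# Fully crossed design: sample size x test length
results = {(n, j): run_condition(n, j, n_reps=3, seed=2024)
           for n in (100, 300) for j in (10, 30)}

for (n, j), (avg_bias, avg_rmse) in sorted(results.items()):
    print(f"N={n:4d} items={j:3d} bias={avg_bias:+.3f} RMSE={avg_rmse:.3f}")
```

A real study would typically use many more replications and would also estimate the item parameters in each replication. Fixing the seed per condition keeps the runs reproducible.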
