Approaches for addressing the fit of item response theory models to educational test data
The study was carried out to accomplish three goals: (1) propose graphical displays of IRT model fit at the item level and suggest fit procedures at the test level that are not affected by large sample sizes, (2) examine the impact of IRT model misfit on proficiency classifications, and (3) investigate the consequences of model misfit in assessing academic growth.

The first goal focused on more and better graphical procedures for investigating model fit and misfit through residuals and standardized residuals at the item level. In addition, new graphical procedures and a non-parametric test statistic for investigating fit at the test score level were introduced and illustrated with examples. The statistical and graphical methods were applied to a realistic dataset from a high school assessment and results were reported; more important than the findings about actual fit were the procedures themselves, which were developed and evaluated.

In addressing the second goal, the practical consequences of IRT model misfit for performance classifications and test score precision were examined. With several of the data sets under investigation, test scores were noticeably less well recovered under the misfitting model, and there were practically significant differences in classification accuracy under the model that fit the data less well.

In addressing the third goal, the consequences of model misfit in assessing academic growth were examined in terms of test score precision, decision accuracy, and passing rates. The three-parameter logistic/graded response (3PL/GR) models produced more accurate estimates than the one-parameter logistic/partial credit (1PL/PC) models, and the fixed common item parameter (FCIP) method produced results closer to "truth" than linear equating using the mean and sigma transformation.

IRT model fit studies have not received the attention they deserve from testing agencies and practitioners. Although IRT models can almost never fit test data perfectly, there is substantial evidence that they provide an excellent framework for solving practical measurement problems. The importance of this study is that it provides ideas and methods for addressing model fit and, most importantly, highlights studies of the consequences of model misfit that can inform determinations about the suitability of particular IRT models.
Main Author: | Zhao, Yue |
---|---|
Language: | ENG |
Published: | ScholarWorks@UMass Amherst, 2008 |
Subjects: | Educational tests & measurements |
Online Access: | https://scholarworks.umass.edu/dissertations/AAI3337019 |
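The item-level displays described in the abstract are built from residuals and standardized residuals: observed proportions correct within ability groups are compared against the proportions the fitted model predicts. The Python sketch below is an illustrative reconstruction of that general procedure, not code from the dissertation; the 3PL parameterization, the scaling constant 1.7, the quantile grouping, and all parameter values are assumptions made here for illustration.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL item characteristic curve: P(correct | theta)."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

def standardized_residuals(theta, responses, a, b, c, n_groups=10):
    """Group examinees into ability intervals and compare observed vs.
    model-predicted proportions correct for one dichotomous item."""
    edges = np.quantile(theta, np.linspace(0.0, 1.0, n_groups + 1))
    group = np.digitize(theta, edges[1:-1])  # group index 0 .. n_groups-1
    mids, raw, std = [], [], []
    for g in range(n_groups):
        mask = group == g
        n = int(mask.sum())
        if n == 0:
            continue
        observed = responses[mask].mean()            # observed proportion correct
        expected = p_3pl(theta[mask], a, b, c).mean()  # model prediction
        se = np.sqrt(expected * (1.0 - expected) / n)  # binomial SE under the model
        mids.append(theta[mask].mean())
        raw.append(observed - expected)
        std.append((observed - expected) / se if se > 0 else 0.0)
    return np.array(mids), np.array(raw), np.array(std)

# Hypothetical demonstration with simulated data.
rng = np.random.default_rng(42)
theta = rng.normal(size=2000)
a, b, c = 1.2, 0.3, 0.2  # assumed item parameters
resp = (rng.random(2000) < p_3pl(theta, a, b, c)).astype(int)
mids, raw, std = standardized_residuals(theta, resp, a, b, c)
```

Plotting the standardized residuals against the group midpoints, with reference lines at ±2, gives the kind of item-level fit display the abstract describes; points well outside that band flag potential misfit.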
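At the test score level, one common graphical check, in the spirit of the procedures mentioned above but not necessarily the specific non-parametric statistic the dissertation introduces, is to compare the observed raw-score distribution with the distribution the fitted model implies. A minimal Monte Carlo sketch, assuming a fitted 3PL item bank and a standard normal ability distribution:

```python
import numpy as np

def p_3pl_matrix(theta, a, b, c):
    """P(correct) for every examinee (rows) x item (columns) pair."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta[:, None] - b)))

def model_implied_score_dist(a, b, c, n_sim=50_000, seed=0):
    """Monte Carlo raw-score distribution implied by the fitted model,
    assuming theta ~ N(0, 1); returns proportions for scores 0..n_items."""
    rng = np.random.default_rng(seed)
    a, b, c = map(np.asarray, (a, b, c))
    theta = rng.normal(size=n_sim)
    p = p_3pl_matrix(theta, a, b, c)
    scores = (rng.random(p.shape) < p).sum(axis=1)  # simulated raw scores
    return np.bincount(scores, minlength=a.size + 1) / n_sim

# Hypothetical 5-item form; the observed distribution would come from real data.
a = [1.0, 1.3, 0.8, 1.1, 0.9]
b = [-1.0, -0.3, 0.2, 0.8, 1.4]
c = [0.2, 0.15, 0.25, 0.2, 0.2]
predicted = model_implied_score_dist(a, b, c)
# A simple non-parametric discrepancy: max |observed - predicted| over scores.
```

Overlaying the predicted and observed score distributions, or tracking their maximum absolute difference, mirrors the test-level graphical and non-parametric checks the abstract refers to.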
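The second goal concerns how misfit propagates into proficiency classifications. The sketch below shows only the mechanics of that kind of comparison, with the cut scores, error structure, and parameter values all hypothetical: classify examinees once from "true" (criterion) scores and once from scores estimated under a candidate model, then tabulate agreement.

```python
import numpy as np

def classification_accuracy(true_scores, est_scores, cuts):
    """Proportion of examinees placed in the same performance category
    by true and estimated scores, given ordered cut scores."""
    true_cat = np.digitize(true_scores, cuts)
    est_cat = np.digitize(est_scores, cuts)
    return float(np.mean(true_cat == est_cat))

# Hypothetical: compare estimates from a well-fitting and a misfitting model.
rng = np.random.default_rng(7)
true_theta = rng.normal(size=5000)
est_good = true_theta + rng.normal(scale=0.25, size=5000)        # small error
est_poor = 0.8 * true_theta + rng.normal(scale=0.45, size=5000)  # biased + noisy
cuts = [-0.5, 0.5, 1.2]  # assumed cut scores defining four categories
acc_good = classification_accuracy(true_theta, est_good, cuts)
acc_poor = classification_accuracy(true_theta, est_poor, cuts)
```

In the dissertation's studies, differences of this kind between the fitting and misfitting models were found to be practically significant.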
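The third goal compares two linking approaches: fixed common item parameter (FCIP) calibration, which holds the common items' parameter estimates fixed at their base-scale values so that no transformation is needed, and linear equating with the mean and sigma transformation, which rescales one form's parameters onto the other's metric from the common items' difficulty estimates. A sketch of the mean/sigma transformation, with the difficulty values invented for illustration:

```python
import numpy as np

def mean_sigma_link(b_new, b_old):
    """Mean/sigma linking constants estimated from common-item difficulties.
    Returns (A, B) such that old_scale_value = A * new_scale_value + B."""
    b_new, b_old = np.asarray(b_new, float), np.asarray(b_old, float)
    A = b_old.std(ddof=1) / b_new.std(ddof=1)
    B = b_old.mean() - A * b_new.mean()
    return A, B

# Hypothetical common-item difficulty estimates on the two scales.
b_new = [-0.8, -0.1, 0.4, 1.1]
b_old = [-0.5, 0.2, 0.7, 1.5]
A, B = mean_sigma_link(b_new, b_old)

theta_new = 0.6               # an ability on the new form's scale
theta_old = A * theta_new + B  # the same ability on the base scale
# Difficulties transform the same way; discriminations transform as a_old = a_new / A.
```

Because the transformation depends only on the means and standard deviations of the common-item difficulties, it is sensitive to outlying estimates, which is consistent with the abstract's finding that FCIP tracked "truth" more closely in the growth studies.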