Approaches for addressing the fit of item response theory models to educational test data

Bibliographic Details
Main Author: Zhao, Yue
Language: ENG
Published: ScholarWorks@UMass Amherst 2008
Subjects: Educational tests & measurements
Online Access: https://scholarworks.umass.edu/dissertations/AAI3337019
Description:

The study was carried out to accomplish three goals: (1) propose graphical displays of IRT model fit at the item level and suggest fit procedures at the test level that are not affected by large sample sizes, (2) examine the impact of IRT model misfit on proficiency classifications, and (3) investigate the consequences of model misfit in assessing academic growth.

The main focus of the first goal was the use of more and better graphical procedures for investigating model fit and misfit through residuals and standardized residuals at the item level. In addition, new graphical procedures and a non-parametric test statistic for investigating fit at the test score level were introduced, and examples were provided. Statistical and graphical methods were applied to a realistic dataset from a high school assessment, and the results were reported; more important than the findings about the actual fit were the procedures that were developed and evaluated.

In addressing the second goal, the practical consequences of IRT model misfit for performance classifications and test score precision were examined. With several of the data sets under investigation, test scores were recovered noticeably less well by the misfitting model, and there were practically significant differences in classification accuracy under the model that fit the data less well.

In addressing the third goal, the consequences of model misfit in assessing academic growth were examined in terms of test score precision, decision accuracy, and passing rate. The three-parameter logistic/graded response (3PL/GR) models produced more accurate estimates than the one-parameter logistic/partial credit (1PL/PC) models, and the fixed common-item parameter method produced results closer to “truth” than linear equating using the mean and sigma transformation.

IRT model fit studies have not received the attention they deserve from testing agencies and practitioners. At the same time, IRT models can almost never fit test data perfectly, yet there is substantial evidence that they provide an excellent framework for solving practical measurement problems. The importance of this study is that it provides ideas and methods for addressing model fit and, most importantly, highlights studies of the consequences of model misfit for use in determining the suitability of particular IRT models.
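
The item-level residual analysis described above can be made concrete with a short sketch. The following is not the author's code; it is a minimal illustration of the standard raw and standardized residual approach for a single item, assuming a three-parameter logistic (3PL) model with hypothetical item parameters and simulated examinees:

    import numpy as np

    def p_3pl(theta, a, b, c):
        # 3PL model: probability of a correct response at ability theta
        return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

    # Hypothetical item parameters and simulated examinees (illustration only)
    rng = np.random.default_rng(0)
    a, b, c = 1.2, 0.3, 0.18
    theta = rng.normal(0.0, 1.0, 5000)                            # ability estimates
    u = (rng.random(5000) < p_3pl(theta, a, b, c)).astype(float)  # 0/1 item scores

    # Compare observed and model-predicted proportions within ability intervals
    edges = np.linspace(-3.0, 3.0, 13)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (theta >= lo) & (theta < hi)
        n = int(mask.sum())
        if n == 0:
            continue
        obs = u[mask].mean()                           # observed proportion correct
        pred = p_3pl(theta[mask], a, b, c).mean()      # model-predicted proportion
        se = np.sqrt(pred * (1.0 - pred) / n)          # standard error of the proportion
        print(f"[{lo:+.1f}, {hi:+.1f})  n={n:4d}  raw={obs - pred:+.3f}  "
              f"standardized={(obs - pred) / se:+.2f}")

Plotting the raw or standardized residuals against the ability intervals gives the kind of graphical display the abstract refers to; standardized residuals falling consistently beyond about +/-2 would flag item-level misfit.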
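
The mean and sigma transformation mentioned under the third goal can likewise be sketched. This is the generic mean-sigma linking method with hypothetical common-item difficulty values, not values or code from the study:

    import numpy as np

    def mean_sigma(b_old, b_new):
        # Mean-sigma linking: slope A and intercept B that place the old
        # ability scale onto the new one, estimated from the difficulty (b)
        # parameters of the common items
        b_old, b_new = np.asarray(b_old, float), np.asarray(b_new, float)
        A = b_new.std(ddof=1) / b_old.std(ddof=1)   # ratio of standard deviations
        B = b_new.mean() - A * b_old.mean()         # shift in means
        return A, B

    # Hypothetical common-item difficulties on two test forms (illustration only)
    b_x = [-1.2, -0.4, 0.1, 0.8, 1.5]
    b_y = [-1.0, -0.2, 0.3, 1.1, 1.8]
    A, B = mean_sigma(b_x, b_y)
    print(f"A = {A:.3f}, B = {B:.3f}")
    # Under theta* = A*theta + B: b* = A*b + B, a* = a/A, and c is unchanged
    print("form-X difficulties on the form-Y scale:",
          [round(A * b + B, 3) for b in b_x])

The fixed common-item parameter method, by contrast, holds the common items' estimates fixed at their previously calibrated values when the new form is calibrated, so no separate transformation step is needed.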