Application of the DIF-free-then-DIF Strategy to the Logistic Regression Method in Assessment of Differential Item Functioning

Bibliographic Details
Main Authors: Hsin-Hao Chen, 陳信豪
Other Authors: Wen-Chung Wang
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/73384463600246175833
Description
Summary: Master's thesis === National Chung Cheng University === Graduate Institute of Psychology === 97 === The logistic regression (LR) method (French & Maller, 2007) is popular for assessing both uniform and nonuniform differential item functioning (DIF). To mitigate the effect of including DIF items in the matching variable, scale purification procedures have been developed. Unfortunately, these procedures are useful only when a test contains few DIF items; when there are many DIF items, they can lose control of the Type I error rate in DIF assessment. The LR method can be viewed as an example of the all-other-item (AOI) method for establishing a common metric across groups of examinees. By definition, AOI is correct only when all items other than the studied one are indeed DIF-free. In the DIF-free-then-DIF strategy (Wang, 2008), a set of items that are the least likely to have DIF is first identified and then treated as an anchor for assessing DIF in the other items. This strategy has been shown to be more appropriate than AOI, with better control of the Type I error rate and higher power, when tests contain many DIF items. The purpose of this study was to apply the DIF-free-then-DIF strategy to the standard LR method (denoted as ST), yielding the LR method with a pure anchor (PA), and to compare the performance of ST and PA with that of the LR method with scale purification (SP) in uniform DIF assessment of dichotomous items through a series of simulations. Six independent variables were manipulated: (a) DIF detection method; (b) test length; (c) sample size; (d) DIF magnitude; (e) percentage of DIF items in a test; and (f) mean ability difference between groups. The results showed that when there were many DIF items in the test, only PA yielded a well-controlled Type I error rate and high power. However, DIF detection method interacted with the ability difference between groups; that is, the choice of DIF detection method should take the ability difference into account.
This study suggests using Cohen's d to estimate the ability difference between groups.
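
The uniform-DIF test described in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that the anchor items are already known (the "DIF-free" selection step is not shown), that the matching variable is the sum score over the anchor items without the studied item, and that uniform DIF is tested with a 1-df likelihood-ratio test on the group term. All function names (`solve`, `fit_logistic`, `uniform_dif_test`, `cohens_d`, `simulate`) and the simulation settings are hypothetical and chosen only for illustration.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic regression; returns (coefficients, log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            # Clip the linear predictor to avoid overflow in exp().
            eta = max(-30.0, min(30.0, sum(b * v for b, v in zip(beta, xi))))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = max(-30.0, min(30.0, sum(b * v for b, v in zip(beta, xi))))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return beta, loglik

def uniform_dif_test(responses, group, item, anchor):
    """Likelihood-ratio test (1 df) of uniform DIF, matching on the anchor sum score."""
    score = [sum(r[a] for a in anchor) for r in responses]
    mean = sum(score) / len(score)  # centering only improves numerical conditioning
    y = [r[item] for r in responses]
    compact = [[1.0, s - mean] for s in score]
    augmented = [[1.0, s - mean, float(g)] for s, g in zip(score, group)]
    _, ll0 = fit_logistic(compact, y)
    beta, ll1 = fit_logistic(augmented, y)
    g2 = max(0.0, 2.0 * (ll1 - ll0))
    p_value = math.erfc(math.sqrt(g2 / 2.0))  # chi-square survival function, 1 df
    return beta[2], g2, p_value

def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation (sample variances use n - 1)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    return (mx - my) / math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))

def simulate(n_per_group=500, n_items=10, dif=1.0, seed=1):
    """Rasch-type data; item 0 is harder for the focal group (group 1) by `dif` logits."""
    rng = random.Random(seed)
    responses, group = [], []
    for g in (0, 1):
        for _ in range(n_per_group):
            theta = rng.gauss(0.0, 1.0)
            row = []
            for i in range(n_items):
                b = (i - n_items / 2) * 0.3 + (dif if (g == 1 and i == 0) else 0.0)
                row.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-(theta - b))) else 0)
            responses.append(row)
            group.append(g)
    return responses, group

if __name__ == "__main__":
    responses, group = simulate()
    coef, g2, p = uniform_dif_test(responses, group, item=0, anchor=[1, 2, 3, 4])
    print(f"group coefficient={coef:.2f}, G2={g2:.2f}, p={p:.4g}")
```

Passing every item index as `anchor` approximates matching on the total score, whereas passing only items believed to be DIF-free corresponds to the pure-anchor idea. Testing nonuniform DIF would additionally require a score-by-group interaction term, which this sketch omits.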