Why ability point estimates can be pointless: a primer on using skill measures from large-scale assessments in secondary analyses

Abstract: Measures of cognitive or socio-emotional skills from large-scale assessment surveys (LSAS) are often based on advanced statistical models and scoring techniques unfamiliar to applied researchers. Consequently, applied researchers working with data from LSAS may be uncertain about the assum...

Bibliographic Details
Main Authors:	Clemens M. Lechner, Nivedita Bhaktha, Katharina Groskurth, Matthias Bluemke
Format:	Article
Language:	English
Published:	BMC 2021-01-01
Series:	Measurement Instruments for the Social Sciences
Subjects:	Large-scale assessments; Measurement error; Test scores; Plausible values
Online Access:	https://doi.org/10.1186/s42409-020-00020-5

id	doaj-907514f612644f8389ec0221f088bf50
record_format	Article
spelling	doaj-907514f612644f8389ec0221f088bf50; 2021-01-24T12:15:06Z; eng; BMC; Measurement Instruments for the Social Sciences; 2523-8930; 2021-01-01; 3; 1; 1; 16; 10.1186/s42409-020-00020-5; Why ability point estimates can be pointless: a primer on using skill measures from large-scale assessments in secondary analyses; Clemens M. Lechner (0); Nivedita Bhaktha (1); Katharina Groskurth (2); Matthias Bluemke (3); Department of Survey Design and Methodology, GESIS – Leibniz Institute for the Social Sciences; Department of Survey Design and Methodology, GESIS – Leibniz Institute for the Social Sciences; Department of Survey Design and Methodology, GESIS – Leibniz Institute for the Social Sciences; Department of Survey Design and Methodology, GESIS – Leibniz Institute for the Social Sciences; Abstract: Measures of cognitive or socio-emotional skills from large-scale assessment surveys (LSAS) are often based on advanced statistical models and scoring techniques unfamiliar to applied researchers. Consequently, applied researchers working with data from LSAS may be uncertain about the assumptions and computational details of these statistical models and scoring techniques and about how to best incorporate the resulting skill measures in secondary analyses. The present paper is intended as a primer for applied researchers. After a brief introduction to the key properties of skill assessments, we give an overview of the three principal methods with which secondary analysts can incorporate skill measures from LSAS in their analyses: (1) as test scores (i.e., point estimates of individual ability), (2) through structural equation modeling (SEM), and (3) in the form of plausible values (PVs). We discuss the advantages and disadvantages of each method based on three criteria: fallibility (i.e., control for measurement error and unbiasedness), usability (i.e., ease of use in secondary analyses), and immutability (i.e., consistency of test scores, PVs, or measurement model parameters across different analyses and analysts). We show that although none of the methods are optimal under all criteria, methods that result in a single point estimate of each respondent’s ability (i.e., all types of “test scores”) are rarely optimal for research purposes. Instead, approaches that avoid or correct for measurement error, especially PV methodology, stand out as the method of choice. We conclude with practical recommendations for secondary analysts and data-producing organizations.; https://doi.org/10.1186/s42409-020-00020-5; Large-scale assessments; Measurement error; Test scores; Plausible values
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Clemens M. Lechner, Nivedita Bhaktha, Katharina Groskurth, Matthias Bluemke
title	Why ability point estimates can be pointless: a primer on using skill measures from large-scale assessments in secondary analyses
publisher	BMC
series	Measurement Instruments for the Social Sciences
issn	2523-8930
publishDate	2021-01-01
description	Abstract: Measures of cognitive or socio-emotional skills from large-scale assessment surveys (LSAS) are often based on advanced statistical models and scoring techniques unfamiliar to applied researchers. Consequently, applied researchers working with data from LSAS may be uncertain about the assumptions and computational details of these statistical models and scoring techniques and about how to best incorporate the resulting skill measures in secondary analyses. The present paper is intended as a primer for applied researchers. After a brief introduction to the key properties of skill assessments, we give an overview of the three principal methods with which secondary analysts can incorporate skill measures from LSAS in their analyses: (1) as test scores (i.e., point estimates of individual ability), (2) through structural equation modeling (SEM), and (3) in the form of plausible values (PVs). We discuss the advantages and disadvantages of each method based on three criteria: fallibility (i.e., control for measurement error and unbiasedness), usability (i.e., ease of use in secondary analyses), and immutability (i.e., consistency of test scores, PVs, or measurement model parameters across different analyses and analysts). We show that although none of the methods are optimal under all criteria, methods that result in a single point estimate of each respondent’s ability (i.e., all types of “test scores”) are rarely optimal for research purposes. Instead, approaches that avoid or correct for measurement error, especially PV methodology, stand out as the method of choice. We conclude with practical recommendations for secondary analysts and data-producing organizations.
topic	Large-scale assessments; Measurement error; Test scores; Plausible values
url	https://doi.org/10.1186/s42409-020-00020-5

Similar Items