Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes. EdWorkingPaper No. 25-1173

Title:	Item-Level Heterogeneity in Value Added Models: Implications for Reliability, Cross-Study Comparability, and Effect Sizes. EdWorkingPaper No. 25-1173
Language:	English
Authors:	Joshua B. Gilbert (ORCID 0000-0003-3496-2710); Zachary Himmelsbach (ORCID 0000-0002-5444-0648); Luke W. Miratrix (ORCID 0000-0002-0078-1906); Andrew D. Ho (ORCID 0000-0003-1287-9844); Benjamin W. Domingue (ORCID 0000-0002-3894-9049); Annenberg Institute for School Reform at Brown University
Source:	Annenberg Institute for School Reform at Brown University. 2025.
Availability:	Annenberg Institute for School Reform at Brown University. Brown University Box 1985, Providence, RI 02912. Tel: 401-863-7990; Fax: 401-863-1290; e-mail: annenberg@brown.edu; Web site: https://annenberg.brown.edu/
Peer Reviewed:	N
Page Count:	56
Publication Date:	2025
Sponsoring Agency:	Institute of Education Sciences (ED)
Contract Number:	R305D240025
Document Type:	Reports - Research
Education Level:	Secondary Education
Descriptors:	Value Added Models; Reliability; Effect Size; Test Items; Generalizability Theory; Foreign Countries; Secondary School Students; Secondary School Teachers
Geographic Terms:	Tanzania
Abstract:	Value added models (VAMs) attempt to estimate the causal effects of teachers and schools on student test scores. We apply Generalizability Theory to show how estimated VA effects depend upon the selection of test items. Standard VAMs estimate causal effects on the items that are included on the test. Generalizability demands consideration of how estimates would differ had the test included alternative items. We introduce a model that estimates the magnitude of item-by-teacher/school variance accurately, revealing that standard VAMs overstate reliability and overestimate differences between units. Using a case study and 41 measures from 25 studies with item-level outcome data, we show how standard VAMs overstate reliability by an average of 0.12 on the 0-1 reliability scale (median = 0.09, SD = 0.13) and provide standard deviations of teacher/school effects that are on average 22% too large (median = 7%, SD = 41%). We discuss how imprecision due to heterogeneous VA effects across items attenuates effect sizes, obfuscates comparisons across studies, and causes instability over time. Our results suggest that accurate estimation and interpretation of VAMs requires item-level data, including qualitative data about how items represent the content domain.
Abstractor:	As Provided
IES Funded:	Yes
Entry Date:	2025
Accession Number:	ED674059
Database:	ERIC