Full Catalog

Digital Module 01: Reliability in Classical Test Theory
In this digital ITEMS module, Dr. Charlie Lewis and Dr. Michael Chajewski provide a two-part introduction to the topic of reliability from the perspective of classical test theory (CTT). Keywords: classical test theory, CTT, congeneric, KR-20, KR-21, Cronbach’s alpha, Pearson correlation, reliability, Spearman-Brown formula, parallel, tau-equivalent, test-retest, validity
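For orientation, the following minimal sketch (illustrative Python, not taken from the module; function and variable names are hypothetical) shows how Cronbach’s alpha and the Spearman-Brown prophecy formula can be computed from an examinees-by-items score matrix:

    import numpy as np

    def cronbach_alpha(scores):
        # scores: examinees x items matrix of item scores
        scores = np.asarray(scores, dtype=float)
        k = scores.shape[1]                          # number of items
        item_vars = scores.var(axis=0, ddof=1)       # per-item variances
        total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    def spearman_brown(reliability, length_factor):
        # projected reliability if the test is lengthened by length_factor
        return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)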
Digital Module 02: Scale Reliability in Structural Equation Modeling
In this digital ITEMS module, Dr. Greg Hancock and Dr. Ji An provide an overview of scale reliability from the perspective of structural equation modeling (SEM) and address some of the limitations of Cronbach’s α. Keywords: congeneric, Cronbach’s alpha, reliability, scale reliability, SEM, structural equation modeling, McDonald’s omega, model fit, parallel, tau-equivalent
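As a rough illustration of the SEM-based alternative, the sketch below computes McDonald’s omega from estimated loadings and error variances of a congeneric one-factor model (illustrative Python; assumes the factor variance is fixed to 1 and errors are uncorrelated; the function name is hypothetical):

    import numpy as np

    def mcdonald_omega(loadings, error_vars):
        # omega = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]
        lam = np.asarray(loadings, dtype=float)
        theta = np.asarray(error_vars, dtype=float)
        return lam.sum() ** 2 / (lam.sum() ** 2 + theta.sum())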
Digital Module 03: Nonparametric Item Response Theory
In this digital ITEMS module, Dr. Stefanie Wind introduces the framework of nonparametric item response theory (IRT), in particular Mokken scaling, which can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. Keywords: double monotonicity model, DMM, item response theory, IRT, Mokken scaling, monotone homogeneity model, multilevel modeling, mokken package, nonparametric IRT, R, rater effects
Digital Module 04: Diagnostic Measurement Checklists
In this digital ITEMS module, Dr. Natacha Carragher, Dr. Jonathan Templin, and colleagues provide a didactic overview of the specification, estimation, evaluation, and interpretation steps for diagnostic measurement / classification models (DCMs), centered around checklists for practitioners. A library of macros and supporting files for Excel, SAS, and Mplus is provided along with video tutorials for key practices. Keywords: attributes, checklists, diagnostic measurement, diagnostic classification models, DCM, Excel, log-linear cognitive diagnosis modeling framework, LCDM, Mplus, Q-matrix, model fit, SAS
Digital Module 05: The G-DINA Framework
In this digital ITEMS module, Dr. Wenchao Ma and Dr. Jimmy de la Torre introduce the G-DINA model, which is a general framework for specifying, estimating, and evaluating a wide variety of cognitive diagnosis models for the purpose of diagnostic measurement. Keywords: cognitive diagnosis models, CDM, diagnostic classification models, DCM, diagnostic measurement, GDINA, G-DINA framework, GDINA package, model fit, model comparison, Q-matrix, validation
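For orientation only, the sketch below implements the DINA model, one of the reduced models subsumed by the G-DINA framework (illustrative Python with a standard guess/slip parameterization; this is not the GDINA package itself):

    import numpy as np

    def dina_prob(alpha, q_row, guess, slip):
        # alpha: examinee attribute-mastery vector (0/1)
        # q_row: Q-matrix row for the item (0/1 required attributes)
        # examinees mastering all required attributes answer correctly unless they slip;
        # all other examinees can only guess
        eta = np.all(np.asarray(alpha)[np.asarray(q_row) == 1] == 1)
        return (1 - slip) if eta else guess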
Digital Module 06: Posterior Predictive Model Checking
In this digital ITEMS module, Dr. Allison Ames and Aaron Myers discuss posterior predictive model checking (PPMC), the most common Bayesian approach to evaluating model-data fit, in the context of simple linear regression and item response theory models. Keywords: Bayesian inference, simple linear regression, item response theory, IRT, model fit, posterior predictive model checking, PPMC, Bayes theorem, Yen’s Q3, item fit
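The basic PPMC recipe can be sketched in a few lines (illustrative Python for the simple linear regression case; it assumes posterior draws of the intercept, slope, and residual SD are already available from an MCMC run, and the names are hypothetical):

    import numpy as np

    def ppmc_pvalue(x, y, post_b0, post_b1, post_sigma, discrepancy=np.max):
        # for each posterior draw, simulate a replicated data set and compare a
        # discrepancy statistic for replicated vs. observed responses
        rng = np.random.default_rng(2024)
        exceed = 0
        for b0, b1, s in zip(post_b0, post_b1, post_sigma):
            y_rep = b0 + b1 * np.asarray(x) + rng.normal(0, s, size=len(x))
            exceed += discrepancy(y_rep) >= discrepancy(y)
        # posterior predictive p-value: values near 0 or 1 signal misfit
        return exceed / len(post_b0)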
Digital Module 07: Subscore Evaluation & Reporting
In this digital ITEMS module, Dr. Sandip Sinharay reviews the current state of subscore reporting, including how subscores are used in operational reporting, what professional standards they need to meet, and how their psychometric properties can be evaluated. Keywords: Diagnostic scores, disattenuation, DETECT, DIMTEST, factor analysis, multidimensional item response theory (MIRT), proportional reduction in mean squared error (PRMSE), reliability, subscores
Digital Module 08: Foundations of Operational Item Analysis
In this digital ITEMS module, Dr. Hanwook Yoo and Dr. Ronald K. Hambleton provide an accessible overview of operational item analysis approaches for dichotomously scored items within the frameworks of classical test theory and item response theory. Keywords: Classical test theory, CTT, corrections, difficulty, discrimination, distractors, item analysis, item response theory, operations, R Shiny, TAP, test development
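As a point of reference, the two workhorse CTT item statistics in this kind of analysis can be computed in a few lines (illustrative Python for 0/1 scored items; the corrected point-biserial excludes the item from the total score):

    import numpy as np

    def classical_item_stats(scores):
        # scores: examinees x items matrix of 0/1 item scores
        scores = np.asarray(scores, dtype=float)
        difficulty = scores.mean(axis=0)          # proportion correct per item
        total = scores.sum(axis=1)
        discrimination = np.array([
            np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
            for j in range(scores.shape[1])
        ])
        return difficulty, discrimination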
Digital Module 09: Sociocognitive Assessment for Diverse Populations
In this digital ITEMS module, Dr. Robert Mislevy and Dr. Maria Elena Oliveri introduce and illustrate a sociocognitive perspective on educational measurement, which focuses on a variety of design and implementation considerations for creating fair and valid assessments for learners from diverse populations with diverse sociocultural experiences. Keywords: assessment design, Bayesian statistics, cross-cultural assessment, diverse populations, educational measurement, evidence-centered design, fairness, international assessments, prototype, reliability, sociocognitive assessment, validity
Digital Module 10: Rasch Measurement Theory
In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales and demonstrate the estimation of core models with the Shiny_ERMA and Winsteps programs. Keywords: invariance, item fit, item response theory, IRT, person fit, model fit, multi-faceted Rasch model, objective measurement, R, Rasch measurement, Shiny_ERMA, Winsteps
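For context, the dichotomous Rasch model at the core of this framework can be written in one line (illustrative Python; theta is person ability and b is item difficulty, additive on the logit scale):

    import numpy as np

    def rasch_prob(theta, b):
        # probability of a correct response under the dichotomous Rasch model
        return 1.0 / (1.0 + np.exp(-(theta - b)))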
Digital Module 11: Bayesian Psychometrics
In this digital ITEMS module, Dr. Roy Levy discusses how Bayesian inference serves as a mechanism for reasoning in a probability-modeling framework, describes how this plays out for a normal distribution model and unidimensional item response theory (IRT) models, and illustrates these steps using the JAGS software and R. Keywords: Bayesian psychometrics, Bayes theorem, dichotomous data, item response theory (IRT), JAGS, Markov-chain Monte Carlo (MCMC) estimation, normal distribution, R, unidimensional models
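The flavor of this reasoning can be previewed with the simplest case, the normal distribution model. The sketch below (illustrative Python with conjugate updating and known data variance; it is not the JAGS/MCMC workflow the module uses) shows how prior and data precisions combine:

    import numpy as np

    def normal_mean_posterior(y, prior_mean, prior_var, data_var):
        # posterior precision = prior precision + n * data precision
        y = np.asarray(y, dtype=float)
        post_prec = 1 / prior_var + len(y) / data_var
        post_mean = (prior_mean / prior_var + y.sum() / data_var) / post_prec
        return post_mean, 1 / post_prec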
Digital Module 12: Think-aloud Interviews and Cognitive Labs
In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review the differences between think-aloud interviews, which measure problem-solving processes, and cognitive labs, which measure comprehension processes, and illustrate both traditional and modern data-collection methods. Keywords: ABC tool, cognitive laboratory, cog lab, cognition, cognitive model, interrater agreement, kappa, probe, rubric, thematic analysis, think-aloud interview, verbal report
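Since interrater agreement (kappa) figures among the keywords, here is a minimal illustration of Cohen’s kappa for two coders’ categorical codes (illustrative Python; the function name and inputs are hypothetical):

    import numpy as np

    def cohens_kappa(codes_a, codes_b):
        # chance-corrected agreement between two coders
        a, b = np.asarray(codes_a), np.asarray(codes_b)
        categories = np.union1d(a, b)
        p_obs = np.mean(a == b)
        p_exp = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
        return (p_obs - p_exp) / (1 - p_exp)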
Digital Module 13: Simulation Studies in IRT
In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of Monte Carlo simulation studies (MCSS) in item response theory (IRT). MCSS are used for a variety of reasons, one of the most compelling being that they allow researchers to specify and manipulate an array of parameter values and experimental conditions (e.g., sample size, test length, and test characteristics) when analytic solutions are impractical or nonexistent. Keywords: bias, bi-factor model, estimation, graded response model, item response theory, mean squared error, Monte Carlo, simulation, standard error, two-parameter logistic model
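A bare-bones version of one MCSS building block, data generation under a two-parameter logistic model along with the recovery summaries typically reported, might look like this (illustrative Python; the manipulated conditions here are sample size and test length):

    import numpy as np

    def simulate_2pl(n_persons, n_items, seed=1):
        # generate dichotomous responses under a 2PL model
        rng = np.random.default_rng(seed)
        theta = rng.normal(0, 1, n_persons)          # abilities
        a = rng.lognormal(0, 0.3, n_items)           # discriminations
        b = rng.normal(0, 1, n_items)                # difficulties
        p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
        data = (rng.random((n_persons, n_items)) < p).astype(int)
        return data, theta, a, b

    def bias_and_rmse(estimates, true_values):
        # parameter-recovery summaries aggregated across replications
        err = np.asarray(estimates) - np.asarray(true_values)
        return err.mean(), np.sqrt((err ** 2).mean())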
Digital Module 14: Planning and Conducting Standard Setting
In this digital ITEMS module, Dr. Michael B. Bunch provides an in-depth, step-by-step look at how standard setting is done. The module does not focus on any specific procedure or methodology (e.g., modified Angoff, bookmark, body of work) but rather on the practical tasks that must be completed for any standard-setting activity. Keywords: achievement level descriptor, certification and licensure, cut score, feedback, interquartile range, performance level descriptor, score reporting, standard setting, panelist, vertical articulation
Digital Module 15: Accessibility of Educational Assessments
In this digital ITEMS module, Dr. Ketterlin Geller and her colleagues provide an introduction to the accessibility of educational assessments. They discuss the legal basis for accessibility in K-12 and higher education organizations and describe how test and item design features, as well as examinee characteristics, affect the role that accessibility plays in evaluating test validity during test development and operational deployment. Keywords: Accessibility, accommodations, examinee characteristics, fairness, higher education, K-12 education, item design, legal guidelines, test development, universal design