ITEMS Portal
Digital Module 18: Automated Scoring
Recorded On: 12/04/2020
In this digital ITEMS module, Dr. Sue Lottridge, Amy Burkhardt, and Dr. Michelle Boyer provide an overview of automated scoring. Automated scoring is the use of computer algorithms to score unconstrained open-ended test items by mimicking human scoring. Its use is increasing in educational assessment programs because it allows scores to be returned more quickly and at lower cost than human scoring.

The module discusses automated scoring from a number of perspectives. First, the presenters discuss the benefits and weaknesses of automated scoring and what psychometricians should know about it. Next, they describe the overall process of automated scoring, moving from data collection to engine training to operational scoring. They then describe how automated scoring systems work, including the basic score-prediction functions as well as other flagging methods. Finally, they conclude with a discussion of the specific validity demands around automated scoring and how these align with the larger validity demands around test scores.

Two data activities are provided. The first is an interactive activity that allows the user to train and evaluate a simple automated scoring engine; the second is a worked example that examines the impact of rater error on test scores (simple sketches of both ideas appear below). The digital module contains a link to an interactive web application as well as its R Shiny code, diagnostic quiz questions, activities, curated resources, and a glossary.
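To make the first activity concrete, here is a minimal sketch in R (the language of the module's companion Shiny application) of the train-then-evaluate workflow: extract bag-of-words features from a handful of hypothetical responses, fit a simple regression "engine" to the human scores, and check human-machine agreement with quadratic weighted kappa. The responses, scores, and model choice are invented for illustration and are not the module's actual engine or data.

```r
# Toy illustration only: real engines are trained on thousands of
# double-scored responses and use far richer NLP features.

# Hypothetical training data: responses with human-assigned scores (0-2)
responses <- c(
  "plants use sunlight to make food",
  "the plant grows",
  "photosynthesis turns light into energy for the plant",
  "i dont know",
  "sunlight helps plants make energy through photosynthesis",
  "plants need water"
)
human <- c(2, 1, 2, 0, 2, 1)

# Feature extraction: a simple bag-of-words document-term matrix
tokenize <- function(x) strsplit(tolower(x), "[^a-z]+")[[1]]
vocab <- sort(unique(unlist(lapply(responses, tokenize))))
dtm <- t(sapply(responses, function(r) {
  as.numeric(table(factor(tokenize(r), levels = vocab)))
}))
colnames(dtm) <- vocab

# "Engine training": least-squares regression of human scores on features
fit <- lm(human ~ dtm)

# "Operational scoring": round predictions to the nearest valid score point
machine <- pmin(pmax(round(fitted(fit)), 0), 2)

# Evaluate human-machine agreement with quadratic weighted kappa (QWK)
qwk <- function(h, m, pts = 0:2) {
  O <- table(factor(h, pts), factor(m, pts))      # observed counts
  E <- outer(rowSums(O), colSums(O)) / sum(O)     # chance-expected counts
  W <- outer(pts, pts, function(a, b) (a - b)^2)  # quadratic weights
  1 - sum(W * O) / sum(W * E)
}
cat("QWK on the training sample:", round(qwk(human, machine), 3), "\n")
```

With so few responses the model fits its own training data almost perfectly; a real evaluation would compute QWK on a held-out validation set scored independently by humans.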
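In the same spirit, the core idea of the second activity, that rater error propagates into reported scores, can be sketched with a few lines of simulation. The score scale, sample size, and error magnitudes below are arbitrary choices for illustration, not the module's worked example.

```r
# Hypothetical simulation: random rater error of increasing magnitude
# attenuates the correlation between true and reported scores.
set.seed(42)
n <- 1000
true_score <- rnorm(n, mean = 500, sd = 100)  # simulated true scale scores

for (rater_sd in c(0, 25, 50, 100)) {
  reported <- true_score + rnorm(n, mean = 0, sd = rater_sd)
  cat(sprintf("rater error SD = %3d: cor(true, reported) = %.3f\n",
              rater_sd, cor(true_score, reported)))
}
```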
Key words: automated scoring, hand-scoring, machine learning, natural language processing, constructed-response items
Susan Lottridge
Cambium Assessment
Sue Lottridge, Ph.D., is Senior Director of Automated Scoring at Cambium Assessment, Inc. (CAI). In this role, she leads CAI's machine learning and scoring team in the research, development, and operation of the company's automated scoring software, which includes automated essay scoring, short-answer scoring, automated speech scoring, and an engine that detects disturbing content in student responses. Dr. Lottridge has worked in automated scoring for twelve years and has contributed to the design, research, and use of multiple automated scoring engines, including engines for equation scoring, essay scoring, short-answer scoring, alert detection, and dialogue systems. She earned her Ph.D. in Assessment and Measurement from James Madison University (2006) and holds master's degrees in Mathematics and in Computer Science from the University of Wisconsin-Madison (1997).
Contact Sue via susanlottridge@hotmail.com
Amy Burkhardt
University of Colorado Boulder
Amy Burkhardt is a Ph.D. candidate in Research and Evaluation Methodology, with an emphasis in Human Language Technology, at the University of Colorado Boulder. She has been involved in the development of two automated scoring systems. Her ongoing research projects include the automatic detection of students reporting harm within online tests, the use of machine learning to explore public discourse around educational policies, and considerations in psychometric modeling when making diagnostic inferences aligned to a learning progression.
Contact Amy via amy.burkhardt@colorado.edu
Michelle Boyer
Center for Assessment
Michelle Boyer, Ph.D., is a Senior Associate at the National Center for the Improvement of Educational Assessment, Inc. Dr. Boyer consults with states and organizations on issues such as assessment systems, the validity of score interpretations, scoring design and evaluation criteria for both human and automated scoring, assessment literacy, and score comparability. She is also a regular contributor to professional publications and to the annual conferences of AERA, NCME, and CCSSO. Her most recent research focuses on evaluating the quality of automated scoring and its impact on test score scales and test equating solutions. Dr. Boyer earned her Ph.D. in Research, Educational Measurement, and Psychometrics from the University of Massachusetts Amherst (2018).
Contact Michelle via mboyer@nciea.org