All-access Pass

This pass provides immediate access to ALL print and digital modules in the portal by registering you for each one and displaying them as a single collection.

 

  • Contains 8 Component(s) Recorded On: 04/03/2020

    In this digital ITEMS module, Dr. Brian Leventhal and Dr. Allison Ames provide an overview of Monte Carlo simulation studies (MCSS) in item response theory (IRT). MCSS are utilized for a variety of reasons, one of the most compelling being that they can be used when analytic solutions are impractical or nonexistent because they allow researchers to specify and manipulate an array of parameter values and experimental conditions (e.g., sample size, test length, and test characteristics). Dr. Leventhal and Dr. Ames review the conceptual foundation of MCSS in IRT and walk through the processes of simulating total scores as well as item responses using the two-parameter logistic, graded response, and bi-factor models. They provide guidance for how to implement MCSS using other item response models and best practices for efficient syntax and executing an MCSS. The digital module contains sample SAS code, diagnostic quiz questions, activities, curated resources, and a glossary.

    Key words: bias, bi-factor model, estimation, graded response model, item response theory, mean squared error, Monte Carlo, simulation, standard error, two-parameter logistic model
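
    The following is a minimal, hypothetical sketch (written in Python rather than the module's SAS) of a single Monte Carlo replication under the two-parameter logistic model; the sample size, test length, and generating parameter values are illustrative assumptions, not values taken from the module.

    import numpy as np

    rng = np.random.default_rng(2020)

    n_persons, n_items = 500, 20                 # example condition: sample size and test length
    theta = rng.normal(0.0, 1.0, n_persons)      # latent traits drawn from N(0, 1)
    a = rng.lognormal(0.0, 0.3, n_items)         # item discriminations
    b = rng.normal(0.0, 1.0, n_items)            # item difficulties

    # 2PL model: P(X = 1) = 1 / (1 + exp(-a * (theta - b)))
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    responses = (rng.uniform(size=p.shape) < p).astype(int)   # simulated 0/1 item responses

    # A full MCSS would re-estimate the item parameters from these data and summarize
    # recovery across replications (e.g., bias, standard error, mean squared error).
    print(responses.shape, responses.mean())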

    Brian Leventhal

    Assistant Professor

    Brian is an assistant professor in the Assessment and Measurement PhD program in the Department of Graduate Psychology at James Madison University as well as an assistant assessment specialist in the Center for Assessment and Research Studies at James Madison University. There, he teaches courses in quantitative methods, including a course on Simulation Studies in Psychometrics. Brian received his Ph.D. from the University of Pittsburgh. His research interests include multidimensional item response models that account for response styles, response process models, and classification errors in testing. Brian is passionate about teaching and providing professional development for graduate students and early-career practitioners. He has thoroughly enjoyed collaborating with Allison Ames and the Instructional Design Team to develop this module. 

    Contact Brian via leventbc@jmu.edu

    Allison J. Ames

    Assistant Professor

    Allison is an assistant professor in the Educational Statistics and Research Methods program in the Department of Rehabilitation, Human Resources and Communication Disorders, Research Methodology, and Counseling at the University of Arkansas. There, she teaches courses in educational statistics, including a course on Bayesian inference. Allison received her Ph.D. from the University of North Carolina at Greensboro. Her research interests include Bayesian item response theory, with an emphasis on prior specification; model-data fit; and models for response processes. Her research has been published in prominent peer-reviewed journals. She enjoyed collaborating on this project with a graduate student, senior faculty member, and the Instructional Design Team.
    Contact Allison via boykin@uark.edu

  • Contains 5 Component(s) Recorded On: 02/25/2022

    In this digital ITEMS module, Drs. Richard Feinberg, Carol Morrison, and Mark R. Raymond provide an overview of how credentialing testing programs operate and the special considerations that need to be made when unusual things occur. Formal graduate education in a measurement-related field provides a solid foundation for professionals who work on credentialing examinations. Those foundational skills are then expanded and refined over time as practitioners encounter complex and nuanced challenges that are not covered in textbooks or that go beyond the contexts textbooks describe. For instance, as most of us who work on operational testing programs are (sometimes) painfully aware, real data can be very messy. Unanticipated situations often arise that can create a range of problems, from threats to score validity to unexpected financial costs and even longer-term reputational damage. In practice, solutions for these unanticipated situations are not always straightforward, often requiring a compromise among psychometric best practices, business resources, and the needs of the customer. In this module, we discuss some of these unusual challenges that can occur in a credentialing program. First, we provide a high-level summary of the main components of the assessment lifecycle and the different roles within a testing organization. Next, we propose a framework for qualifying risk, along with various considerations and potential actions for managing these challenges. Lastly, we integrate this information by presenting a few scenarios that can occur in practice, which should help learners think through applicable team-based problem-solving and better align recommended actions from a psychometric perspective with the context and magnitude of the challenge.


    Keywords: Credential/Licensure Testing, Assessment Design, Assessment Challenges, Threats to Score Validity, Operational Psychometrics

    Richard Feinberg

    Senior Psychometrician

    National Board of Medical Examiners

    Contact Richard Feinberg via RFeinberg@nbme.org

    Carol Morrison

    Principal Psychometrician

    National Board of Medical Examiners

    Contact Carol Morrison: CMorrison@nbme.org

    Mark R. Raymond

    Director of Assessment Design and Delivery

    National Conference of Bar Examiners

    Contact Mark Raymond: MRaymond@ncbex.org

  • Contains 6 Component(s) Recorded On: 11/30/2021

    In this digital ITEMS module, Dr. Jodi M. Casabianca provides a primer on the hierarchical rater model (HRM) and the recent expansions to the model for analyzing raters and ratings of constructed responses. In the first part of the module, she establishes an understanding of the nature of constructed responses, the rating process, and the common rater errors, or rater effects, that originate from that process. In the second part of the module, she compares traditional measures for analyzing raters to item response theory (IRT) measures and discusses various IRT rater models. In the third section, she discusses longitudinal and multidimensional extensions and their foundations in different combinations of IRT and signal detection models. The module contains audio-narrated slides, quizzes with feedback, and additional resources such as a glossary and reference documents.

    Keywords: hierarchical rater model, item response theory, longitudinal models, rater effects, multidimensional models, signal detection models 

    Jodi M. Casabianca

    Educational Testing Service

    Jodi M. Casabianca is a measurement scientist at ETS. Jodi received a B.A. in both Statistics and Psychology (2001) and an M.S. in Applied and Mathematical Statistics (2004) from Rutgers University as well as an M.A. (2008) and a Ph.D. (2011) in Psychometrics and Quantitative Psychology from Fordham University. She was the 2009 recipient of the Harold Gulliksen Dissertation Research Fellowship awarded by ETS. Before joining ETS in 2016, she was an assistant professor in the University of Texas at Austin’s Department of Educational Psychology, where she taught graduate courses and led a research lab in the Quantitative Methods program. Dr. Casabianca was awarded a National Science Foundation grant in 2013 (as co-PI, with Professor Junker) to expand the HRM into a framework that is more flexible for different assessment scenarios and has provided two conference workshops on the HRM, in 2017 and 2018. Her current research focuses on statistical and psychometric modeling for constructed responses, with a focus on the evaluation of automated scoring models. Her research has been published in several notable peer-reviewed journals, including the Journal of Educational and Behavioral Statistics and the Journal of Educational Measurement.

  • Contains 5 Component(s)

    In this digital ITEMS module, Dr. Katherine Reynolds and Dr. Sebastian Moncaleano discuss content alignment, its role in standards-based educational assessment, and popular methods for conducting alignment studies. The module begins with an overview of content alignment and its relationship to validity before discussing common alignment study methods. Each method’s conceptualization of alignment is discussed, along with more practical issues such as the materials and participants required and steps for carrying out each method. The final component of this module includes a deeper exploration of the Webb method, one of the most popular content alignment study approaches. This includes a detailed and practical discussion of the steps followed in a Webb alignment study, as well as how to interpret the various alignment indices provided in the Webb framework.

    Key words: Achieve methodology, content alignment, content area standards, content validity, standards-based assessment, Surveys of Enacted Curriculum, Webb methodology

    Katherine A. Reynolds

    Assistant Research Director

    Boston College

    Katherine completed her Ph.D. in Measurement, Evaluation, Statistics, and Assessment in 2020 at Boston College. Her work focused on scale development and other applications of psychometric research. She has taught graduate courses in research methods and undergraduate courses in the foundations of education. She has worked on several alignment-related projects, including facilitating panels for Webb alignment studies.

    Contact Katherine via katherir@bc.edu

    Sebastian Moncaleano

    Senior Research Specialist

    Boston College

    Sebastian completed his Ph.D. in Measurement, Evaluation, Statistics, and Assessment at Boston College in 2021, where he focused his doctoral research on the value of technology-enhanced items in computer-based educational assessments. He has taught graduate courses in introductory statistics and assessment development. He has also conducted content alignment studies for the Massachusetts Department of Elementary and Secondary Education.

    Contact Sebastian via moncalea@bc.edu

  • Contains 6 Component(s) Recorded On: 11/15/2021

    In this digital ITEMS module, Dr. Hong Jiao and Dr. Manqian Liao describe testlet response theory for the construction and evaluation of new measures and scales. They start with an introduction to the need for testlet models when local item dependence is present and then introduce the basic testlet response models proposed from different theoretical frameworks built upon standard item response theory models. Furthermore, they introduce different methods for model parameter estimation and related software programs that implement them. Finally, they showcase further extensions of the testlet response models for more complex local item dependence issues in innovative assessment. The module is designed for students, researchers, and data scientists in disciplines such as psychology, sociology, education, business, health, and other social sciences who are developing testlet-based assessments. It contains audio-narrated slides, sample data, syntax files, diagnostic quiz questions, data-based activities, curated resources, and a glossary.

    Key words: Bayesian estimation, innovative assessment, item response theory, local item dependence, multi-part items, paired passages, testlets, testlet response theory
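
    As a point of reference, below is a minimal Python sketch of data generation under a two-parameter logistic testlet model (in the spirit of Bradlow, Wainer, and Wang, 1999), in which a person-by-testlet effect gamma induces local dependence among items sharing a testlet (e.g., a reading passage). All parameter values are illustrative assumptions, not values from the module.

    import numpy as np

    rng = np.random.default_rng(7)

    n_persons, n_testlets, items_per_testlet = 1000, 5, 4
    n_items = n_testlets * items_per_testlet
    testlet_of_item = np.repeat(np.arange(n_testlets), items_per_testlet)  # item-to-testlet map

    theta = rng.normal(0.0, 1.0, n_persons)                  # person ability
    a = rng.lognormal(0.0, 0.3, n_items)                     # item discrimination
    b = rng.normal(0.0, 1.0, n_items)                        # item difficulty
    gamma = rng.normal(0.0, 0.5, (n_persons, n_testlets))    # person-by-testlet effects

    # Items in the same testlet share a gamma column, creating residual (local)
    # dependence beyond what theta alone explains.
    logit = a * (theta[:, None] - b - gamma[:, testlet_of_item])
    responses = (rng.uniform(size=logit.shape) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
    print(responses.shape)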

    Hong Jiao

    Professor

    University of Maryland, College Park

    Hong Jiao is currently a professor at the University of Maryland (UMD), College Park, specializing in educational measurement and psychometrics in large-scale assessment. She received her doctoral degree from Florida State University. Prior to joining the faculty in Measurement, Statistics, and Evaluation at UMD, she worked as a psychometrician at Harcourt Assessment on different state assessment programs. Her methodological research aims to improve practice in educational and psychological assessment and to develop solutions to emerging psychometric challenges, many of which arise from the use of more complex, innovative assessment formats. Two areas of her research are local dependence due to the use of testlets and Bayesian model parameter estimation. Her work has been recognized by a national award and includes numerous edited books, book chapters, refereed journal papers, national and international invited and refereed presentations, and research grants and contracts on which she serves as PI or co-PI. Hong Jiao proposed a multilevel testlet model for mixed-format tests that won the 2014 Bradley Hanson Award for Contributions to Educational Measurement from the National Council on Measurement in Education.

    Contact Hong at hjiao@umd.edu

    Manqian Liao

    Psychometrician

    Duolingo

    Manqian Liao is a psychometrician at Duolingo, where she conducts validity and fairness research on the Duolingo English Test. Among other things, she has worked on investigating differential item functioning in Duolingo English Test items and evaluating the efficiency of the item selection algorithm. Manqian received her Ph.D. in Measurement, Statistics and Evaluation from the University of Maryland, College Park. Her research interests focus on item response theory (IRT) and diagnostic classification models (DCM). Her dissertation is on modeling multiple problem-solving strategies and strategy shifts with DCMs.

    Contact Manqian at mancyliao@gmail.com

  • Contains 6 Component(s) Recorded On: 06/14/2021

    In this digital ITEMS module, Dr. Jade Caines Lee provides an opportunity for learners to gain introductory-level knowledge of educational assessment. The module’s framework will allow K-12 teachers, school building leaders, and district-level administrators to build “literacy” in three key assessment areas: measurement, testing, and data. The module will also give learners an opportunity to apply their new assessment knowledge in scenario-based exercises. Through consumption of narrated slides, real-life application exercises, and a robust list of resources, educational practitioners and leaders will have a more nuanced understanding of key assessment topics, as well as a deeper appreciation for the application of educational assessment.  

    Key Words: assessment literacy, classroom assessment, data, educational measurement, formative assessment, K-12 education, public schooling, reliability, summative assessment, validity

    Jade Caines Lee

    Assistant Professor

    Clark Atlanta University

    Jade Caines Lee is an Assistant Professor in the Department of Educational Leadership at Clark Atlanta University. She began her career as an elementary and secondary public school teacher in metropolitan Atlanta and New York City. She primarily taught middle school English for almost a decade and has also worked with graduate students at the university level, teaching educational statistics, educational assessment, and educational evaluation courses for over 8 years. Dr. Lee has also worked in various applied research contexts, including the National Center for Research on Evaluation, Standards, & Student Testing at UCLA and the Board of Regents of the University System of Georgia. She received her undergraduate degree in Urban Education from Stanford University, her master's degree from Brooklyn College, and her doctorate in Educational Studies from Emory University. Dr. Lee’s scholarly interests stem from her experiences as an urban, public school educator. She had enduring questions related to the validity and fairness of instruments, especially when used in high-stakes contexts. In terms of classroom assessments, she struggled to make sense of how to create valid and reliable items and tasks that could lead to feedback on the effectiveness of her teaching. This sparked her interest in assessment and evaluation literacy and has permeated her scholarly endeavors ever since.

    Contact Jade at jade.caines@gmail.com

  • Contains 6 Component(s) Recorded On: 03/13/2021

    In this digital ITEMS module, Dr. Terry Ackerman and Dr. Qing Xie cover the underlying theory and application of multidimensional item response theory models from a visual perspective. They begin the module with a brief review of how to interpret evidence of dimensionality of test data. They then examine the basic characteristics of unidimensional IRT models and how these concepts change when the IRT model is expanded to two dimensions. This leads to a more in-depth discussion of how unidimensional item characteristic curves change in two-dimensional models and can be represented as a surface, as a contour plot, or collectively as a set of vectors. They then expand this to the test level, where test characteristic curves become test characteristic surfaces with accompanying contours. They include additional discussions on how to compute information and represent it in terms of “clamshell”, number, or centroid plots. The module includes audio-narrated slides as well as the usual package of curated resources, a glossary, data activities, and quiz questions with diagnostic feedback.

    Keywords: centroid plot, clamshell plot, contour plot, item information curve, item information surface, multidimensional item response theory, MIRT, response surface, RShiny, test characteristic curve, test characteristic surface, vector
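
    To make the surface-and-contour idea concrete, here is a minimal Python sketch for a single compensatory two-dimensional 2PL item, P(X = 1 | theta1, theta2) = 1 / (1 + exp(-(a1*theta1 + a2*theta2 + d))); the item parameters below are hypothetical, and the module's own interactive visualizations are built in RShiny rather than this code.

    import numpy as np
    import matplotlib.pyplot as plt

    a1, a2, d = 1.2, 0.6, -0.5                                  # hypothetical item parameters
    theta1, theta2 = np.meshgrid(np.linspace(-3, 3, 61), np.linspace(-3, 3, 61))
    p = 1.0 / (1.0 + np.exp(-(a1 * theta1 + a2 * theta2 + d)))  # item response surface

    # Equal-probability contours of a compensatory item are straight lines, and the
    # item measures best along the direction of its discrimination vector (a1, a2).
    fig, ax = plt.subplots()
    cs = ax.contour(theta1, theta2, p, levels=np.arange(0.1, 1.0, 0.1))
    ax.clabel(cs)
    ax.quiver(0, 0, a1, a2, angles="xy", scale_units="xy", scale=1)  # item vector
    ax.set_xlabel("theta 1")
    ax.set_ylabel("theta 2")
    plt.show()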

    Terry Ackerman

    Distinguished Visiting Professor

    University of Iowa

    After receiving his Ph.D., Dr. Ackerman worked for five years as a psychometrician at ACT, where one of his primary responsibilities involved working on an Office of Naval Research grant that focused on multidimensional item response theory (MIRT) and its potential applications to standardized tests. Given his strong interest and passion for teaching, he then left ACT for the University of Illinois at Urbana-Champaign, where he taught graduate courses in statistics and testing and measurement. During that time, Dr. Ackerman was fortunate to have great colleagues and mentors at Illinois, including Dr. William Stout, Dr. Rod McDonald, and Dr. Larry Hubert. His research continued to focus on MIRT and expanded to include differential item and test functioning. After 10 years at Illinois he moved to the University of North Carolina at Greensboro where, as a department chair, he helped build a strong program in educational testing and measurement and developed a strong internship program. He chaired and sat on technical advisory committees for several testing organizations, including the American Institute of Certified Public Accountants (AICPA), Measured Progress, the College Board, ETS, and the Defense Advisory Committee, which oversaw testing for the U.S. military. During his 17 years as a professor, chair, and associate dean, Dr. Ackerman was elected to serve as president of the National Council on Measurement in Education (NCME) and the Psychometric Society. In 2016, he briefly returned to ACT as the company’s first Lindquist Chair before moving to the University of Iowa as a Distinguished Visiting Professor.

    Contact Terry via terry-ackerman@uiowa.edu

    Qing Xie

    U.S. Food and Drug Administration

    Qing Xie received her Ph.D. in Educational Measurement and Statistics and M.S. in Statistics from the University of Iowa. During her graduate study, she worked as a Research Assistant (RA) in the Psychometric Research Department at ACT for about five years, where she provided psychometric and statistical support on operational tasks for large-scale assessments and worked both independently and collaboratively on multiple quantitative research projects. Later, she was an RA at the Iowa Testing Programs, University of Iowa, and an Associate Psychometrician at ETS. She now works at the Center for Drug Evaluation and Research at the U.S. Food and Drug Administration.

    Contact Qing via quing2xie@gmail.com

  • Contains 5 Component(s) Recorded On: 03/04/2021

    In this digital ITEMS module, Dr. Chad Gotch walks through different forms of assessment, from everyday actions that are almost invisible to high-profile, annual, large-scale tests, with an eye toward educational decision-making. At each stage, he illustrates the form of assessment with real-life examples, pairs it with ideal types of instructional or programmatic decisions, and notes common mismatches between certain decisions and forms of assessment. Teachers, administrators, and policymakers who complete the module will build a foundation to use assessment appropriately and effectively for the benefit of student learning. By going through the module, they will appreciate how assessment, when done well, empowers students and educators and, when done poorly, undermines foundational educational goals and sows anxiety and discord. The module contains audio-narrated slides, interactive exercises with illustrative videos, and a curated set of resources.

    Keywords: assessment literacy, classroom assessment, decision-making, formative assessment, in-the-moment assessment, interim assessment, large-scale assessment, major milepost, periodic check-in, unit test

    Chad Gotch

    Washington State University

    Chad Gotch is an Assistant Professor in the Educational Psychology program at Washington State University. His first experiences as a teacher came through environmental and informal science education for children during his undergraduate and Master’s degrees. After working in program assessment within university administration, Chad moved into the world of educational measurement and student assessment. Currently, he works to maximize appropriate and effective use of educational assessment. To this end, he studies the development of assessment literacy among both pre-service and in-service teachers, the communication of assessment results (e.g., score reporting), and the construction of validity arguments from both technical and non-technical evidence. These complementary lines of research inform the life cycle of assessment, from development to use and policy.

    Chad previously partnered with NCME on the development of the Fundamentals of Classroom Assessment video for NCME (https://vimeo.com/212410753). At the university level, he teaches courses in educational statistics, educational measurement, and classroom assessment. Chad has served in an advisory role with the Washington Office of Superintendent of Public Instruction’s consolidated plan submission for the Every Student Succeeds Act (ESSA) and as a consultant with the Oregon Department of Education in its teacher and administrator education efforts. He has worked with K-16 educators through both workshops and one-on-one consultation on various aspects of student assessment, and is the lead author of the chapter “Preparing Pre-Service Teachers for Assessment of, for, and as Learning” in the forthcoming book Teaching on Assessment from Information Age Publishing.

    Contact Chad via cgotch@wsu.edu

  • Contains 7 Component(s) Recorded On: 01/21/2021

    In this digital ITEMS module, Dr. Francis O’Donnell and Dr. April Zenisky provide a firm grounding in the conceptual and operational considerations around results reporting for summative large-scale assessment. They anchor the module in the position that results reporting must be approached as a data-driven story that is purposefully designed to communicate specific information to accomplish specific goals. They further connect their overview to various aspects of validity and present different conceptual frameworks and practical models for report development. Throughout the module, they highlight research-grounded good practices, concluding with some principles and ideas around conducting reporting research. The module contains audio-narrated slides, an interview, interactive activities, and additional resources such as a glossary and reference documents.

    Keywords: data, large-scale assessment, results, score reporting, validity, visualization

    Francis O'Donnell

    National Board of Medical Examiners

    Francis O’Donnell is a psychometrician at the National Board of Medical Examiners. She earned her Ph.D. from the University of Massachusetts Amherst in Research, Educational Measurement, and Psychometrics in 2019. During her doctoral studies, she contributed to several projects on results reporting for K-12 and licensure assessments, which culminated in a dissertation about how teachers, parents, and students interpret achievement level labels. Francis’s research has been presented at multiple conferences and a book chapter she wrote with Dr. Stephen Sireci about reporting in credentialing and admissions contexts appears in Score Reporting: Research and Applications (Zapata-Rivera, 2018). In her current role, she oversees psychometric activities for medical education and certification examinations and researches topics such as validity, fairness, and innovative approaches to reporting results.

    Contact Francis via fodonnell@nbme.org

    April Zenisky

    University of Massachusetts Amherst

    April L. Zenisky is Research Associate Professor in the Department of Educational Policy, Research, and Administration in the College of Education at the University of Massachusetts (UMass) Amherst, and Director of Computer-Based Testing Initiatives in UMass’ Center for Educational Assessment (CEA). At UMass she leads and contributes to several externally-funded projects, and teaches courses and workshops on various topics, including test construction.  April’s main research interests include results reporting, technology-based item types, and computerized test designs. Her collaborative work on results reporting with Ronald K. Hambleton has advanced best practices for report development relative to both individual and group reporting and has explored emerging strategies for online reporting efforts. She has presented her research at various national and international conferences and has authored and co-authored a number of book chapters.

    Contact April via azenisky@umass.edu

  • Contains 7 Component(s) Recorded On: 11/30/2020

    In this digital ITEMS module, Dr. Caroline Wylie reviews the Classroom Assessment Standards developed under the auspices of the Joint Committee on Standards for Educational Evaluation and briefly contrasts their role with the Standards for Educational and Psychological Testing (2014) issued by APA, AERA, and NCME, which are commonly used to inform quality metrics for high-stakes and large-scale assessments. She includes details on the three categories of standards: (1) Foundations (these six standards provide the basis for developing and implementing sound and fair classroom assessment); (2) Use (these five standards follow a logical progression from the selection and development of classroom assessments to the communication of the assessment results); and (3) Quality (these five standards guide teachers in providing accurate, reliable, and fair classroom assessment results for all students). The module contains audio-narrated slides, reflection questions, and a set of resources to support application of the Classroom Assessment Standards to classroom assessments within teams of teachers and/or curriculum and assessment developers.

    Keywords: assessment design, classroom assessment, formative assessment, professional development

    Caroline Wylie

    Educational Testing Service (ETS)

    Caroline Wylie is a Research Director in the Student and Teacher Research Center and Senior Research Scientist at ETS. Her current research centers on issues around balanced assessment systems, with a focus on the use of formative assessment to improve classroom teaching and learning. She has led studies related to the creation of effective, scalable, and sustainable teacher professional development focused on formative assessment, the formative use of diagnostic questions for classroom-based assessment, assessment literacy, and the role of learning progressions to support formative assessment in mathematics and science. She is specifically interested in issues of rater quality as it relates to formative classroom observations and in the relationship between observations, feedback, and changes to practice. She was one of the co-authors of the Classroom Assessment Standards and currently serves as the co-chair for the NCME Classroom Assessment Task Force.


    Contact Caroline via ecwylie@ets.org