ITEMS Portal
Module 43: Data Mining for Classification and Regression
Data mining methods for classification and regression are becoming increasingly popular in various scientific fields. However, these methods have not been explored much in educational measurement. This module first reviews some of these methods in a way that should be accessible to a wide audience in educational measurement. It then demonstrates, using three real-data examples, that these methods may improve on traditionally used methods such as linear and logistic regression in educational measurement.
Keywords: bagging, boosting, classification and regression tree, CART, cross-validation error, data mining, predicted values, random forests, supervised learning, test error, TIMSS
Sandip Sinharay
Principal Research Scientist, Educational Testing Service
Sandip Sinharay is a principal research scientist in the Research and Development division at ETS. He received his Ph.D. in statistics from Iowa State University in 2001. He was editor of the Journal of Educational and Behavioral Statistics between 2011 and 2014. He has received four awards from the National Council on Measurement in Education: the award for Technical or Scientific Contributions to the Field of Educational Measurement (in 2009 and 2015), the Jason Millman Promising Measurement Scholar Award (2006), and the Alicia Cascallar Award for an Outstanding Paper by an Early Career Scholar (2005). He received the ETS Scientist award in 2008 and the ETS Presidential award twice. He has coedited two published volumes and authored or coauthored more than 75 articles in peer-reviewed statistics and psychometrics journals and edited books.