Research On Teacher Evaluation Metrics: The Weaponization Of Correlations
Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.
In recent years, many districts have implemented multiple-measure teacher evaluation systems, partly in response to federal pressure from No Child Left Behind waivers and incentives from the Race to the Top grant program. These systems have not been without controversy, largely owing to the perception – not entirely unfounded – that such systems might be used to penalize teachers. One ongoing controversy in the field of teacher evaluation is whether these measures are sufficiently reliable and valid to be used for high-stakes decisions, such as dismissal or tenure. That is a topic that deserves considerably more attention than a single post; here, I discuss just one of the issues that arises when investigating validity.
The diagram below is a visualization of a multiple-measure evaluation system, one that combines information on teaching practice (e.g., ratings from a classroom observation rubric) with student achievement-based measures (e.g., value-added or student growth percentiles) and student surveys. The system need not be limited to three components; the point is simply that classroom observations are not the sole means of evaluating teachers.
In validating the various components of an evaluation system, researchers often examine their correlation with other components. To the extent that each component is an attempt to capture something about the teacher’s underlying effectiveness, it’s reasonable to expect that different measurements taken of the same teacher will be positively related. For example, we might examine whether ratings from a classroom observation rubric are positively correlated with value-added.