The Rise and Fall of the Teacher Evaluation Reform Empire
Teacher evaluation reform during the late 2000s and 2010s was one of the fastest and most widespread education policy changes in recent history. Thanks mostly to Race to the Top and ESEA “waivers,” over a period of about 10 years, the vast majority of the nation’s school districts installed new teacher evaluations. These new systems were quite different from their predecessors in terms of design, with 3-5 (rather than dichotomous) rating categories incorporating multiple measures (including some based on student testing results). And, in many states, there were varying degrees of rewards and/or consequences tied to the ratings (Steinberg and Donaldson 2016).
A recent working paper offers what is to date the most sweeping assessment of the impact of teacher evaluation reform on student outcomes, with data from 44 states and D.C. As usual, I would encourage you to read the whole paper (here's an earlier ungated version released in late 2021). It is terrific work by a great team of researchers (Joshua Bleiberg, Eric Brunner, Erika Harbatkin, Matthew Kraft, and Matthew Springer), and I’m going to describe the findings only superficially. We’ll get into a little more detail below, but the long and short of it is that evaluation reform had no statistically detectable aggregate effect on student test scores or attainment (i.e., graduation or college enrollment).
This timely analysis, in combination with the research on evaluations over the past few years, provides an opportunity to look back on this enormous reform effort, and to consider whether and how states and districts might move forward.