Trial And Error Is Fine, So Long As You Know The Difference
It’s fair to say that improved teacher evaluation is the cornerstone of most current education reform efforts. Although very few people dispute the need to design and implement new evaluation systems, there has been a great deal of disagreement over how best to do so – specifically with regard to the incorporation of test-based measures of teacher productivity (i.e., value-added and other growth model estimates).
The use of these measures has become a polarizing issue. Opponents tend to object adamantly to any degree of incorporation, while many proponents do not consider new evaluations meaningful unless they include test-based measures as a major component (say, at least 40-50 percent). Despite the air of certainty on both sides, this debate has proceeded mostly on speculation. The new evaluations are just getting up and running, and there is virtually no evidence as to their effects under actual high-stakes implementation.
For my part, I’ve said many times that I'm receptive to trying value-added as a component in evaluations (see here and here), though I disagree strongly with the details of how it’s being done in most places. But there’s nothing necessarily wrong with divergent opinions over an untested policy intervention, or with trying one. There is, however, something wrong with fully implementing such a policy without adequate field testing, or at least without ensuring that its costs and effects will be carefully evaluated post-implementation. To date, virtually no state or district that I'm aware of has mandated a large-scale, independent evaluation of its new system.*
If that remains the case, the breathless, speculative debate happening now will simply continue in perpetuity.