Assessing Ourselves To Death

** Reprinted here in the Washington Post

I have two points to make. The first is something that I think everyone knows: Educational outcomes, such as graduation and test scores, are signals of or proxies for the traits that lead to success in life, not the cause of that success.

For example, it is well-documented that high school graduates earn more, on average, than non-graduates. Thus, one often hears arguments that increasing graduation rates will drastically improve students’ future prospects, and the performance of the economy overall. Well, not exactly.

The piece of paper, of course, only goes so far. Rather, the benefits of graduation arise because graduates are more likely to possess the skills – including the critical non-cognitive sort – that make people good employees (and, on a highly related note, because employers know that, and use credentials to screen applicants).

We could very easily increase the graduation rate by easing requirements, but this wouldn’t do much to help kids advance in the labor market. They might get a few more calls for interviews, but over the long haul, they’d still be at a tremendous disadvantage if they lacked the required skills and work habits.

Are Charter Caps Keeping Great Schools From Opening?

** Reprinted here in the Washington Post

Charter school “caps” are state-imposed limits on the size or growth of charter sectors. Currently, around 25 states set caps on schools or enrollment, with wide variation in terms of specifics: Some states simply set a cap on the number of schools (or charters in force); others limit annual growth; and still others specify caps on both growth and size (there are also a few places that cap proportional spending, coverage by individual operators and other dimensions).

A great many charter school supporters strongly support the lifting of these restrictions, arguing that they prevent the opening of high-quality schools. This is, of course, an oversimplification at best, as lifting caps could just as easily lead to the proliferation of the many unsuccessful charters. If the charter school experiment has taught us anything, it’s that these schools are anything but sure bets, and that even includes the tiny handful of highly successful models such as KIPP.*

Overall, the only direct impact of charter caps is to limit the potential size or growth of a state’s charter school sector. Assessing their implications for quality, on the other hand, is complicated, and there is every reason to believe that the impact of caps, and thus the basis of arguments for lifting them, varies by context – including the size and quality of states’ current sectors, as well as the criteria by which low-performing charters are closed and new ones are authorized. 

New Teacher Evaluations Are A Long-Term Investment, Not Test Score Arbitrage

One of the most important things in education policy to keep an eye on is the first round of changes to new teacher evaluation systems. Given all the moving parts and the lack of evidence on how these systems should be designed and what impact they have, course adjustments along the way are not just inevitable, but absolutely essential.

Changes might be guided by different types of evidence, such as feedback from teachers and administrators or analysis of ratings data. And, of course, human judgment will play a big role. One thing that states and districts should not be doing, however, is assessing their new systems – or making changes to them – based on whether or not raw overall test scores go up or down within the first few years.

Here’s a little reality check: Even the best-designed, best-implemented new evaluations are unlikely to have an immediate measurable impact on aggregate student performance. Evaluations are an investment, not a quick fix. And they are not risk-free. Their effects will depend on the quality of systems, how current teachers and administrators react to them and how all of this shapes and plays out in the teacher labor market. As I’ve said before, the realistic expectation for overall performance – and this is no guarantee – is that there will be some very small, gradual improvements, unfolding over a period of years and decades.

States and districts that expect anything more risk making poor decisions during these crucial, early phases.

A Look At The Changes To D.C.'s Teacher Evaluation System

D.C. Public Schools (DCPS) recently announced a few significant changes to its teacher evaluation system (called IMPACT), including the alteration of its test-based components, the creation of a new performance category (“developing”), and a few tweaks to the observational component (discussed below). These changes will be effective starting this year.

As with any new evaluation system, a period of adjustment and revision should be expected and encouraged (though it might be preferable if the first round of changes occurs during a phase-in period, prior to stakes becoming attached). Yet, despite all the attention given to the IMPACT system over the past few years, these new changes have not been discussed much beyond a few quick news articles.

I think that’s unfortunate: DCPS is an early adopter of the “new breed” of teacher evaluation policies being rolled out across the nation, and any adjustments to IMPACT’s design – presumably based on results and feedback – could provide valuable lessons for states and districts in earlier phases of the process.

Accordingly, I thought I would take a quick look at three of these changes.

School Grades For School Grades' Sake

I have reviewed, albeit superficially, the test-based components of several states’ school rating systems (e.g., OH, FL, NYC, LA, CO), with a particular focus on the degree to which they are actually measuring student performance (how highly students score), rather than school effectiveness per se (whether students are making progress). Both types of measures have a role to play in accountability systems, even if they are often confused or conflated, resulting in widespread misinterpretation of what the final ratings actually mean, and many state systems’ failure to tailor interventions to the indicators being used.

One aspect of these systems that I rarely discuss is the possibility that the ratings systems are an end in themselves. That is, the idea that public ratings, no matter how they are constructed, provide an incentive for schools to get better. From this perspective, even if the ratings are misinterpreted or imprecise, they might still “work.”*

There’s obviously something to this. After all, the central purpose of any accountability system is less about closing or intervening in a few schools than about giving all schools an incentive to up their respective games. And, no matter how you feel about school rating systems, there can be little doubt that people pay attention to them. Educators and school administrators do so, not only because they fear closure or desire monetary rewards; they also take pride in what they do, and they like being recognized for it. In short, my somewhat technocratic viewpoint on school ratings ignores the fact that their purpose is less about rigorous measurement than encouraging improvement.

Senate's Harkin-Enzi ESEA Plan Is A Step Sideways

Our guest authors today are Morgan Polikoff and Andrew McEachin. Morgan is Assistant Professor in the Rossier School of Education at the University of Southern California. Andrew is an Institute of Education Sciences postdoctoral fellow at the University of Virginia.

By now, it is painfully clear that Congress will not be revising the Elementary and Secondary Education Act (ESEA) before the November elections. And with the new ESEA waivers, who knows when the revision will happen? Congress, however, seems to have some ideas about what next-generation accountability should look like, so we thought it might be useful to examine one leading proposal and see what the likely results would be.

The proposal we refer to is the Harkin-Enzi plan, available here for review. Briefly, the plan identifies 15 percent of schools as targets of intervention, classified in three groups. First are the persistently low-achieving schools (PLAS); these are the 5 percent of schools that are the lowest performers, based on achievement level or a combination of level and growth. Next are the achievement gap schools (AGS); these are the 5 percent of schools with the largest achievement gaps between any two subgroups. Last are the lowest subgroup achievement schools (LSAS); these are the 5 percent of schools with the lowest achievement for any significant subgroup.

The goal of this proposal is both to reduce the number of schools that are identified as low-performing and to create a new operational definition of consistently low-performing schools. To that end, we wanted to know what kinds of schools these groups would target and how stable the classifications would be over time.

Labor Market Behavior Actually Matters In Labor Market-Based Education Reform

Economist Jesse Rothstein recently released a working paper about which I am compelled to write, as it speaks directly to so many of the issues that we have raised here over the past year or two. The purpose of Rothstein’s analysis is to move beyond the talking points about teaching quality in order to see if strategies that have been proposed for improving it might yield benefits. In particular, he examines two labor market-oriented policies: performance pay and dismissing teachers.

Both strategies are, at their cores, focused on selection (and deselection) – in other words, attracting and retaining higher-performing candidates and exiting, directly or indirectly, lower-performing incumbents. Both also take time to work and have yet to be experimented with systematically in most places; thus, there is relatively little evidence on the long-term effects of either.

Rothstein’s approach is to model this complex dynamic, specifically the labor market behavior of teachers under these policies (i.e., choosing, leaving and staying in teaching), which is often ignored or assumed away, despite the fact that it is so fundamental to the policies themselves. He then calculates what would happen under this model as a result of performance pay and dismissal policies – that is, how they would affect the teacher labor market and, ultimately, student performance.*

Of course, this is just a simulation, and must be (carefully) interpreted as such, but I think the approach and findings help shed light on three fundamental points about education reform in the U.S.

The Education Reform Movement: Reset Or Redo?

Our guest author today is Dr. Clifford B. Janey, former superintendent for the Newark Public Schools, District of Columbia Public Schools, and Rochester City School District. He is currently a Senior Weismann Fellow at the Bank Street College of Education in New York City, and a Shanker Institute board member.

For too many students, families, and communities, the high school diploma represents either a dream deferred or a broken contract between citizens and the stewards of America's modern democracy. With the reform movement’s unrelenting focus on testing and its win/lose consequences for students and staff, the high school diploma, which should signify college and work readiness, has lost its value.

Not including the over seven thousand students who drop out of high school daily, the gap between the percentage of those who graduate and their readiness for college success will continue to worsen the social and income inequalities in life. Recent studies report that America has the highest number of people (46.2 million) living in poverty since data collection began in 1959. While poverty and its conditions have been unforgiving, policy makers and education reformers have largely ignored this reality. Rebuttals to this argument are interesting, but, without fundamental change, the predictable growth within the ranks of poverty will continue.

A framework within which solutions will thrive requires a redo of the national reform focus, not merely a reset of existing efforts—including teacher evaluation systems, closing low performing schools (and opening up new ones that are at best marginally better), and increasing the opportunity for mayoral control (which still commands attention but with little assurance of transparency).

Staff Matters: Social Resilience In Schools

In the world of education, particularly in the United States, educational fads, policy agendas, and funding priorities tend to change rapidly. The attention of education research fluctuates accordingly. And, as David Cohen persuasively argues in Teaching and Its Predicaments, the nation has little coherent educational infrastructure to fall back upon. As a result of all this, teachers’ work is almost always surrounded by important levels of uncertainty (e.g., lack of a common curriculum) and variation. In such a context, it is no surprise that collaboration and collegiality figure prominently in teachers’ world (and work) views.

After all, difficulties can be dealt with more effectively when/if individuals are situated in supportive and close-knit social networks from which to draw strength and resources. In other words, in the absence of other forms of stability, the ability of a group – a group of teachers in this case – to work together becomes indispensable to cope with challenges and change.

The idea that teachers’ jobs are surrounded by uncertainty made me think of problems often encountered in the field of security. In this sector, because threats are increasingly complex and unpredictable, much of the focus has shifted away from heightened protection and toward increased resilience. Resilience is often understood as the ability of communities to survive and thrive after disasters or emergencies.

The Allure Of Teacher Quality

Those following education know that policy focused on "teacher quality" has been by far the dominant paradigm for improving schools over the past few years. Some (but not nearly all) components of this all-hands-on-deck effort are perplexing to many teachers, and have generated quite a bit of pushback. No matter one’s opinion of this approach, however, what drives it is the tantalizing allure of variation in teacher quality.

Fueled by the ever-increasing availability of detailed test score datasets linking teachers to students, the research literature on teachers’ test-based effectiveness has grown rapidly, in both size and sophistication. Analysis after analysis finds that, all else being equal, the variation in teachers’ estimated effects on students' test growth – the difference between the “top” and “bottom” teachers – is very large. In any given year, some teachers’ students make huge progress, others’ very little. Even if part of this estimated variation is attributable to confounding factors, the discrepancies are still larger than most any other measured "input" within the jurisdiction of education policy. The underlying assumption here is that “true” teacher quality varies to a degree that is at least somewhat comparable in magnitude to the spread of the test-based estimates.

Perhaps that's the case, but it does not, by itself, help much. The key question is whether and how we can measure teacher performance at the individual level and, more importantly, influence the distribution – that is, to raise the ceiling, the middle and/or the floor. The variation hangs out there like a drug to which we’re addicted, but haven’t really figured out how to administer. If there were some way to harness it efficiently, the potential benefits could be considerable. The focus of current education policy is in large part an effort to do anything and everything to try to figure this out. And, as might be expected given the enormity of the task, progress has been slow.