The Great Proficiency Debate

A couple of weeks ago, Mike Petrilli of the Fordham Institute made the case that absolute proficiency rates should not be used as measures of school effectiveness, as they are heavily dependent on where students “start out” upon entry to the school. A few days later, Fordham president Checker Finn offered a defense of proficiency rates, noting that how much students know is substantively important, and associated with meaningful outcomes later in life.

They’re both correct. This is not a debate about whether proficiency rates are at all useful (by the way, I don't read Petrilli as saying that). It’s about how they should be used and how they should not.

Let’s keep this simple. Here is a quick, highly simplified list of how I would recommend interpreting and using absolute proficiency rates, and how I would avoid using them.

Before proceeding, however, a quick clarification: Much of this debate is less about proficiency rates than about absolute performance measures in general – that is, how highly students score (as opposed to growth-oriented measures, which are focused on progress over time). Proficiency rates are a very common example of an absolute performance measure, but most of the points below (basically, all but the last bullet) might also apply to other status measures, such as average scale scores. In addition, for the purposes of this discussion, let's put aside the issue of the choice of "proficient" cutoff score, and how it is to some degree arbitrary (though that is indeed an important issue in any discussion about cutpoint-based rates).

How to use proficiency rates:

To (carefully) summarize or compare, in an accessible manner, the performance of a group or groups of students, such as those attending the same grade, school, or district, in any given year;
To target resources, such as additional funding or a tutoring program, at entities (e.g., schools, districts) with particularly low-performing students (or, perhaps, target other interventions at high-performing students);
As a partial measure for parents to evaluate the suitability of schools for their children (insofar as they have an interest in peer groups).

How not to use proficiency rates:

To draw anything beyond the most cautious, preliminary conclusions about actual school performance, rather than student performance – i.e., simple, single-year rates/scores themselves cannot tell you much about the degree to which schools are contributing to student learning (at least insofar as tests can measure that);
To close, restructure or make other high-stakes decisions about schools that should be based on the assessment of school performance per se (unless you want to "prioritize" these interventions for schools with lower-performing students, as in this proposal);
To measure trends in student performance over time or achievement gaps (static or over time), unless necessary (e.g., average scores are not available).

In short, proficiency rates (and other absolute performance measures) have a very legitimate role to play in education policy, including in accountability systems (though I feel obliged to add/reiterate that these "percent above the line" rates are an exceedingly crude, potentially distorted way to express the data, and I would generally recommend average scores when they are available and suitable). This is about using them appropriately.
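To make the "potentially distorted" point concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration, not data from any real test: it assumes scores are normally distributed with a standard deviation of 1, that the average score rises by 0.1 standard deviations, and it tries three made-up cutoff scores. The function names are invented for this example.

```python
from statistics import NormalDist


def proficiency_rate(mean, sd, cutoff):
    """Share of students at or above the cutoff, ASSUMING normally distributed scores."""
    return 1.0 - NormalDist(mean, sd).cdf(cutoff)


def rate_change(cutoff, mean_before=0.0, mean_after=0.1, sd=1.0):
    """Change in the proficiency rate when the average score moves from
    mean_before to mean_after (hypothetical values, in standard deviation units)."""
    before = proficiency_rate(mean_before, sd, cutoff)
    after = proficiency_rate(mean_after, sd, cutoff)
    return after - before


# The same improvement in average scores, viewed through three hypothetical cutoffs.
for cutoff in (-1.5, 0.0, 1.5):
    change = rate_change(cutoff)
    print(f"cutoff at {cutoff:+.1f} SD: rate changes by {100 * change:.1f} percentage points")
```

Under these assumptions, the identical gain in average scores shows up as roughly four percentage points when the cutoff sits at the middle of the distribution, but only a bit more than one point when the cutoff sits far out in either tail. That is one reason rate changes can mislead when average scores are available instead.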

And the sad reality is that NCLB and the more recent ESEA waivers do in fact include provisions that use absolute proficiency rates inappropriately (e.g., for identifying "failing schools"). Some states are also using growth models, which is a positive development that will hopefully continue. In any case, until we begin to recognize the distinction between school and student performance, and to interpret and use measures more cautiously and with that distinction in mind, accountability policy in the U.S. will rest upon a rather shaky measurement foundation.

- Matt Di Carlo

Below are two excerpts from press coverage of the recent elementary ELA and math test scores (from two sources that, one might argue, lean in different directions). Despite your advice above, you’ll note the reporters’ use of the terms “flunk” and “passed” in these articles.

Achievement gap widens for students after city’s new standardized tests
http://www.nydailynews.com/new-york/education/achievement-gap-widens-ci…

"The harder state tests did more than cause two-thirds of city students to flunk..."

Fewer than One Third of New York City Students Pass State Tests
http://www.wnyc.org/blogs/schoolbook/2013/aug/07/fewer-one-third-new-yo…

"Test scores for New York City students plummeted this year, with 26.4 percent of third through eighth graders passing the English tests and 29.6 percent passing the math tests."

Hi Matt,

I enjoy reading your blog entries. I would like to add that people should be cautious of proficiency "gains" (should they choose to ignore your last bullet point). Many people add and subtract proficiency rates without considering that a percentage point change in the middle of the distribution (e.g., from 50% to 51%) is not the same magnitude as a percentage point change closer to the tails (e.g., from 90% to 91%). The former gain is smaller than the latter when proficiency rates are more appropriately expressed as log-odds. Of course, many people don't understand what proficiency rates mean, let alone proficiency log-odds.

Regards,
Chris
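
The log-odds point in the comment above can be checked with a short calculation. The sketch below is only an illustration: the function names are made up for this example, and the rates are the ones used in the comment (50% to 51% and 90% to 91%), expressed as proportions.

```python
import math


def log_odds(rate):
    """Convert a proficiency rate (a proportion strictly between 0 and 1) to log-odds."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return math.log(rate / (1.0 - rate))


def log_odds_gain(before, after):
    """Difference in log-odds between two proficiency rates."""
    return log_odds(after) - log_odds(before)


# One percentage point gained in the middle of the distribution vs. near the tail.
middle_gain = log_odds_gain(0.50, 0.51)
tail_gain = log_odds_gain(0.90, 0.91)

print(f"50% -> 51%: log-odds gain of {middle_gain:.3f}")
print(f"90% -> 91%: log-odds gain of {tail_gain:.3f}")
```

On the log-odds scale, the one-point gain from 90% to 91% comes out at roughly three times the size of the one-point gain from 50% to 51%, which is the asymmetry the comment describes.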