• The Middle Ground Between Opt Out And All In

    A couple of weeks ago, Michelle Rhee published an op-ed in the Washington Post speaking out against the so-called “opt out movement,” which encourages parents to refuse to let their children take standardized tests.

    Personally, I oppose the “opt-out” phenomenon, but I also think it would be a mistake not to pay attention to its proponents’ fundamental issue – that standardized tests are potentially being misused and/or overused. This concern is legitimate and important. My sense is that “opting out” reflects a rather extreme version of this mindset, a belief that we cannot right the ship – i.e., we have gone so far and moved so carelessly with test-based accountability that there is no real hope that it can or will be fixed. This strikes me as a severe overreaction, but I understand the sentiment.

    That said, while most of Ms. Rhee’s op-ed is standard, reasonable fare, some of it is also laced with precisely the kind of misconceptions that contribute to the apprehensions not only of anti-testing advocates, but also of those of us who occupy a middle ground – i.e., who favor some test-based accountability, but are worried about getting it right.

  • New York City: The Mississippi Of The Twenty-First Century?

    Last month saw the publication of a new report, New York State’s Extreme School Segregation, produced by UCLA’s highly regarded Civil Rights Project. It confirmed what New York educators have suspected for some time: our schools are now the most racially segregated schools in the United States. New York’s African-American and Latino students experience “the highest concentration in intensely-segregated public schools (less than 10% white enrollment), the lowest exposure to white students, and the most uneven distribution with white students across schools.”

    Driving the statewide numbers were schools in New York City, particularly charter schools. Inside New York City, “the vast majority of the charter schools were intensely segregated,” the report concluded, significantly worse in this regard “than the record for public schools.”

    New York State’s Extreme School Segregation provides a window into the intersection of race and class in the city’s schools. As a rule, the city’s racially integrated schools are middle class, in which middle-class white, Asian, African-American and Latino students all experience the educational benefits of racial diversity. By contrast, the city’s racially segregated public schools are generally segregated by both race and class: extreme school segregation involves high concentrations of African-American and Latino students living in poverty.

  • What Is Implicit Bias, And How Might It Affect Teachers And Students? (Part I)

    This is the first in a series of three posts about implicit bias. Here are the second and third parts.

    The research on implicit bias both fascinates and disturbs people. It’s pretty cool to realize that many everyday mental processes happen so quickly as to be imperceptible. But the fact that they are so automatic, and therefore outside of our conscious control, can be harder to stomach.

    In other words, the invisible mental shortcuts that allow us to function can be quite problematic – and a real barrier to social equality and fairness – in contexts where careful thinking and decision-making are necessary. Accumulating evidence reveals that “implicit biases” are linked to discriminatory outcomes ranging from the seemingly mundane, such as poorer quality interactions, to the highly consequential, such as constrained employment opportunities and a decreased likelihood of receiving life-saving emergency medical treatments.

    Two excellent questions about implicit bias came up during our last Good Schools Seminar on “Creating Safe and Supportive Schools.”

  • When Growth Isn't Really Growth, Part Two

    Last year, we published a post that included a very simple graphical illustration of what changes in cross-sectional proficiency rates or scores actually tell us about schools’ test-based effectiveness (basically nothing).

    In reality, year-to-year changes in cross-sectional average rates or scores may reflect "real" improvement, at least to some degree, but, especially when measured at the school- or grade-level, they tend to be mostly error/imprecision (e.g., changes in the composition of the samples taking the test, measurement error and serious issues with converting scores to rates using cutpoints). This is why changes in scores often conflict with more rigorous indicators that employ longitudinal data.
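    To make the point concrete, here is a minimal simulation sketch (not from the original post; all numbers are hypothetical) of how much a school's proficiency rate can swing from year to year purely through sampling noise, even when the school's underlying effectiveness never changes:

```python
# Hypothetical illustration: year-to-year changes in a school's proficiency
# rate when "true" performance is constant and only the student sample varies.
import random

random.seed(42)
TRUE_MEAN, SD, CUTPOINT, COHORT_SIZE = 500, 100, 520, 60  # assumed values

def cohort_rate():
    """Proficiency rate for one randomly drawn cohort of students."""
    scores = [random.gauss(TRUE_MEAN, SD) for _ in range(COHORT_SIZE)]
    return sum(s >= CUTPOINT for s in scores) / COHORT_SIZE

# Difference between two cohorts drawn from the *same* distribution,
# repeated 1,000 times -- i.e., "change" with zero real improvement.
changes = [cohort_rate() - cohort_rate() for _ in range(1000)]
print(max(abs(c) for c in changes))
```

    At typical grade- or school-level cohort sizes, swings of several percentage points are routine under this setup, which is one reason single-year rate changes say so little about effectiveness.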

    In the aforementioned post, however, I wanted to show what the changes meant even if most of these issues magically disappeared. In this one, I would like to extend this very simple illustration, as doing so will hopefully help shed a bit more light on the common (though mistaken) assumption that effective schools or policies should generate perpetual rate/score increases.

  • Estimated Versus Actual Days Of Learning In Charter School Studies

    One of the purely presentational aspects that separates the new “generation” of CREDO charter school analyses from the old is that the more recent reports convert estimated effect sizes from standard deviations into a “days of learning” metric. You can find similar approaches in other reports and papers as well.

    I am very supportive of efforts to make interpretation easier for those who aren’t accustomed to thinking in terms of standard deviations, so I like the basic motivation behind this. I do have concerns about this particular conversion – specifically, that it overstates things a bit – but I don’t want to get into that issue. If we just take CREDO’s “days of learning” conversion at face value, my primary, far simpler reaction to hearing that a given charter school sector's impact is equivalent to a given number of additional “days of learning” is to wonder: does this charter sector actually offer additional “days of learning,” in the form of longer school days and/or years?
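    The mechanics of such a conversion are simple. The sketch below is illustrative only: the conversion factor (a 180-day school year and an assumed 0.25 standard deviations of typical annual growth) is my own stand-in, not necessarily the benchmark CREDO uses.

```python
# Generic "days of learning" conversion. Both constants are assumptions
# for illustration; the actual benchmark in any given report may differ.
DAYS_PER_YEAR = 180   # assumed length of a school year, in days
SD_PER_YEAR = 0.25    # assumed typical annual growth, in standard deviations

def effect_to_days(effect_sd: float) -> float:
    """Convert an effect size in standard deviations to 'days of learning'."""
    return effect_sd * (DAYS_PER_YEAR / SD_PER_YEAR)

# Under these assumptions, a 0.05 SD effect reads as 36 extra "days."
print(effect_to_days(0.05))
```

    Note how sensitive the headline number is to the assumed annual-growth denominator, which is one reason such conversions can overstate things.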

    This matters to me because I (and many others) have long advocated moving past the charter versus regular public school “horserace” and trying to figure out why some charters seem to do very well and others do not. Additional time is one of the more compelling observable possibilities, and while the two aren’t perfectly comparable, it fits nicely with the “days of learning” expression of effect sizes. Take New York City charter schools, for example.

  • Valuing Home Languages Sets The Foundation For Early Learning

    Our guest author today is Candis Grover, the Literacy & Spanish Content Manager at ReadyRosie.com, an online resource that models interactive oral language development activities that parents and caregivers of young children can do to encourage learning.

    Many advocates, policymakers, and researchers now recognize that a strong start requires more than just a year of pre-K. Research shows that promoting children’s success starts with helping parents recognize the importance of loving interactions and “conversations” with their babies.

    The above statement, which is taken from a recent report, Subprime Learning: Early Education in America since the Great Recession, emphasizes the role of parents as the earliest investors in their children’s academic success. This same report states that more than one in five of these families speaks a primary language other than English, and that this figure could reach 40 percent by 2030. Despite the magnitude of these numbers, the Subprime Learning report asserts that the research on dual language learners has been largely ignored by those developing early childhood education policies and programs.

  • SIG And The High Price Of Cheap Evidence

    A few months ago, the U.S. Department of Education (USED) released the latest data from schools that received grants via the School Improvement Grants (SIG) program. These data – consisting solely of changes in proficiency rates – were widely reported as an indication of “disappointing” or “mixed” results. Some even went so far as to proclaim the program a complete failure.

    Once again, I have to point out that this breaks almost every rule of testing data interpretation and policy analysis. I’m not going to repeat the arguments about why changes in cross-sectional proficiency rates are not policy evidence (see our posts here, here and here, or examples from the research literature here, here and here). Suffice it to say that the changes themselves are not even particularly good indicators of whether students’ test-based performance in these schools actually improved, to say nothing of whether it was the SIG grants that were responsible for the changes. There’s more to policy analysis than subtraction.

    So, in some respects, I would like to come to the defense of Secretary Arne Duncan and USED right now – not because I’m a big fan of the SIG program (I’m ambivalent at best), but rather because I believe in strong, patient policy evaluation, and these proficiency rate changes are virtually meaningless. Unfortunately, however, USED was the first to portray, albeit very cautiously, rate changes as evidence of SIG’s impact. In doing so, they provided a very effective example of why relying on bad evidence is a bad idea even if it supports your desired conclusions.

  • In Education Policy, Good Things Come In Small Packages

    A recent report from the U.S. Department of Education presented a summary of three recent studies of differences in the effectiveness of teaching provided to advantaged and disadvantaged students (with effectiveness defined in terms of value-added scores, and disadvantage in terms of subsidized lunch eligibility). The brief characterizes the results of these studies in an accessible manner – that the difference in estimated teaching effectiveness between advantaged and disadvantaged students varied quite widely between districts, but overall amounts to about four percent of the achievement gap in reading and 2-3 percent in math.

    Some observers were not impressed. They wondered why so-called reformers are alienating teachers and hurting students in order to address a mere 2-4 percent improvement in the achievement gap.

    Just to be clear, the 2-4 percent figures describe the gap (and remember that it varies). Whether it can be narrowed or closed – e.g., by improving working conditions, offering incentives or some other means – is a separate issue. Nevertheless, let’s put aside all the substantive aspects surrounding these studies, and the issue of the distribution of teacher quality, and discuss this 2-4 percent thing, as it illustrates what I believe is among the most important tensions underlying education policy today: our collective failure to have a reasonable debate about expectations and the power of education policy.

  • Revisiting The Widget Effect

    In 2009, The New Teacher Project (TNTP) released a report called “The Widget Effect.” You would be hard-pressed to find many recent publications from an advocacy group that have had a larger influence on education policy and the debate surrounding it. To this day, the report is mentioned regularly by advocates and policymakers.

    The primary argument of the report was that teacher performance “is not measured, recorded, or used to inform decision making in any meaningful way.” More specifically, the report shows that most teachers received “satisfactory” or equivalent ratings, and that evaluations were not tied to most personnel decisions (e.g., compensation, layoffs, etc.). From these findings and arguments comes the catchy title – a “widget” is a fictional product commonly used in situations (e.g., economics classes) where the product doesn’t matter. Thus, treating teachers like widgets means that we treat them all as if they’re the same.

    Given the influence of “The Widget Effect,” as well as how different the teacher evaluation landscape is now compared to when it was released, I decided to read it closely. Having done so, I think it’s worth discussing a few points about the report.

  • When Checking Under The Hood Of Overall Test Score Increases, Use Multiple Tools

    When looking at changes in testing results between years, many people are (justifiably) interested in comparing those changes for different student subgroups, such as those defined by race/ethnicity or income (subsidized lunch eligibility). The basic idea is to see whether increases are shared between traditionally advantaged and disadvantaged groups (and, often, to monitor achievement gaps).

    Sometimes, people take this a step further by using the subgroup breakdowns as a crude check on whether cross-sectional score changes are due to changes in the sample of students taking the test. The logic is as follows: if increases show up within both advantaged and disadvantaged subgroups, then an overall increase cannot be attributed to a change in the backgrounds of students taking the test, since the subgroups exhibited the same pattern. (For reasons discussed here many times before, this is a severely limited approach.)
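    One reason the check is limited is that composition can shift *within* a coarse subgroup. A minimal sketch, using entirely hypothetical figures: suppose the FRL-eligible group is a mix of deeply poor students and students just below the eligibility cutoff, and only that mix changes between years.

```python
# Hypothetical illustration of within-subgroup composition change.
# All shares and score means below are made up for the example.

def group_mean(subgroups):
    """Weighted mean score across (share, mean) sub-populations."""
    return sum(share * mean for share, mean in subgroups)

# Year 1: FRL group is 60% deep-poverty (mean 480), 40% near-cutoff (mean 520)
frl_y1 = group_mean([(0.6, 480), (0.4, 520)])
# Year 2: identical sub-population means, but the mix shifts toward near-cutoff
frl_y2 = group_mean([(0.4, 480), (0.6, 520)])

# The FRL subgroup's average "rises" even though no student group performed
# any better -- a pure composition effect that the FRL breakdown cannot see.
print(frl_y1, frl_y2)
```

    In this toy setup the FRL average climbs from 496 to 504 with zero change in any group's actual performance, which is exactly the kind of shift a simple subgroup comparison would misread as improvement.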

    Whether testing data are cross-sectional or longitudinal, these subgroup breakdowns are certainly important and necessary, but it's wise to keep in mind that standard variables, such as eligibility for free and reduced-price lunches (FRL), are imperfect proxies for student background (actually, FRL rates aren't even such a great proxy for income). In fact, one might reach different conclusions depending on which variables are chosen. To illustrate this, let’s take a look at results from the Trial Urban District Assessment (TUDA) for the District of Columbia Public Schools between 2011 and 2013, in which there was a large overall score change that received a great deal of media attention, and break the changes down by different characteristics.