• New Evidence On Teaching Quality And The Achievement Gap

    It is an extensively documented fact that low-income students score more poorly on standardized tests than do their higher income peers. This so-called “achievement gap” has persisted for generations and is still one of the most significant challenges confronting the American educational system.

    Some people tend to overstate -- while others tend to understate -- the degree to which this gap is attributable to differences in teacher (and school) effectiveness between lower and higher income students (with income usually defined in terms of students’ eligibility for subsidized lunch assistance). As discussed below, the evidence thus far suggests that lower income students are more likely than higher income students to have less “effective” teachers (with effectiveness defined in terms of the ability to help raise student test scores, or value-added), although the magnitude of these discrepancies varies by study. There are also some compelling theories as to the possible mechanisms behind these (often modest) discrepancies, most notably the fact that schools in low-income neighborhoods tend to have fewer resources, as well as more trouble recruiting and retaining highly qualified, experienced teachers.

    The Mathematica Policy Research organization recently released a very large, very important study that addresses these issues directly. It focuses on shedding additional light on the magnitude of any measurable differences in access to effective teaching among students of different incomes (the “Effective Teaching Gap”), as well as the way in which hiring, mobility, and retention might contribute to these gaps. The analysis uses data on teachers in grades 4-8 or 6-8 (depending on data availability) over five years (2008-09 to 2012-13) in 26 districts across the nation.

  • When Our Teachers Learn, Our Students Learn

    Our guest authors today are Mark D. Benigni, Ed. D., Superintendent of the Meriden Public Schools in Connecticut and co-chairperson of the Connecticut Association of Urban Superintendents, as well as Erin Benham, President of the Meriden Federation of Teachers and a member of the Connecticut State Department of Education Board of Directors. The authors seek to understand how teacher learning improves student learning outcomes. 

    Our students’ success and ability to graduate college and career ready from our public schools must be society's primary educational objective. The challenge lies in how we create neighborhood public schools where student learning and teacher learning are valued and supported. How do we assure our students' and staff's satisfaction and growth? And, in essence, how do we create schools where students and staff want to be?

    Around the country, some districts are opting for market-based reforms such as privately supported charter schools or online school options. In Meriden we took a different approach and decided to collaborate as a springboard for innovation and improvement. The school district and teachers' union have been strong partners for almost seven years. Such trust and partnership have made possible the reforms described in the rest of this post.

    Collaboration facilitated development of a weekly early-release day for Professional Learning Communities to meet. During this time, teachers review individual student academic data with their data teams. However, the paucity of non-academic information about students emerged as an important area for improvement. We launched a three-phased approach to address climate and culture in our schools. Our climate suite includes: a School Climate Survey completed by students, staff, and families; a Getting to Know You Survey completed by students in the spring, with results shared in the fall with receiving teachers; and an MPS Cares online portal for students to request assistance and support.

  • Do Subgroup Accountability Measures Affect School Ratings Systems?

    The school accountability provisions of No Child Left Behind (NCLB) institutionalized a focus on the (test-based) performance of student subgroups, such as English language learners, racial and ethnic groups, and students eligible for free- and reduced-price lunch (FRL). The idea was to shine a spotlight on achievement gaps in the U.S., and to hold schools accountable for serving all students.

    This was a laudable goal, and disaggregating data by student subgroups is a wise policy, as there is much to learn from such comparisons. Unfortunately, however, NCLB also institutionalized the poor measurement of school performance, and so-called subgroup accountability was not immune. The problem, which we’ve discussed here many times, is that test-based accountability systems in the U.S. tend to interpret how highly students score as a measure of school performance, when it is largely a function of factors out of schools' control, such as student background. In other words, schools (or subgroups of their students) may exhibit higher average scores or proficiency rates simply because their students entered the schools at higher levels, regardless of how effective the school may be in raising scores. Although NCLB’s successor, the Every Student Succeeds Act (ESSA), perpetuates many of these misinterpretations, it still represents some limited progress, as it encourages greater reliance on growth-based measures, which look at how quickly students progress while they attend a school, rather than how highly they score in any given year (see here for more on this).

    Yet this evolution, slow though it may be, presents a somewhat unique challenge for the inclusion of subgroup-based measures in formal school accountability systems. That is, if we stipulate that growth model estimates are the best available test-based way to measure school (rather than student) performance, how should accountability systems apply these models to traditionally lower scoring student subgroups?
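
    To make the distinction between status and growth measures in this excerpt concrete, here is a minimal sketch. It assumes a simple average gain score stands in for "growth"; actual growth models are regression-based and considerably more involved. The two schools and all scores are made up for illustration.

```python
from statistics import mean
from typing import List, Tuple

# Each student is a (prior-year score, current-year score) pair.
# All numbers are invented for illustration only.
Student = Tuple[int, int]


def status(students: List[Student]) -> float:
    """Average current-year score: how highly students score."""
    return mean(post for _, post in students)


def growth(students: List[Student]) -> float:
    """Average gain from the prior year: a crude stand-in for a growth model."""
    return mean(post - pre for pre, post in students)


school_a = [(80, 84), (85, 88), (90, 92)]  # students enter high, gain little
school_b = [(50, 60), (55, 66), (60, 69)]  # students enter low, gain more

for name, school in (("A", school_a), ("B", school_b)):
    print(name, "status:", status(school), "growth:", growth(school))
```

    Under these assumptions, School A looks better on status and School B looks better on growth, which is the kind of reversal the paragraph above describes.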

  • Social And Emotional Skills In School: Pivoting From Accountability To Development

    Our guest authors today are David Blazar and Matthew A. Kraft. Blazar is a Lecturer on Education and Postdoctoral Research Fellow at Harvard Graduate School of Education and Kraft is an Assistant Professor of Education and Economics at Brown University.

    With the passage of the Every Student Succeeds Act (ESSA) in December 2015, Congress required that states select a nonacademic indicator with which to assess students’ success in school and, in turn, hold schools accountable. We believe that broadening what it means to be a successful student and school is good policy. Students learn and grow in multifaceted ways, only some of which are captured by standardized achievement tests. Measures such as students’ effort, initiative, and behavior also are key indicators for their long-term success (see here). Thus, by gathering data on students’ progress on a range of measures, both academic and what we refer to as “social and emotional” development, teachers and school leaders may be better equipped to help students improve in these areas.

    In the months following the passage of ESSA, questions about the use of social and emotional skills in accountability systems have dominated the debate. What measures should districts use? Is it appropriate to use these measures in high-stakes settings if they are susceptible to potential biases and can be easily coached or manipulated? Many others have written about this important topic before us (see, for example, here, here, here, and here). Like some of them, we agree that including measures of students’ social and emotional development in accountability systems, even with very small associated weights, could serve as a strong signal that schools and educators should value and attend to developing these skills in the classroom. We also recognize concerns about the use of measures that really were developed for research purposes rather than large-scale high-stakes testing with repeated administrations.

  • A Few Reactions To The Final Teacher Preparation Accountability Regulations

    The U.S. Department of Education (USED) has just released the long-anticipated final regulations for teacher preparation (TP) program accountability. These regulations will guide states, which are required to design their own systems for assessing TP program performance for full implementation in 2018-19. The earliest year in which stakes (namely, eligibility for federal grants) will be attached to the ratings is 2021-22.

    Among the provisions receiving attention is the softening of the requirement regarding the use of test-based productivity measures, such as value-added and other growth models (see Goldhaber et al. 2013; Mihaly et al. 2013; Koedel et al. 2015). Specifically, the final regulations allow greater “flexibility” in how and how much these indicators must count toward final ratings. For the reasons that Cory Koedel and I laid out in this piece (and I will not reiterate here), this is a wise decision. Although it is possible that value-added estimates will eventually play a significant role in these TP program accountability systems, the USED timeline provides insufficient time for the requisite empirical groundwork.

    Yet this does not resolve the issues facing those who must design these systems, since putting partial brakes on value-added for TP programs also puts increased focus on the other measures which might be used to gauge program performance. And, as is often the case with formal accountability systems, the non-test-based bench is not particularly deep.

  • Building A Professional Network Of Rural Educators From Scratch

    Our guest author today is Danette Parsley, Chief Program Officer at Education Northwest, where she leads initiatives like the Northwest Rural Innovation and Student Engagement Network. To learn more about this work, check out Designing Rural School Improvement Networks: Aspirations and Actualities and Generating Opportunity and Prosperity: The Promise of Rural Education Collaboratives.

    Small rural schools draw from a deep well of assets to positively impact student experiences and outcomes. They tend to serve as central hubs within their communities, and their small size often facilitates close staff relationships, which in turn can enable moving innovative ideas into action. At the same time, rural schools face a number of challenges that differ from those of their urban and suburban counterparts.

    First, it’s extremely difficult to draw high-quality teachers to geographically disconnected, rural communities—and, when they do come, it’s hard to get them to stay. Second, it’s a challenge to connect teachers across remote and rural communities so they can share instructional practices and professional development. One way to address the challenges facing rural schools, while leveraging their inherent assets, is to establish professional networks of teacher leaders aimed at providing support that helps their colleagues succeed and encourages them to stay.

  • Economic Segregation In New York City Schools

    Although student segregation by race and ethnicity is well documented in U.S. public schools, the body of evidence on the related outcome of economic school segregation (e.g., by income) is considerably smaller (Reardon and Owens 2014).

    In general, economic segregation of students has increased nationally over the past few decades, both between districts and between schools (Owens et al. 2014). It is inevitable that these aggregate trends vary widely by state, metropolitan area, and district. We were curious as to the situation in New York City, the nation’s largest district, but were unable to find any NYC-specific results, particularly results that included different types of segregation measures.

    We therefore decided to take a quick look ourselves, using data from the NYC Department of Education. The very brief analysis below uses eligibility for subsidized lunch (free and reduced-price lunch, or FRL) as a (very) rough income proxy, and segregation is measured between district schools only (charters are not included) from 2002 to 2015. In the graph below, we characterize within-district, between-school segregation using two different and very common approaches, exposure and dissimilarity.
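
    For readers unfamiliar with the two measures named above, here is a minimal sketch of how they are commonly computed. It assumes the standard dissimilarity index and one common variant of exposure (the exposure of FRL-eligible students to non-eligible students); the post does not say which variant it uses. The data layout and the enrollment counts are hypothetical.

```python
from typing import Sequence, Tuple

# One school = (FRL-eligible students, non-eligible students).
# This layout and the counts below are assumptions for illustration.
School = Tuple[int, int]


def dissimilarity(schools: Sequence[School]) -> float:
    """Share of either group that would have to change schools for every
    school to mirror the district-wide mix (0 = even, 1 = fully segregated).
    Assumes both groups have at least one student in the district."""
    total_frl = sum(frl for frl, _ in schools)
    total_non = sum(non for _, non in schools)
    return 0.5 * sum(abs(frl / total_frl - non / total_non) for frl, non in schools)


def exposure(schools: Sequence[School]) -> float:
    """Share of non-eligible students in the school attended by the
    average FRL-eligible student. Empty schools are skipped."""
    total_frl = sum(frl for frl, _ in schools)
    return sum(
        (frl / total_frl) * (non / (frl + non))
        for frl, non in schools
        if frl + non > 0
    )


schools = [(90, 10), (50, 50), (10, 90)]  # made-up enrollment counts
print(round(dissimilarity(schools), 3))  # 0.533
print(round(exposure(schools), 3))       # 0.287
```

    The two measures answer different questions: dissimilarity describes how unevenly the groups are spread across schools, while exposure also depends on the district's overall composition.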

  • The Details Matter In Teacher Evaluations

    Throughout the process of reforming teacher evaluation systems over the past 5-10 years, perhaps the most contentious and most discussed issue was the importance, or weights, assigned to different components. Specifically, there was a great deal of debate about the proper weight to assign to test-based teacher productivity measures, such as estimates from value-added and other growth models.

    Some commentators, particularly those more enthusiastic about test-based accountability, argued that the new teacher evaluations somehow were not meaningful unless value-added or growth model estimates constituted a substantial proportion of teachers’ final evaluation ratings. Skeptics of test-based accountability, on the other hand, tended toward a rather different viewpoint – that test-based teacher performance measures should play little or no role in the new evaluation systems. Moreover, virtually all of the discussion of these systems’ results, once they were finally implemented, focused on the distribution of final ratings, particularly the proportions of teachers rated “ineffective.”

    A recent working paper by Matthew Steinberg and Matthew Kraft directly addresses and informs this debate. Their very straightforward analysis shows just how consequential these weighting decisions, as well as choices of where to set the cutpoints for final rating categories (e.g., how many points does a teacher need to be given an “effective” versus “ineffective” rating), are for the distribution of final ratings.
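
    The sensitivity described above can be illustrated with a small sketch. It is not the authors' analysis: the component names, the 1-4 scale, the weights, the cutpoints, and the four teachers are all hypothetical, chosen only to show how the same component scores can yield different rating distributions.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical component scores on a 1-4 scale for four teachers.
teachers: List[Dict[str, float]] = [
    {"observation": 3.4, "value_added": 1.6},
    {"observation": 3.0, "value_added": 2.0},
    {"observation": 2.8, "value_added": 3.2},
    {"observation": 3.6, "value_added": 2.4},
]


def final_score(components: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of component scores (weights are assumed to sum to 1)."""
    return sum(weights[name] * components[name] for name in weights)


def distribution(
    teachers: List[Dict[str, float]], weights: Dict[str, float], cutpoint: float
) -> Counter:
    """Count teachers at or above the cutpoint ("effective") versus below it."""
    return Counter(
        "effective" if final_score(t, weights) >= cutpoint else "ineffective"
        for t in teachers
    )


mostly_observation = {"observation": 0.8, "value_added": 0.2}
even_split = {"observation": 0.5, "value_added": 0.5}

print(distribution(teachers, mostly_observation, cutpoint=2.75))
print(distribution(teachers, even_split, cutpoint=2.75))
print(distribution(teachers, even_split, cutpoint=2.4))
```

    With these made-up numbers, shifting weight toward value-added moves two of the four teachers below the cutpoint, and lowering the cutpoint moves them back, even though no teacher's underlying scores have changed.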

  • An Alternative Income Measure Using Administrative Education Data

    The relationship between family background and educational outcomes is well documented and the topic, rightfully, of endless debate and discussion. A student’s background is most often measured in terms of family income (even though it is actually the factors associated with income, such as health, early childhood education, etc., that are the direct causal agents).

    Most education analyses rely on a single income/poverty indicator – i.e., whether or not students are eligible for federally-subsidized lunch (free/reduced-price lunch, or FRL). For instance, income-based achievement gaps are calculated by comparing test scores between students who are eligible for FRL and those who are not, while multivariate models almost always use FRL eligibility as a control variable. Similarly, schools and districts with relatively high FRL eligibility rates are characterized as “high poverty.” The primary advantages of FRL status are that it is simple and collected by virtually every school district in the nation (collecting actual income would not be feasible). Yet it is also a notoriously crude and noisy indicator. In addition to the fact that FRL eligibility is often called “poverty” even though the cutoff is by design 85 percent higher than the federal poverty line, FRL rates, like proficiency rates, mask a great deal of heterogeneity. The families of two students who are FRL eligible can have quite different incomes, as can the families of two students who are not eligible. As a result, FRL-based estimates such as achievement gaps might differ quite a bit from those calculated using actual family income from surveys.

    A new working paper by Michigan researchers Katherine Michelmore and Susan Dynarski presents a very clever means of obtaining a more accurate income/poverty proxy using the same administrative data that states and districts have been collecting for years.

  • Contingent Faculty At U.S. Colleges And Universities

    In a previous post, we discussed the prevalence of and trends in alternative employment arrangements, sometimes called “contingent work,” in the U.S. labor market. Contingent work refers to jobs with employment arrangements other than the “traditional” full-time model, including those held by workers with temporary contracts, independent contractors, day laborers, and part-time employees.

    Depending on how one defines this group of workers, who are a diverse group but tend to enjoy less job stability and lower compensation, they comprise anywhere between 10 and 40 percent of the U.S. workforce, and this share increased moderately between 2000 and 2010. Of course, how many contingents there are, and how this has changed over time, varies quite drastically by industry, as well as by occupation. For example, in 1990, around 28 percent of staffing services employees (sometimes called “temps”) worked in blue collar positions, while 42 percent had office jobs. By 2009, these proportions had reversed, with 41 percent of temps in blue collar jobs and 23 percent doing office work. This is a pretty striking change.

    Another industry/occupation in which there has been significant short-term change in the contingent work share is that of faculty and instructors in higher education institutions.