  • Pay Equity In Higher Education

    Blatant forms of discrimination against women in academia have diminished since the Equal Pay Act and Title IX became law in 1964 and 1972, respectively. Yet gender differences in salary, tenure status, and leadership roles persist in higher education. In particular, wage differences between male and female professors have not been fully explained, even when productivity, teaching experience, institutional size and prestige, disciplinary field, type of appointment, and family-related responsibilities are controlled for statistically (see here).
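
    To make the "controlled for statistically" idea concrete, here is a minimal sketch of the kind of wage regression behind such estimates. It is my own illustration using simulated placeholder data, not the model from any of the studies cited above: after conditioning on observables, the coefficient on a female indicator approximates the "unexplained" portion of the gap.

    ```python
    # Minimal sketch of an "unexplained gap" regression; all data are simulated.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 2000
    df = pd.DataFrame({
        "female": rng.integers(0, 2, n),          # 1 = female, 0 = male
        "experience": rng.uniform(1, 30, n),      # years of teaching experience
        "publications": rng.poisson(10, n),       # rough productivity proxy
    })
    # Hypothetical salary process with a 5% residual penalty for women.
    df["log_salary"] = (11 + 0.01 * df["experience"] + 0.005 * df["publications"]
                        - 0.05 * df["female"] + rng.normal(0, 0.1, n))

    model = smf.ols("log_salary ~ female + experience + publications", data=df).fit()
    # The 'female' coefficient is the log-point gap remaining after the controls.
    print(model.params["female"])
    ```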

    Scholars have argued that the “unexplained” gender wage gap is a function of less easily quantifiable (supply-side) factors, such as preferences and career aspirations, professional networks, and so on. In fact, there is extensive evidence that both supply-side factors (e.g., career choices) and demand-side factors (e.g., employer discrimination) are shaped by broadly shared (often implicit) schemas about what men and women can and should do (a.k.a. descriptive and prescriptive gender stereotypes – see here).

    Regardless of the causes, which are clearly complex and multi-faceted, the fact remains that the salary advantage held by male faculty over female faculty exists across institutions and has changed very little over the past twenty-five years (see here). How big is this gap, exactly?

  • Ohio's New School Rating System: Different Results, Same Flawed Methods

    Without question, designing school and district rating systems is a difficult task, and Ohio was somewhat ahead of the curve in attempting to do so (and they're also great about releasing a ton of data every year). As part of its application for ESEA waivers, the state recently announced a newly-designed version of its long-standing system, with the changes slated to go into effect in 2014-15. State officials told reporters that the new scheme is a “more accurate reflection of … true [school and district] quality."

    In reality, however, despite its best intentions, Ohio has perpetuated a troubled system, making less-than-substantive changes whose primary purpose seems to be giving lower grades to more schools so that the results square with preconceptions about the distribution of “true quality." It’s not a better system in terms of measurement - both the new and old schemes consist of mostly the same inappropriate components, and the ratings differentiate schools largely by student characteristics rather than school performance.

    So, whether or not the aggregate results seem more plausible is not particularly important, since the manner in which they're calculated is still deeply flawed. And demonstrating this is very easy.
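
    One simple way to demonstrate it is to correlate the state's composite rating with the share of economically disadvantaged students in each building. The sketch below is hedged: it is not the analysis from the post, and the file and column names are hypothetical placeholders standing in for Ohio's publicly released data.

    ```python
    import pandas as pd

    # Hypothetical export of the state's building-level data files.
    df = pd.read_csv("ohio_school_ratings.csv")

    # Assumed column names: 'performance_index' (the state's composite score)
    # and 'pct_econ_disadvantaged' (share of low-income students).
    corr = df["performance_index"].corr(df["pct_econ_disadvantaged"])
    print(f"Rating vs. % economically disadvantaged: r = {corr:.2f}")

    # A strong negative correlation would indicate the ratings sort schools
    # largely by whom they enroll rather than by how well they perform.
    ```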

  • An Uncertain Time For One In Five Female Workers

    It’s well-known that patterns of occupational sex segregation in the labor market – the degree to which men and women are concentrated in certain occupations – have changed quite a bit over the past few decades, along with the rise of female labor force participation.

    Nevertheless, this phenomenon is still a persistent feature of the U.S. labor market (and of labor markets in other nations as well). There are many reasons for this – institutional, cultural, and historical. But it’s interesting to take a quick look at a few specific groups, as there are implications for our current policy environment.

    The simple graph below presents the proportion of all working men and women who fall into three different occupational groups. The data are from the Bureau of Labor Statistics and apply to 2011.

  • If Your Evidence Is Changes In Proficiency Rates, You Probably Don't Have Much Evidence

    Education policymaking and debates are under constant threat from an improbable assailant: Short-term changes in cross-sectional proficiency rates.

    The use of rate changes is still proliferating rapidly at all levels of our education system. These measures, which play an important role in the provisions of No Child Left Behind, are already prominent components of many states’ core accountability systems (e.g., California), while several others will be using some version of them in their new, high-stakes school/district “grading systems." New York State is awarding millions in competitive grants, with almost half the criteria based on rate changes. District consultants issue reports recommending widespread school closures and reconstitutions based on these measures. And, most recently, U.S. Secretary of Education Arne Duncan used proficiency rate increases as “preliminary evidence” supporting the School Improvement Grants program.

    Meanwhile, on the public discourse front, district officials and national leaders use rate changes to “prove” that their preferred reforms are working (or are needed), while their critics argue the opposite. Similarly, entire charter school sectors are judged, up or down, by whether their raw, unadjusted rates increase or decrease.

    So, what’s the problem? In short, it’s that year-to-year changes in proficiency rates are not valid evidence of school or policy effects - most basically because each year's rate is calculated for a different group of students, and because proficiency rates ignore how far above or below the cutoff those students score. These measures cannot do the job we’re asking them to do, even on a limited basis. This really has to stop.
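
    A quick simulation makes the cohort point concrete. This is my own illustration with assumed numbers, not data from any actual school: even when a school's underlying performance never changes, each year's rate is computed on a different cohort, so it will bounce around from sampling variation alone, especially when cohorts are small.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    cohort_size = 60          # one tested grade in a small school
    true_proficiency = 0.65   # unchanging underlying probability of proficiency
    years = 10

    # Each year, a fresh cohort of students is tested.
    rates = [rng.binomial(cohort_size, true_proficiency) / cohort_size
             for _ in range(years)]
    changes = np.diff(rates) * 100  # year-to-year change in percentage points

    print("Rates:", [f"{r:.0%}" for r in rates])
    print("Largest single-year swing: %.1f percentage points" % np.abs(changes).max())
    ```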

  • The Uses (And Abuses?) Of Student Data

    Knewton, a technology firm founded in 2008, has developed an “adaptive learning platform” that received significant media attention (also here, here, here and here), as well as funding and recognition early last fall and, again, in February this year (here and here). Although the firm is not alone in the adaptive learning game – e.g., Dreambox, Carnegie Learning – Knewton’s partnership with Pearson puts the company in a whole different league.

    Adaptive learning takes advantage of student-generated information; thus, important questions about data use and ownership need to be brought to the forefront of the technology debate.

    Adaptive learning software adjusts the presentation of educational content to students' needs, based on students’ prior responses to that content. In the world of research, such ‘prior responses’ would count as data and be treated as such. To the extent that adaptive learning is a mechanism for collecting information about learners, questions about privacy, confidentiality, and ownership should be addressed.
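
    To make the mechanics concrete, here is a schematic sketch of the basic adaptive loop - my simplification, not Knewton's or any vendor's actual system. Each response is logged and used to choose the next item, and that growing response log is precisely the student-generated data the privacy and ownership questions concern.

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class AdaptiveSession:
        difficulty: float = 0.5                             # current estimate of the right level
        response_log: list = field(default_factory=list)    # student-generated data

        def next_item(self) -> float:
            # Serve content at (or near) the current difficulty estimate.
            return self.difficulty

        def record(self, correct: bool) -> None:
            # Log the response, then nudge the difficulty estimate up or down.
            self.response_log.append({"difficulty": self.difficulty, "correct": correct})
            self.difficulty = min(max(self.difficulty + (0.05 if correct else -0.05), 0.0), 1.0)

    # Four hypothetical responses from one student.
    session = AdaptiveSession()
    for correct in [True, True, False, True]:
        session.next_item()
        session.record(correct)

    print(len(session.response_log), "responses retained -- this is the data at issue.")
    ```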

  • Learning From Teach For America

    There is a small but growing body of evidence about the (usually test-based) effectiveness of teachers from Teach for America (TFA), an extremely selective program that trains and places new teachers in mostly higher-needs schools and districts. Rather than review this literature paper-by-paper, which has already been done by others (see here and here), I’ll just give you the super-short summary of the higher-quality analyses, and quickly discuss what I think it means.*

    The evidence on TFA teachers focuses mostly on comparing their effect on test score growth vis-à-vis other groups of teachers who entered the profession via traditional certification (or through other alternative routes). This is no easy task, and the findings do vary quite a bit by study, as well as by the group to which TFA corps members are compared (e.g., new or more experienced teachers). One can quibble endlessly over the methodological details (and I’m all for that), and this area is still underdeveloped, but a fair summary of these papers is that TFA teachers are no more or less effective than comparable peers in terms of reading tests, and sometimes but not always more effective in math (the differences, whether positive or negative, tend to be small and/or only surface after 2-3 years). Overall, the evidence thus far suggests that TFA teachers perform comparably, at least in terms of test-based outcomes.

    Somewhat in contrast with these findings, TFA has been the subject of both intensive criticism and fawning praise. I don’t want to engage this debate directly, except to say that there has to be some middle ground on which a program that brings talented young people into the field of education is not such a divisive issue. I do, however, want to make a wider point specifically about the evidence on TFA teachers – what it might suggest about the current push to “attract the best people” to the profession.

  • Beware Of Anecdotes In The Value-Added Debate

    A recent New York Times "teacher diary" presents the compelling account of a New York City teacher whose value-added rating was in the 6th percentile in 2009 – one of the lowest scores in the city – and in the 96th percentile the following year, one of the highest. Similar articles - for example, about teachers with errors in their rosters or scores that conflict with their colleagues' or principals' opinions - have been published since the release of the city’s teacher data reports (also see here). These accounts provoke a lot of outrage and disbelief, and that makes sense – they can sound absurd.

    Stories like these can be useful as illustrations of larger trends and issues - in this case, of the unfairness of publishing the NYC scores, most of which are based on samples too small to provide meaningful information. But, in the debate over using these estimates in actual policy, we need to be careful not to focus too much on anecdotes. For every one NYC teacher whose value-added rank changed by more than 90 points between 2009 and 2010, there are almost 100 teachers whose ranks changed by fewer than 10 points (and percentile ranks overstate the actual size of all these differences). Moreover, even if the models yielded perfect measures of test-based teacher performance, there would still be many implausible fluctuations between years - ones unlikely to reflect "real" change - due to nothing more than random error.*
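
    To illustrate that last point, here is a rough simulation - my own, with an assumed error variance roughly in line with the modest year-to-year stability of these estimates, not the city's actual model. Hold every teacher's "true" performance fixed, add random error, and see how much percentile ranks move between two years.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n_teachers = 10_000
    true_effect = rng.normal(0, 1, n_teachers)   # fixed "true" performance

    # Assumption: error is about as large as the spread in true effects.
    year1 = true_effect + rng.normal(0, 1, n_teachers)
    year2 = true_effect + rng.normal(0, 1, n_teachers)

    def pct_rank(x):
        # Percentile rank: position in the sorted order, scaled to 1-100.
        return 100 * (x.argsort().argsort() + 1) / len(x)

    swings = np.abs(pct_rank(year1) - pct_rank(year2))
    print("Median swing: %.0f percentile points" % np.median(swings))
    print("Share swinging 30+ points: %.0f%%" % (100 * (swings >= 30).mean()))
    ```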

    The reliability of value-added estimates, like that of all performance measures (including classroom observations), is an important issue, and is sometimes dismissed by supporters in a cavalier fashion. There are serious concerns here, and no absolute answers. But none of this can be examined or addressed with anecdotes.

  • Technology In Education: An Answer In Search Of A Problem?

    In a recent blog post, Larry Cuban muses about the enthusiasm of some superintendents, school board members, parents, and pundits for expensive new technologies, such as “iPads, tablets, and 1:1 laptops."

    Without any clear evidence, they spend massively on the newest technology, expecting that “these devices will motivate students to work harder, gain more knowledge and skills, and be engaged in schooling." They believe such devices can help students develop the skills they will need in a 21st century labor market—and hope they will somehow help to narrow the achievement gap that has been widening between rich and poor.

    But, argues Cuban, for those school leaders “who want to provide credible answers to the inevitable question that decision-makers ask about the effectiveness of new devices, they might consider a prior question. What is the pressing or important problem to which an iPad is the solution?"

    Good question. But is it good enough? I am not so sure. It still implicitly assumes that an iPad must be a solution to something in education.

  • Dispatches From The Nexus Of Bad Research And Bad Journalism

    In a recent story, the New York Daily News uses the recently-released teacher data reports (TDRs) to “prove” that the city’s charter school teachers are better than their counterparts in regular public schools. The headline announces boldly: New York City charter schools have a higher percentage of better teachers than public schools (it has since been changed to: "Charters outshine public schools").

    Taking things even further, within the article itself, the reporters note, “The newly released records indicate charters have higher performing teachers than regular public schools."

    So, not only are they equating words like “better” with value-added scores, but they’re obviously comfortable drawing conclusions about these traits based on the TDR data.

    The article is a pretty remarkable display of both poor journalism and poor research. The reporters not only attempted to do something they couldn’t do, but they did it badly to boot. It’s unfortunate to have to waste one’s time addressing this kind of thing, but, no matter your opinion on charter schools, it's a good example of how not to use the data that the Daily News and other newspapers released to the public.

  • The Charter School Authorization Theory

    Anyone who wants to start a charter school must of course receive permission, and there are laws and policies governing how such permission is granted. In some states, multiple entities (mostly districts) serve as charter authorizers, whereas in others, there is only one or very few. For example, in California there are almost 300 entities that can authorize schools, almost all of them school districts. In contrast, in Arizona, a state board makes all the decisions.

    The conventional wisdom among many charter advocates is that the performance of charter schools depends a great deal on the “quality” of authorization policies – how those who grant (or don’t renew) charters make their decisions. This is often the response when supporters are confronted with the fact that charter results are varied but tend to be, on average, no better or worse than those of regular public schools. They argue that some authorization policies are better than others, i.e., bad processes allow some poorly-designed schools to start, while failing to close others.

    This argument makes sense on the surface, but there seems to be scant evidence on whether and how authorization policies influence charter performance. From that perspective, the authorizer argument might seem a bit like a tautology – i.e., there are bad schools because authorizers allow bad schools to open, and fail to close them. As I am not particularly well-versed in this area, I thought I would look into it a little bit.