Teacher Evaluation

  • Why Teacher Evaluation Reform Is Not A Failure

    Written on August 23, 2018

    The RAND Corporation recently released an important report on the impact of the Gates Foundation’s “Intensive Partnerships for Effective Teaching” (IPET) initiative. IPET was a very thorough and well-funded attempt to improve teaching quality in schools in three districts and four charter management organizations (CMOs). The initiative was multi-faceted, but its centerpiece was the implementation of multi-measure teacher evaluation systems and the linking of ratings from those systems to professional development and high stakes personnel decisions, including compensation, tenure, and dismissal. This policy, particularly the inclusion in teacher evaluations of test-based productivity measures (e.g., value-added scores), has been among the most controversial issues in education policy throughout the past 10 years.

    The report is extremely rich and contains many interesting findings, so I would encourage everyone to read it for themselves (at least the executive summary). The headline finding, however, was that IPET had no discernible effect on student outcomes, namely test scores and graduation rates, in the districts that participated, vis-à-vis similar districts that did not. Given that IPET was so thoroughly designed and implemented, and so well-funded, it can potentially be viewed as a "best case scenario" test of the type of evaluation reform that most states have enacted. Accordingly, critics of these reforms, who typically focus their opposition on the high stakes use of evaluation measures, particularly value-added and other test-based measures, have portrayed the findings as vindication of their opposition.

    This reaction has merit. The most important reason is that evaluation reform was portrayed by its advocates as a means to immediate and drastic improvements in student outcomes. This promise was misguided from the outset, and evaluation reform opponents are (and were) correct to point this out. At the same time, however, it would be wise not to dismiss evaluation reform as a whole, for several reasons, a few of which are discussed below.

    READ MORE
  • What Happened To Teacher Quality?

    Written on March 15, 2018

    Starting around 2005 and up until a few years ago, education policy discourse and policymaking were dominated by the issue of improving “teacher quality.” We have heard much less about it over the past couple of years. One of the major reasons is that the vast majority of states have enacted policies ostensibly designed to improve teacher quality.

    Thanks in no small part to the Race to the Top grant program, and the subsequent ESEA waiver program, virtually all states reformed their teacher evaluation systems, the “flagship” policy of the teacher quality push. Many of these states also tied their new evaluation results to high stakes personnel decisions, such as granting tenure, dismissals, layoffs, and compensation. Predictably, the details of these new systems vary quite a bit, both within and between states. Many advocates are unsatisfied with how the new policies were designed, and one could write a book on all the different issues. Yet it would be tough to deny that this national policy effort was among the fastest shifts in recent educational history, particularly given the controversy surrounding it.

    So, what happened to all the attention to teacher quality? It was put into practice. The evidence on its effects is already emerging, but this will take a while, and so it is still a quiet time in teacher quality land, at least compared to the previous 5-7 years. Even so, there are already many lessons out there, too many for a post. Looking back, though, one big picture lesson – and definitely not a new one – is about how the evaluation reform effort stands out (in a very competitive field) for the degree to which it was driven by the promise of immediate, large results.

    READ MORE
  • Teacher Evaluations And Turnover In Houston

    Written on March 30, 2017

    We are now entering a time period in which we might start to see a lot of studies released about the impact of new teacher evaluations. This incredibly rapid policy shift, perhaps the centerpiece of the Obama Administration’s education efforts, was sold based on illustrations of the importance of teacher quality.

    The basic argument was that teacher effectiveness is perhaps the most important factor under schools’ control, and the best way to improve that effectiveness was to identify and remove ineffective teachers via new teacher evaluations. Without question, there was a logic to this approach, but dismissing or compelling the exits of low performing teachers does not occur in a vacuum. Even if a given policy causes more low performers to exit, the effects of this shift can be attenuated by turnover among higher performers, not to mention other important factors, such as the quality of applicants (Adnot et al. 2016).

    A new NBER working paper by Julie Berry Cullen, Cory Koedel, and Eric Parsons addresses this dynamic directly by looking at the impact on turnover of a new evaluation system in Houston, Texas. It is an important piece of early evidence on one new evaluation system, but the results also speak more broadly to how these systems work.

    READ MORE
  • New Teacher Evaluations And Teacher Job Satisfaction

    Written on February 15, 2017

    Job satisfaction among teachers is a perennially popular topic of conversation in education policy circles. There is good reason for this. For example, whether or not teachers are satisfied with their work has been linked to their likelihood of changing schools or professions (e.g., Ingersoll 2001).

    Yet much of the discussion of teacher satisfaction consists of advocates’ speculation that their policy preferences will make for a more rewarding profession, whereas opponents’ policies are sure to disillusion masses of educators. This was certainly true of the debate surrounding the rapid wave of teacher evaluation reform over the past ten or so years.

    A paper just published in the American Educational Research Journal directly addresses the impact of new evaluation systems on teacher job satisfaction. It is not only among the first analyses to examine the impact of these systems, but also the first to look at their effect on teachers’ attitudes.

    READ MORE
  • Social And Emotional Skills In School: Pivoting From Accountability To Development

    Written on October 25, 2016

    Our guest authors today are David Blazar and Matthew A. Kraft. Blazar is a Lecturer on Education and Postdoctoral Research Fellow at Harvard Graduate School of Education and Kraft is an Assistant Professor of Education and Economics at Brown University.

    With the passage of the Every Student Succeeds Act (ESSA) in December 2015, Congress required that states select a nonacademic indicator with which to assess students’ success in school and, in turn, hold schools accountable. We believe that broadening what it means to be a successful student and school is good policy. Students learn and grow in multifaceted ways, only some of which are captured by standardized achievement tests. Measures such as students’ effort, initiative, and behavior also are key indicators for their long-term success (see here). Thus, by gathering data on students’ progress on a range of measures, both academic and what we refer to as “social and emotional” development, teachers and school leaders may be better equipped to help students improve in these areas.

    In the months following the passage of ESSA, questions about the use of social and emotional skills in accountability systems have dominated the debate. What measures should districts use? Is it appropriate to use these measures in high-stakes settings if they are susceptible to potential biases and can be easily coached or manipulated? Many others have written about this important topic before us (see, for example, here, here, here, and here). Like some of them, we agree that including measures of students’ social and emotional development in accountability systems, even with very small associated weights, could serve as a strong signal that schools and educators should value and attend to developing these skills in the classroom. We also recognize concerns about the use of measures that really were developed for research purposes rather than large-scale high-stakes testing with repeated administrations.

    READ MORE
  • The Details Matter In Teacher Evaluations

    Written on September 22, 2016

    Throughout the process of reforming teacher evaluation systems over the past 5-10 years, perhaps the most contentious and widely discussed issue was the importance, or weights, assigned to different components. Specifically, there was a great deal of debate about the proper weight to assign to test-based teacher productivity measures, such as estimates from value-added and other growth models.

    Some commentators, particularly those more enthusiastic about test-based accountability, argued that the new teacher evaluations somehow were not meaningful unless value-added or growth model estimates constituted a substantial proportion of teachers’ final evaluation ratings. Skeptics of test-based accountability, on the other hand, tended toward a rather different viewpoint – that test-based teacher performance measures should play little or no role in the new evaluation systems. Moreover, virtually all of the discussion of these systems’ results, once they were finally implemented, focused on the distribution of final ratings, particularly the proportions of teachers rated “ineffective.”

    A recent working paper by Matthew Steinberg and Matthew Kraft directly addresses and informs this debate. Their very straightforward analysis shows just how consequential these weighting decisions, along with choices about where to set the cutpoints between final rating categories (e.g., how many points a teacher needs to receive an “effective” versus an “ineffective” rating), are for the distribution of final ratings.

    READ MORE
  • Teachers' Opinions Of Teacher Evaluation Systems

    Written on June 17, 2016

    The primary test of the new teacher evaluation systems implemented throughout the nation over the past 5-10 years is whether they improve teacher and ultimately student performance. Although the kinds of policy evaluations that will address these critical questions are just beginning to surface (e.g., Dee and Wyckoff 2015), among the most important early indicators of how well the new systems are working is their credibility among educators. Put simply, if teachers and administrators don’t believe in the systems, they are unlikely to respond productively to them.

    A new report from the Institute of Education Sciences (IES) provides a useful snapshot of teachers’ opinions of their evaluation systems using a nationally representative survey. It is important to bear in mind that the data are from the 2011-12 Schools and Staffing Survey (SASS) and the 2012-13 Teacher Follow-up Survey, a time in which most of the new evaluations in force today were either still on the drawing board, or in their first year or two of implementation. But the results reported by IES might still serve as a useful baseline going forward.

    The primary outcome in this particular analysis is a survey item querying whether teachers were “satisfied” with their evaluation process. And almost four in five respondents either strongly or somewhat agreed that they were satisfied with their evaluation. Of course, satisfaction with an evaluation system does not necessarily signal anything about its potential to improve or capture teacher performance, but it certainly tells us something about teachers’ overall views of how they are evaluated.

    READ MORE
  • Getting Serious About Measuring Collaborative Teacher Practice

    Written on April 8, 2016

    Our guest author today is Nathan D. Jones, an assistant professor of special education at Boston University. His research focuses on teacher quality, teacher development, and school improvement. Dr. Jones previously worked as a middle school special education teacher in the Mississippi Delta. In this column, he introduces a new Albert Shanker Institute publication, which was written with colleagues Elizabeth Bettini and Mary Brownell.

    The current policy landscape presents a dilemma. Teacher evaluation has dominated recent state and local reform efforts, resulting in broad changes in teacher evaluation systems nationwide. The reforms have spawned countless research studies on whether emerging evaluation systems use measures that are reliable and valid, whether they result in changes in how teachers are rated, what happens to teachers who receive particularly high or low ratings, and whether the net results of these changes have had an effect on student learning.

    At the same time, there has been increasing enthusiasm about the promise of teacher collaboration (see here and here), spurred in part by new empirical evidence linking teacher collaboration to student outcomes (see Goddard et al., 2007; Ronfeldt, 2015; Sun, Grissom, & Loeb, 2016). When teachers work together, such as when they jointly analyze student achievement data (Gallimore et al., 2009; Saunders, Goldenberg, & Gallimore, 2009) or when high-performing teachers are matched with low-performing peers (Papay, Taylor, Tyler, & Laski, 2016), students have shown substantially better growth on standardized tests.

    This new work adds to a long line of descriptive research on the importance of colleagues and other social aspects of the school organization. Research has documented that informal relationships with colleagues play an important role in promoting positive teacher outcomes, such as planned and actual retention decisions (e.g., Bryk & Schneider, 2002; Pogodzinski, Youngs, & Frank, 2013; Youngs, Pogodzinski, Grogan, & Perrone, 2015). Further, a number of initiatives aimed at improving teacher learning – e.g., professional learning communities (Giles & Hargreaves, 2006) and lesson study (Lewis, Perry, & Murata, 2006) – rely on teachers planning instruction collaboratively.

    READ MORE
  • Evaluating The Results Of New Teacher Evaluation Systems

    Written on March 24, 2016

    A new working paper by researchers Matthew Kraft and Allison Gilmour presents a useful summary of teacher evaluation results in 19 states, all of which designed and implemented new evaluation systems at some point over the past five years. As with previous evaluation results, the headline finding of this paper is that only a small proportion of teachers (2-5 percent) were given low, “below proficiency” ratings under the new systems, while the vast majority continue to be rated as satisfactory or better.

    Kraft and Gilmour present their results in the context of the “Widget Effect,” a well-known 2009 report by The New Teacher Project showing that the overwhelming majority of teachers in the 12 districts for which they had data received “satisfactory” ratings. The more recent results from Kraft and Gilmour indicate that this hasn’t changed much due to the adoption of new evaluation systems, or, at least, not enough to satisfy some policymakers and commentators who read the paper.

    The paper also presents a set of findings from surveys of and interviews with observers (e.g., principals). These are, in many respects, the more interesting and important results from a research and policy perspective, but let’s nevertheless focus for a moment on the findings on the distribution of teachers across rating categories, as they caused a bit of a stir. I have several comments to make about them, but will concentrate on three in particular (all of which, by the way, pertain not to the paper’s discussion, which is cautious and thorough, but rather to some of the reaction to it in our education policy discourse).

    READ MORE
  • Student Sorting And Teacher Classroom Observations

    Written on February 25, 2016

    Although value added and other growth models tend to be the focus of debates surrounding new teacher evaluation systems, the widely known but frequently unacknowledged reality is that most teachers don’t teach in the tested grades and subjects, and won’t even receive these test-based scores. The quality and impact of the new systems therefore will depend heavily upon the quality and impact of other measures, primarily classroom observations.

    These systems have been in use for decades, and yet, until recently, relatively little was known about their properties, such as their association with student and teacher characteristics, and there are still only a handful of studies of their impact on teachers’ performance (e.g., Taylor and Tyler 2012). The Measures of Effective Teaching (MET) Project, conducted a few years ago, was a huge step forward in this area, though at the time it was perhaps underappreciated that MET’s contribution lay not just in the (very important) reports it produced, but also in the extensive dataset it collected for researchers to use going forward. A new paper, just published in Educational Evaluation and Policy Analysis, is among the many analyses that have used and will use MET data to address important questions surrounding teacher evaluation.

    The authors, Rachel Garrett and Matthew Steinberg, look at classroom observation scores, specifically those from Charlotte Danielson’s widely employed Framework for Teaching (FFT) protocol. Their results are yet another example of how observation scores are subject to many of the widely cited (statistical) criticisms of value-added scores, most notably sensitivity to which students are assigned to teachers.

    READ MORE
