Teacher performance, compensation and measurement

One of the core questions in education is how to properly measure teacher performance and structure personnel and compensation plans accordingly. Among the controversial methods are teacher evaluation systems that, to a certain extent, factor in student test scores in ratings of teacher performance. Such data-driven systems have been proposed across the country, sometimes resulting in contentious battles with teachers’ unions. Critics of such reforms point out the shortcomings of testing and also voice concerns that lower-income and minority schools will be put at a disadvantage as talented educators and students move to wealthier districts and charter schools.

Several concurrent societal trends are raising the stakes. These factors make the issue of teacher performance difficult to fit into an easy narrative, as local education policy debates are inextricably tied to larger political and economic patterns. A 2012 study in the American Educational Research Journal, “How Teacher Turnover Harms Student Achievement,” outlines the extent to which teacher turnover, the phenomenon of “quick exits” by educators, hurts young people, particularly low-performing and African-American students.

Research has shown that teachers can make a huge difference in the lives and life outcomes of children. Teaching has also become more important as U.S. policymakers focus on readying students for an increasingly competitive global economy. Research continues to raise serious questions about America’s comparative educational outcomes; growth trends in U.S. performance have remained modest relative to other countries. In addition, the use of testing instruments since the passage of No Child Left Behind has shown only mixed success, as a 2011 report from the National Research Council suggests.

In this era of bare municipal coffers, school administrators are often driven to assess teacher value to ensure maximum return on investment; the issue of pay for performance is tied to this dynamic. At the same time, rising wage inequality has affected workers across the American labor market, partly because union membership has diminished substantially since the mid-1970s. Research shows that the decline of organized labor explains one third of the increase in wage inequality among men and one fifth of increased wage inequality among women.

The following are recent studies that address issues of teacher compensation, measurement and performance.

______

“Accountability Under Constraint: The Relationship Between Collective Bargaining Agreements and California Schools’ and Districts’ Performance Under No Child Left Behind”
Strunk, Katharine O.; McEachin, Andrew. American Educational Research Journal, August 2011, Vol. 48, No. 4, 871-903. doi: 10.3102/0002831211401006.

Findings: “Contract rigidity is more associated with reduced outcomes in schools and districts with characteristics that proxy less desirable working conditions. We find no evidence that suggests that stronger contracts are specifically associated with lower proficiency and graduation rates for most NCLB-classified subgroups, including Black, White, low-income, special needs, and English language learner groups. We do find some evidence that contract restrictiveness is associated with lower proficiency and graduation rates for Hispanic students…. It may be something about the kinds of districts with higher proportions of low-income and minority students that interacts with contract rigidity to reduce student achievement and schools’ and districts’ abilities to meet NCLB standards. In addition… the amount of trust or goodwill between the district administration and the teachers’ union or teaching staff may impact both contract strength and student outcomes.”

“An Ocean of Unknowns: Risks and Opportunities in Using Student Achievement Data to Evaluate PreK-3rd Grade Teachers”
Bornfreund, Laura. Education Policy Program, New America Foundation. May 15, 2013.

Abstract: “Identifying approaches to measure student growth in the PreK-3rd grades is complex. There are distinct challenges for this set of teachers. The developmental growth of children in the early grades is directly linked to their academic growth. The paper-and-pencil tests used with older kids will not work with children ages three through eight. And measures of literacy and numeracy alone do not allow for a full picture of a young child’s learning or his teacher’s impact—in fact, many experts would argue this is also the case for grades 3-12. Yet we cannot forget the potential of new teacher evaluation systems: to improve teaching and learning by ensuring that every child has access to an effective teacher, to identify those teachers who are already helping children to achieve, and to provide constructive feedback and new courses of action for teachers who are not.”

“Identifying Effective Classroom Practices Using Student Achievement Data”
Kane, Thomas J.; Taylor, Eric S.; Tyler, John H.; Wooten, Amy L. NBER working paper No. 15803, March 2010.

Abstract: “Recent research has confirmed both the importance of teachers in producing student achievement growth and in the variability across teachers in the ability to do that. Such findings raise the stakes on our ability to identify effective teachers and teaching practices. This paper combines information from classroom-based observations and measures of teachers’ ability to improve student achievement as a step toward addressing these challenges. We find that classroom based measures of teaching effectiveness are related in substantial ways to student achievement growth. Our results point to the promise of teacher evaluation systems that would use information from both classroom observations and student test scores to identify effective teachers. Our results also offer information on the types of practices that are most effective at raising achievement.”

“Individual Teacher Incentives and Student Performance”
Figlio, David N.; Kenny, Lawrence W. Journal of Public Economics, June 2007, Vol. 91, No. 5-6, 901-914. doi:10.1016/j.jpubeco.2006.10.001.

Abstract: “This paper is the first to systematically document the relationship between individual teacher performance incentives and student achievement using the United States data. We combine data from the National Education Longitudinal Survey on schools, students, and their families with our own survey conducted in 2000 regarding the use of teacher incentives. This survey on teacher incentives has unique data on frequency and magnitude of merit raises and bonuses, teacher evaluation, and teacher termination. We find that test scores are higher in schools that offer individual financial incentives for good performance. Moreover, the estimated relationship between the presence of merit pay in teacher compensation and student test scores is strongest in schools that may have the least parental oversight. The association between teacher incentives and student performance could be due to better schools adopting teacher incentives or to teacher incentives eliciting more effort from teachers; it is impossible to rule out the former explanation with our cross sectional data.”

“Implicit Measurement of Teacher Quality: Using Performance on the Job to Inform Teacher Tenure Decisions”
Goldhaber, Dan; Hansen, Michael. American Economic Review: Papers & Proceedings, May 2010, Vol. 100, No. 2, 250-255.

Findings: “In this paper we explore the potential for using value added model (VAM) estimates as the primary criteria for rewarding teachers with tenure, a policy reform currently under consideration…. The evidence that observable teacher characteristics are only weakly related to teacher productivity makes effective teacher quality policies elusive, and has led some to call for using more direct measures of teacher performance to determine employment eligibility (or compensation). The evidence presented here shows that VAM measures of teacher effectiveness are stable enough that early career estimates of teacher effectiveness predict student achievement at least three years later, and do so far better than observable teacher characteristics. This finding lends credence to the notion that these implicit measures of teacher quality are a reasonable metric to use as a factor in making substantive personnel decisions.”

“Capturing the Dimensions of Effective Teaching”
Kane, Thomas J. Education Next, Fall 2012, Vol. 12, No. 1.

Findings: “All the measures are flawed in some way. Test-based student-achievement gains have predictive power but provide little insight into a teacher’s particular strengths and weaknesses. Classroom observations require multiple observations by multiple observers in order to provide a reliable image of a teacher’s practice. The student surveys, while being the most consistent of the three across different classrooms taught by the same teacher, were less predictive of student achievement gains than the achievement-gain measures themselves. Fortunately, the evaluation methods are stronger as a team than as individuals. First, combining them generates less volatility from course section to section or year to year, and greater predictive power…. A second reason to combine the measures is to reduce the risk of unintended consequences, to lessen the likelihood of manipulation or ‘gaming.’ Whenever one places all the stakes on any single measure, the risk of distortion and abuse goes up. For instance, if all the weight were placed on student test scores, then the risk of narrowing of the curriculum or cheating would rise. If all the weight were placed on student surveys (as happens in higher education), then instructors would be tempted to pander to students and students might be more drawn to play pranks on their teachers. If all the weight were placed on classroom observations, then instructors would be tempted to go through the motions of effective practice on the day of an observation but not on other days…. The use of multiple measures not only spreads the risk but also provides opportunities to detect manipulation or gaming. For example, if a teacher is spending a disproportionate amount of class time drilling children for the state assessments, a school system can protect itself by adding a question on test-preparation activities to the student survey. If a teacher behaves unusually on the day of the observation, then the student surveys and achievement gains may tell a different story.”

“Cross-Country Evidence on Teacher Performance Pay”
Woessmann, Ludger. Economics of Education Review, June 2011, Vol. 30, No. 3, 404-418. doi:10.1016/j.econedurev.2010.12.008.

Abstract: “The general-equilibrium effects of performance-related teacher pay include long-term incentive and teacher-sorting mechanisms that usually elude experimental studies but are captured in cross-country comparisons. Combining country-level performance-pay measures with rich PISA-2003 international achievement micro data, this paper estimates student-level international education production functions. The use of teacher salary adjustments for outstanding performance is significantly associated with math, science, and reading achievement across countries. Scores in countries with performance-related pay are about one quarter standard deviations higher. Results avoid bias from within-country selection and are robust to continental fixed effects and to controlling for non-performance based forms of teacher salary adjustments.”

“Enhancing the Efficacy of Teacher Incentives through Loss Aversion: A Field Experiment”
Fryer, Jr., Roland G.; Levitt, Steven D.; List, John; Sadoff, Sally. NBER working paper 18237, July 2012.

Abstract: “Domestic attempts to use financial incentives for teachers to increase student achievement have been ineffective. In this paper, we demonstrate that exploiting the power of loss aversion — teachers are paid in advance and asked to give back the money if their students do not improve sufficiently — increase math test scores…. A second treatment arm, identical to the loss aversion treatment implemented in the standard fashion, yields smaller and statistically insignificant results. This suggests it is loss aversion, rather than other features of the design or population sampled, that leads to the stark differences between our findings and past research.”

“If You Pay Peanuts Do You Get Monkeys? A Cross-Country Analysis of Teacher Pay and Pupil Performance”
Dolton, Peter; Marcenaro-Gutierrez, Oscar D. Economic Policy, January 2011, Vol. 26, No. 65, 5-55. doi: 10.1111/j.1468-0327.2010.00257.x.

Abstract: “This paper considers the determinants of teachers’ salaries across countries and examines the relationship between the real (and relative) level of teacher remuneration and the (internationally) comparable measured performance of secondary school pupils. We use aggregate panel data on 39 countries published by the OECD to model this association. Our results suggest that recruiting higher-ability individuals into teaching and permitting scope for quicker salary advancement will have a positive effect on pupil outcomes.”

“Skills, Productivity and the Evaluation of Teacher Performance”
Sass, Tim R.; Harris, Doug. Andrew Young School of Policy Studies Research Paper Series, No. 12-11, March 2012.

Abstract: “We examine the measurement and prediction of worker productivity using a sample of teachers and school principals. We find that principals’ evaluations are positively associated with teachers’ estimated contributions to students’ test scores (value-added), and are better predictors of teacher value-added than are teacher credentials. Principals’ assessments of teachers’ cognitive and non-cognitive skills are strongly associated with principals’ overall teacher evaluations and to a lesser extent with teacher value-added. While past teacher value-added predicts future value-added, principals’ subjective ratings can provide additional information, particularly when prior value-added measures are based on a single year of teacher performance.”

“Can Districts Keep Good Teachers in the Schools that Need Them Most?”
Guarino, Cassandra M.; Brown, Abigail B.; Wyse, Adam E. Economics of Education Review, October 2011, Vol. 30, No. 5, 962-979. doi:10.1016/j.econedurev.2011.04.001.

Abstract: “This study investigates how school demographics and their interactions with policies affect the mobility behaviors of public school teachers with various human capital characteristics. Using data from North Carolina from 1995 to 2006, it finds that teachers’ career stage and human capital investments dominate their decisions to leave public school teaching and school demographic characteristics play a dominant role in intra-system sorting. Schools serving at-risk children struggle to attract and retain teachers with desirable observable characteristics. We find evidence to suggest that across-the-board school-based pay-for-performance policies have small but significant associations with mobility decisions and appear to exacerbate inequities in the distribution of teacher qualifications.”

“Teacher Employment Patterns and Student Results in Charlotte Mecklenburg Schools”
Strategic Data Project, Center for Education Policy Research, Harvard University, February 2010.

Findings: “In 2008-2009, 16% of newly hired teachers have a hire date after the school year begins. Late hires are more likely to be novices than experienced teachers. Late hiring spikes in January. Late hires perform less well than those hired before the school year starts. [That] trend persists for years after initial hiring. The late hire performance gap is about a third of the magnitude of a 10 student kindergarten class size reduction…. 1st, 2nd, and 3rd year teachers have students with significantly lower prior math performance than teachers with six or more years of teaching experience.”

“How Should School Districts Shape Teacher Salary Schedules? Linking School Performance to Pay Structure in Traditional Compensation Schemes”
Grissom, James A.; Strunk, Katharine O. Educational Policy, August 2012, Vol. 26, No. 5, 663-695. doi: 10.1177/0895904811417583.

Abstract: “This study examines the relative distribution of salary schedule returns to experience for beginning and veteran teachers. We argue that districts are likely to benefit from structuring salary schedules with greater experience returns early in the teaching career. To test this hypothesis, we match salary data to school-level student performance data on math and reading achievement tests across states. We find that frontloaded compensation schemes — those that allocate greater salary returns to experience to novice teachers — are associated with better performance in multiple grades and throughout the achievement distribution.”
