When former District of Columbia Public Schools (DCPS) Chancellor Michelle Rhee instituted IMPACT, a new teacher evaluation system, school reformers and teachers’ unions alike rushed to comment on the controversial initiative. IMPACT is described as a value-added teacher evaluation system that ties an educator’s employment and compensation to their students’ test scores.

Rhee described the previous teacher evaluation system as weak, one in which “teachers were reviewed inconsistently and infrequently,” and pointed out that it would be irresponsible to ignore student growth in the new teacher evaluation metric “when we have the ability to measure it.” A slew of studies seemed to agree with her defense. A 2011 Aspen Institute report described relaxed teaching standards as the Achilles heel of public education and urged other districts to adopt IMPACT-like initiatives. A Stanford/UVA study later observed that IMPACT was successful in strengthening the teaching pool by encouraging the resignation of low-ranked educators and retention of high-ranked ones.

But the scarcely mentioned, uglier impact of IMPACT is disproportionately felt by teachers in DC’s poorest wards—at schools toward which minority teachers tend to gravitate. In seeking to improve the quality of teachers, IMPACT manages to simultaneously perpetuate stubborn workforce inequalities and exacerbate an already alarming shortage of teachers of color.

We know that poor schools are also more likely to be filled with novice teachers, or with teachers who are assigned to teach classes outside of their subject area. Rates of teacher attrition for these schools are elevated, and teachers often leave for higher income schools once they have gained some experience.

All this being said, there might not be too much to argue about if rigorous and carefully analyzed data could demonstrate that teachers in low-income schools were irrefutably inferior when compared to their counterparts in wealthier areas. But advocates for IMPACT often fail to acknowledge that schools built in areas of concentrated poverty deeply challenge teachers in ways that high- or middle-income schools do not.

A Misguided Effort; An Uneven Result

Teachers at low-income schools learn quickly that schools do not exist in a vacuum. These schools’ student bodies are full of kids dealing with the toxic stress of poverty, leaving many of them homeless, hungry, or sick due to limited access to quality healthcare. The students are more likely to have an incarcerated parent, to be deprived of fresh or healthy food, to have spotty or no internet access in their homes, or to live in housing where it is nearly impossible to find a quiet place to study. Low-income parents are more likely to engage in shift work or have lower levels of education; younger children tend to have smaller vocabularies before they even reach the classroom, and older students may have to work or watch younger siblings in order to support parents.

These constraints explain why teachers at disproportionately minority schools (who are themselves more likely to be teachers of color) struggle to deliver the individualized attention and interventions that IMPACT requests. When rates of parent volunteerism and involvement are low, many students are left without their primary adult advocate, and teachers are left with limited outside context to explain their students’ struggles, successes, and behaviors.

As expected, this plays out in the data.

The city of Washington, DC is divided into wards: Ward 3 is by far the wealthiest, with only 23 percent of its students considered low-income; Ward 8 is in one of the poorest parts of the city, with 88 percent of its students considered low-income. During the 2013-2014 school year, half (yes, literally 50 percent) of the teachers in well-to-do Ward 3 earned a score of “highly effective,” the highest possible IMPACT rating and the designation that qualifies teachers for salary increases. In comparison, only 19 percent of teachers in Ward 8 earned that distinction.

Consider this: while favorable analysis of this evaluation system congratulates IMPACT for increasing teacher caliber by either terminating or forcing out bad teachers, the studies are careful to point out only that teacher IMPACT scores rose, not that student performance improved. In short, IMPACT is proving to be a tool that punishes the very teachers who are most likely to help ameliorate the shortage of minority teachers.

While DCPS valiantly attempts to fix this by creating strong programs to improve the minority teacher pipeline, a recent study revealed that the real problem is retention of these teachers, who frequently cite the poor working conditions and low autonomy in lower income schools as the primary reasons for their departure.

Opportunities for Bias

Most critical discussions about the source of IMPACT’s ratings disparities have understandably and accurately implicated the disadvantages in the work environments of high poverty schools. But rarely does the fairness of the second component—the Teaching and Learning Framework (TLF)—of IMPACT become a topic of debate. The TLF applies to all teachers, whereas the value-added measures can only apply to math and language arts teachers in certain grades due to test score availability.

Intended as a measure of instructional prowess, the TLF asks teachers to demonstrate a number of pedagogical techniques within a thirty-minute lesson. (Throughout the year, newer teachers will have five classroom observations, while more experienced teachers can choose to reduce their number.) Observations are conducted by both school administrators and “master educators,” or “expert practitioners,” who travel from school to school and who may or may not have knowledge of the teacher’s local communities or expertise in the subject area being taught.

Developed from sources that almost all support punitive evaluative systems, some TLF criteria could invite bias into the observation. Although we know that education experts and high-quality instructors come in all races, none of the individual authors named as references or consultants in the creation of TLF standards are racial minorities, an absurd statistic given that 87 percent of DCPS students are Black or Hispanic.

Several aspects of the evaluation are highly dependent on modes of student, rather than teacher, communication. One of the more egregious examples of this is a provision that not only implores teachers to use “academic language” when explaining content, but also requires students to demonstrate verbally and through writing that “they are internalizing academic vocabulary.” This is a huge ask of teachers, especially given the deep language barriers that exist in many minority schools. While it might be pedagogically useful to employ a broad vocabulary in classrooms to encourage student growth, tying a teacher’s performance and pay to a student’s ability to absorb and adopt unfamiliar or uncommon language within a thirty-minute observation session is neither a fair nor an effective measure.

Teachers whose students interrupt or engage in “inappropriate or off task student behavior” are also downgraded in rating. Given what we know about racial discipline disparities and the skewed perceptions of “disruptive” or “dangerous” behavior across class and color lines, this criterion can perpetuate some very troubling patterns for both students and teachers.

IMPACT Is Choosing Convenience over Complexity

As far as teacher evaluation systems are concerned, IMPACT is, surprisingly, not the worst of them. IMPACT was probably instituted with the best of intentions, and despite the system’s shortcomings, DCPS has improved its content and application since it was introduced in 2009.

But ultimately, IMPACT does a better job at measuring the advantages of a teacher’s school environment than it does at measuring the advantages that a teacher can give his or her students. How much of the success of a DC teacher has to do with impressive ability and preparation, and how much of it can be attributed to having well-nourished, well-supported students in well-resourced schools? How much does a “highly effective” rating signal model classroom management skills and lesson plan design, and how much is based on teaching techniques and student responses that are most palatable to certain student profiles?

It’s sometimes difficult to measure all of the factors beyond income that make concentrated poverty so toxic: personal stress levels, parent involvement, access to well-resourced spaces, limited mobility, peer effects, and childhood exposure to vocabulary, arts, and literature. DCPS expects, indeed requires, teachers in low-income schools to combat all of these obstacles in their classrooms, but doesn’t account for these factors in its evaluation methods.

If DCPS wants to continue to evaluate its teachers based on technique and test scores, it should first identify and acknowledge the individual needs of each school environment. Second, DCPS should be on the front lines of the fight for policy reforms that will help ameliorate some of the root problems of disparity, like access to high-quality childcare, affordable and integrated housing availability, criminal justice reform, and fair labor laws. These destructive out-of-classroom forces are the biggest roadblocks to student success, not the teachers struggling to combat them in the District’s toughest schools.