How New York City's Flawed Data Fuels the Right's War on Teachers
We know that the flawed New York City teacher ratings data released last week is broadly unreliable: it carries a high margin of error, and it rates teachers not just by how well their students did on tests, but by how well their students did relative to an algorithm's sometimes unreasonable expectations of how they would perform. That much is widely known and indisputable. Yet the teacher rankings are still being used to publicly vilify low-rated teachers without even a cursory investigation of whether the data's many known flaws may have made a good teacher look bad.
Value-added models such as the one New York is using to predict what a teacher's results should be, based on the characteristics of the students and other variables, are somewhat worse than unproven at this point. The factors in schools are too complicated, the tests have too much measurement error, and multiple studies, including ones from RAND and the Educational Testing Service, have concluded that value-added models are "too imprecise" and "should not serve as the sole or principal basis for making consequential decisions about teachers." To its credit, New York isn't using the Teacher Data Reports currently in the news as the sole basis for its ultimate teacher evaluations. But despite the much more sophisticated evaluation process that has been painstakingly negotiated, and despite earlier assurances that "It is DOE's firm position and expectation that Teacher Data Reports will not and should not be disclosed or shared outside the school community" and that "In the event a [freedom of information] request for such documents is made, we will work with the [United Federation of Teachers] to craft the best legal arguments available to the effect that such documents fall within an exemption from disclosure," the city's Department of Education actually encouraged and facilitated the release of this unsophisticated, unreliable data.
Reports are pouring in of anomalies that show how many straightforward errors exist in the Teacher Data Reports. The principal at P.S. 321 writes that one year of data for two grades has errors on four to six of her teachers, including that "One teacher who taught in 08-09 but was on child care leave for years before that time has data for a previous year—impossible ... it must be data from someone who was in that same room the previous year" and "a teacher who has taught 4th grade for 5 years has no data for previous years." Diane Ravitch cites a report of a teacher who "received only two rankings, 88 percent in one year, and 38 percent in the next, yet his rating was averaged as 40 percent." (Never mind what it says about the test's reliability that a teacher could go from 88 percent to 38 percent year to year.)
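To make the arithmetic behind that last anomaly concrete, here is a minimal sketch. It assumes a plain, unweighted mean of the two yearly percentiles; how the city actually combined yearly rankings is not stated here, so the variable names and the averaging rule are illustrative only.

```python
from statistics import mean

# Figures quoted in the report cited by Diane Ravitch.
yearly_percentiles = [88, 38]
reported_average = 40

# Assumption: a simple unweighted mean of the two years.
# The city's real combination rule is not described in the article.
simple_average = mean(yearly_percentiles)

# How far the reported figure sits from the simple mean.
gap = simple_average - reported_average

print(f"simple mean of the two years: {simple_average}")
print(f"reported multi-year figure:   {reported_average}")
print(f"difference in percentile points: {gap}")
```

Under that assumption, the two rankings average to 63, which is 23 percentile points above the figure the teacher was reportedly given.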
In other cases, several different flaws in the data or anomalous circumstances came together to victimize specific teachers. Take Pascale Mauclair, who received a poor ranking. Reporters from Rupert Murdoch's New York Post went to her father's home, knocked on her door for so long that she had to call the police twice, and questioned her neighbors about her. The Post printed her salary and her picture. But as Edwize, a UFT blog, explains, there is not just one reason to believe Mauclair's low ranking was wrong; there's a host of them.
First off, P.S. 11, where she teaches, is a very good school as measured by ranking systems and by its popularity within its community and support from students and parents, and Mauclair is respected within the school. Yet somehow, of the school's seven sixth-grade teachers, three were ranked at the zero percentile by the city's Teacher Data Reports. How do you have a very strong school if nearly half of the students in one grade are being taught by the worst teachers in the city? Might that not be a red flag that either the school is being rated better than it is, or those teachers are being rated worse? And since the school is rated well by at least two different reports and by its students and parents, and is so popular that it operates over capacity, the logical explanation for the discrepancy is that there's something wrong with the teacher rankings, which are based on one data source already known to be flawed.
In fact, P.S. 11 is one of the few schools in the city where sixth grade is taught in an elementary school, rather than a middle school. That means that, unlike middle school teachers, who teach one subject to as many as 160 students through the day, these teachers teach no more than 32 students in multiple subjects. It's not just an apples-to-oranges comparison; it's also an incredibly small sample size, which makes the data weaker.
So the story of how outrageous the New York Post's crusade against Pascale Mauclair as one of the city's worst teachers is starts with school-level data showing right off the bat that, over and above the general problems with the teacher data, there's something wrong in this specific case. Add to that Mauclair's own position as an ESL teacher, who teaches:
...small, self-contained classes of recently arrived immigrants who do not speak English. Her students arrive at different times of the school year, depending upon that date of their family’s migration; consequently, it is not unusual for her students to take the 6th grade exams when they have only been in her class for a matter of a few months.
In fact, between P.S. 11 being an elementary, not middle, school and Mauclair being an ESL teacher, she teaches so few students that she didn't even meet the minimum number of students to have a score reported for English Language Arts, and was only given a score for math, which was based on just 11 students.
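Why so few students matters can be shown with a rough sketch. It uses the textbook standard error of a class average, which shrinks with the square root of the number of students. This is not the city's value-added model, and the spread of student scores (`ASSUMED_STUDENT_SD`) is a hypothetical placeholder, so only the ratios between class sizes mean anything.

```python
import math

# Hypothetical spread of individual student scores, in arbitrary units.
# Only the ratios between class sizes below are meaningful.
ASSUMED_STUDENT_SD = 1.0

def standard_error(student_sd: float, n_students: int) -> float:
    """Standard error of a class average: sd / sqrt(n)."""
    return student_sd / math.sqrt(n_students)

# 11: students behind Mauclair's math score.
# 32: upper bound for a P.S. 11 sixth-grade teacher.
# 160: upper bound for a middle school teacher.
for n in (11, 32, 160):
    se = standard_error(ASSUMED_STUDENT_SD, n)
    print(f"{n:>3} students -> standard error {se:.3f}")

# How much noisier an 11-student average is than a 160-student one.
ratio = standard_error(ASSUMED_STUDENT_SD, 11) / standard_error(ASSUMED_STUDENT_SD, 160)
print(f"noise ratio, 11 vs 160 students: {ratio:.2f}")
```

Under these simplifying assumptions, an average built on 11 students is nearly four times as noisy as one built on 160, before any of the other problems with the data are counted.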
All of this might be too complex for the New York Post even if it weren't on an ideological crusade against teachers and their unions, and using Mauclair as a scapegoat in that crusade. But when you take together the known problems with value-added modeling in general, the fact that the city's actual teacher evaluation process is much more complex than one test and that this test was not supposed to be made public, the giant margin of error on the results, and the host of known errors like a teacher being attributed test results for years spent on parental leave, it's clear no reasonable person in possession of the facts could put any weight on these tests as a measure of teacher effectiveness. Add to that the way the case of P.S. 11 and Pascale Mauclair shows how multiple factors in the way the data is calculated can completely distort the picture, and you see how this is not only a tragedy for public education, it's also many personal tragedies visited upon undeserving teachers. Doubtless there are some bad, even terrible, teachers in the New York City schools. But the Teacher Data Reports are not the way to uncover who they are, and intense harassment and public vilification from a Rupert Murdoch newspaper are not the way to treat even bad teachers.