Is 'Teacher Accountability' Ready for Prime-Time?
But although the teacher results were correlated, the correlation was weak. True, more teachers who had high value-added scores on a basic skills test also had high value-added scores on a test of reasoning, but not many more. If you fired teachers who did poorly at teaching basic skills, you would get rid of many teachers who did poorly at developing reasoning skills, but you would also get rid of many teachers who did well at developing reasoning skills. The first group (those who did poorly) would be larger than the second group (those who did well), but not much larger.
The second highly publicized study, done by a group of Harvard researchers, concluded that teachers whose students had high value-added test scores were also those whose students had better long-term adult outcomes, such as better earnings. This was a potentially important finding because it suggested that these tests had not become ends in themselves, but rather that success on these tests made students more likely to be successful as adults. If so, putting pressure on teachers to increase their students’ test scores would also put pressure on them to improve their students’ adult success. And that would be a good thing.
The flaw here is that the researchers were unable to compare the long-term results of high value-added teachers with the results of teachers who excelled in other ways that might, conceivably, have even larger impacts on long-term outcomes. For example, the researchers could not say whether teachers who are more effective at developing their students’ cooperative behavior or reasoning skills (and we know from the Gates study that only sometimes are these the same teachers who are more effective at teaching basic skills) might have students with even better adult outcomes, such as earnings. If this were the case (and we have no reason to believe it one way or the other), then getting teachers to shift their attention from teaching reasoning or cooperative behavior to standardized test preparation might be lowering their students’ future earnings, not raising them.
In short, the two recent studies most heavily promoted by supporters of the Chicago district’s plan to evaluate teachers in part by their students’ test scores do not confirm that the district’s position is wise. It may be, but it also may do great harm.
The Chicago district, and other promoters of teacher evaluation based in large part on student test scores, have become aware of these problems. And so they now emphasize that they support evaluating teachers by “multiple measures”: not only by their students’ test scores but also by students’ performance of assigned tasks under the supervision of experts, by principals’ observation of teachers, and sometimes (for high school students, for example) by student reports of teacher effectiveness.
This is a fine, balanced approach in theory, but it is very difficult to implement in practice. For example, the Gates Foundation study also showed a correlation between a teacher’s value-added test scores and ratings from observations by instructional experts. It conducted this experiment by providing the experts with videotapes of teachers conducting instruction. The experts watching (and evaluating) the videotapes did not know the value-added scores of the teachers on the tapes, so the two measures (value-added scores and expert observation) were independent. But according to the Chicago district proposal, the observations will be conducted by principals, who will know the value-added scores of the teachers they are observing. How principals will be influenced by this knowledge cannot be known. Will they tend to give high ratings to teachers with high value-added scores in order not to call attention to possible flaws in their own observational skills? Will they tend to offset value-added conclusions in order to save favored teachers who have low value-added scores? Or will they tend to sink unfavored teachers who have high value-added scores? Of one thing we can be certain: armed with knowledge of teacher value-added scores, principals will find it much harder to observe and evaluate teachers objectively. In times past, when student test scores did not have high stakes for schools or teachers, principals with knowledge of test results could use this knowledge constructively to guide their observations; principals would visit classrooms where test scores were poor to see if they could determine what was being done poorly, or visit classrooms where test scores were good to see if they could learn what was being done right. With high stakes now attached to the test, such constructive evaluation is less likely.