A Teacher’s Troubling Account of Giving a 106-Question Standardized Test to 11 Year Olds

Is our obsession with testing doing real damage to our kids? Take a peek into this classroom and decide for yourself.

Photo Credit: Constantine Pankin

We’re just a month into school and already the testing madness has begun. Many Pittsburgh Public School students have just taken their first round of standardized tests, and it’s time to ask some serious questions about their purpose, the ever-increasing number of tests, and the impact on our children.

Let’s start with this troubling account from a middle school language arts teacher, who gave the “GRADE” reading test to her students on Friday. This is a diagnostic assessment designed by the education corporate giant, Pearson, and the district is using grant money to pay for it. (More on both of these points after you have read this teacher’s story.)


Say you’re a teacher with a diverse and exciting group of students who have found learning together an exciting prospect. You have had ups and downs, but each day has ended with more students feeling positive about their ability to learn, and each day investing more in the process. Then, a couple of weeks into the school year, you have to make the first stop in this process. The first Pearson-created standardized test has landed on your desk. Teaching/learning has to stop. You hide your face from your students as you grit your teeth. You tell them, as always, not to worry. You tell them no one expects them to get all of the answers right, but you do expect them to do their very best. You know they will, as they will want to show everyone how smart they are, just as they’ve shown you in so many ways. But inside you cringe...

You stand in front of the class and read a sentence to the children. You are allowed to repeat the sentence only once. Then the students select one of four pictures that they think most reflects what the sentence says. The children look determined; they are ready; you begin.

The first question seems harmless enough. The students look OK. Then you get to the second question. Of 106 questions. The sentence you read says something like, “Luis draws a blank when he is asked to solve a math problem on the board.” The students have four drawings to choose from. In the second drawing, a student is drawing the kind of blank one would see on a paper on which students are directed to “fill in the blanks.” It is a blank. He is drawing it.

You start to feel stomach pangs as you look around the room at eleven-year-olds, many of whom come from non-English-speaking families, or families for whom this type of idiomatic expression is not common, and you realize that you have never come across this expression in any of the literature you have taught students over the years. You know it is unlikely that many of these children will recognize the puzzled expression on the face in one of these pictures as the “right” answer. For many of these kids, “blanks” have to do with guns.

But you go on, and hope this is an exception, this bad question. Then there is Question #4: “Roger told her that he would have changed the oil himself in a couple of weeks.” What? Changed what oil, you think, as you look around at your class of children for whom having a car in the family is far from a given. Children for whom having a parent who changes oil in a car is even less likely. Then you look at the drawings, and you see why the children are looking confused – but still trying hard.

All four drawings include a car. Three of them include a man doing something under a car. Two of them include a girl sitting in a wheelchair watching the man (this, perhaps, makes this test culturally responsible?). One of them shows a car driving down the road with smoke coming from under the hood. I – a long-time car owner and driver and oil-change customer – had to look at this set of pictures several times, over a couple of days, to figure out which answer they were looking for. (Really, if you waited a couple of weeks to change the oil, wouldn’t it be possible that the engine would smoke?) But the children had only a short period of time to figure out – or guess – what the answer was.

By now my students were getting a bit restless. The confidence with which they had gone into this testing situation was beginning to dissipate. Just a bit. There were still 102 questions left to answer.

We went on. Question number six referred to “a pair of drumsticks” and included as choices a boy eating two chicken-type drumsticks along with others of the musical kind. This is almost funny, but the students are supposed to choose the “right” answer. Number seven brought my stomach pangs back. The expression in this question was “brushed up on art history.” “Brushed up.” The first choice showed a man with a paint brush and an easel – the only one of the pictures clearly about art. The “correct” choice was a man looking through a stack of books, one of which had a tiny, crude and hard-to-see drawing of a female which one could interpret as the Mona Lisa, if one were familiar with her. I began to wonder, what was Pearson, this test maker, doing to our children?

Question number eight had two possible answers, each of which was equally justifiable. Oh, but our students never would have the chance to justify their answers on this type of test. Take that, you kids who are daring to think.

Question twelve put me over the top. But I continued my outward calm, even as I watched the kids squirm, and as some began to lose their focus and their positive demeanor. The mumbling had begun. The sentence I read to the class said something like “she realized she could store her belongings in the bureau.” “Bureau.” There were four pictures to choose from. One was a building that looked like a public “bureau” of the government to me, but I doubted my students would think of that. One was of a tractor. Scratch that. But I looked at my students whose families speak Spanish at home. And I looked at the burro in picture “C.”

Then I looked at the picture of what my family calls a chest-of-drawers. And I thought about how we have never used that word, “bureau,” for a piece of furniture. And I have never heard that word in the homes of my students’ families. And I thought, how crude, how cruel, how ignorant, how disrespectful of these children. What a set-up. Who would do that to kids?

Question 16 was . . . well, you decide. The sentence is something like “Carl approached a friendly wave as he walked onto the beach.” You guessed it. Only worse. One drawing has boys looking at what looks to me like a very friendly wave of water. Another has a boy walking toward two boys, one of whom gives a friendly hand wave. Another has two boys walking toward one boy who gives a friendly wave. Who is Carl? No one has told the student. And the last one has a boy riding a friendly-looking wave with a smile on his face. Wha . . .?

After the 17th “listening comprehension” question, the students went on to the rest of the 106 questions on their own. They still wanted to do well. Some, however, had already given up. Among them were the telltale signs of anger and frustration (broken pencil; slumped back in the chair; head down on the table; making eye contact across the room with another student and laughing; calling out “this is stupid!” – and other indications of labeling themselves as “stupid”). The work to build that community of self-confident learners had been undercut.

But the test went on. And the next section had students doing something all teachers know does not make sense. They were trying to guess among five choices the meaning of a word all by itself, out of context. This section was called “Vocabulary.” The words included such certain-to-be-missed-by-most-students words as “whimsical,” “supple,” “guile,” “resplendent,” “broach,” and on and on.

By the end of the Vocabulary section these children had been through 57 of the 106 questions. They were more than halfway done. But the double period was almost over. They were about to go home, having entered the classroom feeling strong and ready to learn, about to leave feeling, in their words, “stupid.” They had lost two full periods of real teaching/learning. What had they gained? Really, what? It was Friday. I would not be with them again for more than two days. I could not ease them back into knowing that they were smart and making progress.

Like them, I left for the weekend feeling defeated. What happens when our beautiful children face this kind of situation over, and over, and over again? The phrase, “first do no harm,” consumed me. I was leaving school for the weekend on the wrong side of that admonition.

Isn’t it time to stop this ever-increasing testing cabal, which puts our children, and their enthusiastic and devoted teachers, into these untenable situations? Can we remain compliant when our children and our teachers are judged by performance on such abominations parading (and being paid for) as “assessments”? Is this how we want our children, and our teachers, to spend the precious hours they have together in our schools? When does this situation become untenable enough for us to stand up, together, on their behalf?


This teacher asks critical questions that we should all be trying to answer. This isn’t a rhetorical exercise. Really – when do we make it stop? To her list, I add the following for consideration:

What is the purpose of these tests? Some assessments such as the GRADE test are meant to be diagnostic tools, to help teachers figure out where students are in their learning. But if giving poorly designed tests actually interferes with students’ learning process, and takes away from actual instructional time in the classroom, are they helping or hurting overall teaching? If tests are poorly designed, are they really effective as diagnostic tools? Even if such tests are well designed, are they providing the kind of information that our teachers need to shape learning?

Are they culturally biased? A 2002 review of the research literature concluded that the GRADE assessment is developmentally appropriate, reliable, and valid. That’s reassuring, though this teacher’s personal experience would seem to challenge these findings and I would love to learn more from our educational research colleagues out there. However, that same study found that there was “no evidence” that the GRADE test was “sensitive and appropriate for differing cultures and needs.” [Collaborative Center for Literacy Development] That was in 2002, eleven years ago, and seems to be true still today. How long does Pearson need to correct the obvious cultural biases in its tests?

Are they useful for teaching and learning? Pittsburgh parent Pam Harbin started looking into the GRADE assessment last year when it was introduced into the district and discovered that students do not have the opportunity to review and learn the material they got wrong. “For too long we have taken for granted that the tests our kids are taking are for their benefit,” Pam says. “I’m really having a hard time understanding why the District is requiring so many assessments where kids don’t have a chance to learn from their mistakes. … It doesn’t make any sense to test kids in this way.” Whether it’s formal District policy or not, it appears that many schools are working under the belief that teachers are not permitted to discuss anything on the GRADE test, in particular, with students before or after giving it. On other tests, such as the PSSAs and Keystones, teachers are explicitly forbidden to see the actual test questions or provide feedback to the students.

How have the frequency and quantity of testing increased? The GRADE test is given three different times during the year. That alone might not sound so bad, but consider that the District is now giving up to 17 different standardized tests to students each year, depending on grade level, and many of them are given more than once a year. For instance, my 7th grader will take 21 standardized tests this year. [PPS 2013-14 Assessment Calendar]

Does testing reduce learning opportunities? All of that test-taking robs students of real learning time. This teacher reports that her students lost four class periods taking the GRADE alone. Even worse, her students are about to take another round of tests, the CDTs, which are given on computers. Because the school’s classes are too large to fit in the computer lab, and there are so many classes that need to schedule testing, the lab won’t be available for anything other than testing for quite some time. She explains, “My students [in another class] need that lab to do the research that is a part of our curriculum and can’t be locked out during this period.” She also worries about “giving this test to three of my classes, losing yet more instructional time for yet another non-curriculum-based test.”

How can testing harm students? For some students, taking a test such as GRADE is a minor annoyance. For others, it can leave them feeling “stupid,” frustrated, and ready to give up on learning. This seems particularly cruel, as this teacher points out, when this is “due in large part to the errors and problems with the test. Students do not need more of that in their lives.” Yet one reason districts might hold on to tests such as GRADE is that they can help to demonstrate “student growth” to state officials, sometimes more accurately than the PSSA results. But this is a misuse of student testing – ostensibly designed to help individual students – to evaluate schools and districts. This is yet another way in which the culture of high-stakes testing is hurting our kids.

How can testing harm teachers? We know that some tests such as the PSSAs and Keystones have very high stakes attached to them. [See “The VAM Sham”] But even these lower-stakes tests can harm teachers, as this teacher points out: “Giving the test makes teachers feel like they are abusing their children. We do not need more of this in our lives.”

Do we have to? The district is using grant money to pay for the GRADE test (which was a requirement of the grant) along with professional development for teachers and other worthy things. But what if we refused to accept grants with such strings attached? Imagine if we could use those dollars now going to line the pockets of the international corporate giant, Pearson, to buy drumsticks for the Westinghouse Bulldogs marching band or books for the Pittsburgh Manchester K-8 library? Pennsylvania is spending hundreds of millions of taxpayer dollars to develop more high-stakes tests for students, and requiring local districts to spend hundreds of millions on top of that to get their students ready to take them. [Tribune Review, 6-2-13] (And guess who makes all the test prep materials?) What if we stopped this upward spiral of testing madness and focused on what actually helps students learn?

Jessie B. Ramey is the ACLS New Faculty Fellow in Women's Studies and History at the University of Pittsburgh, and the author of Child Care in Black and White: Working Parents and the History of Orphanages (University of Illinois Press, 2012). She is the founder of the public education advocacy Web site Yinzercation.

