Testing...Testing...One, Two, Three

Quick: Name a standardized testing company. Unless you are a teacher, principal, or state education official, chances are the one name you’ll come up with is the Educational Testing Service (ETS), purveyor of the SAT, the GRE, and an alphabet soup of other post-secondary admissions tests.

But if you’re a public-school student in the United States -- spending an increasing number of hours hunched over a "bubble sheet" with #2 pencil in hand -- your test will likely come from one of a handful of companies in the nearly invisible K-12 standardized testing market. Over 40 states now mandate standardized testing of public-school students, usually in multiple grades, and the Bush administration’s education plan calls for testing all public-school students nationwide in grades three to eight. Increasingly, test scores are being used for high-stakes decisions -- such as whether students get promoted to the next grade or graduate from high school, or whether teachers and principals get bonuses or keep their jobs.

The current "test-heavy" model of education reform represents the growth of corporate influence on the schools. In many states, business leaders have formed coalitions for the express purpose of reshaping public schools. As John H. Stevens, executive director of the Texas Business and Education Coalition, said at a July 2000 education reform conference, "[E]ducators do not dominate the dialogue on education in Texas. For more than a decade, the business community and a group of key legislative leaders ... have been the major players in shaping state education policy."

These business-led coalitions uniformly advocate more -- and higher-stakes -- standardized tests. In many states, the business coalitions are pressing for increased public education spending; in their view, more testing and stricter accountability systems are merely tradeoffs that educators must make in return for new monies. Of course, the business agenda for education reform does not end there. In many instances, business leaders also hope to weaken teachers’ unions and spur privatization.

But more generally, business leaders and management gurus have been very vocal about the need to apply business-based management techniques to the schools. And in the business model, the need for a lot of testing is obvious. After all, what business could function without "quick and constant measurement of output," as a 1999 Forbes article on “quality control” in the schools put it?

As a result of these initiatives, America’s schoolchildren are now being subjected to more and more standardized testing. As the stakes grow higher, we need to turn a spotlight on the K-12 testing industry -- and how industry trends are affecting the creation and marketing of the tests themselves.

That's Edutainment

In recent years, the creation and scoring of K-12 tests has become big business. Between 1955 and 1970, sales of standardized tests grew gradually, from $5 million to $25 million (both in 1988 dollars). Since then, sales have increased dramatically, reaching $130 million in 1990 and jumping to $234 million in 2000. Even these figures understate the industry’s size. A 1993 Boston College study estimates that the K-12 standardized testing industry may be four to six times larger -- perhaps as much as $1.5 billion a year -- when scoring and score reporting are included, along with customized tests produced under contract with individual states.

One of the largest corporate players in the testing market is a firm that, for much of its history, had only a tangential relationship to education. Capitalizing on new scanning technology developed at the University of Iowa, National Computer Systems (NCS) was founded as a data processing firm in 1962. Over the next 20 years, NCS became the nation’s largest scorer of standardized tests, although it maintained a substantial data-processing business outside of education. NCS began selling some guidance and counseling tests in the mid-1980s, then rapidly became a leader in the overall school testing market, in part by winning large contracts in several states that were adopting customized testing programs. NCS’s revenues have grown dramatically, from $35 million in 1980 to nearly $630 million in 1999.

Now, in line with industry trends, NCS has been bought out by a multinational. In 2000, Texas renewed NCS’s contract to operate its statewide testing program, a contract worth $233 million over five years. A few months later, Pearson plc, a British media conglomerate, purchased NCS for $2.5 billion. Pearson, like NCS, did not start out in the education business. It owns The Financial Times, The Economist, Penguin Books, Simon & Schuster, and some television production companies; until 2000, it also owned about half of Lazard Bros. U.K. Bank. But with the purchase of NCS, Pearson Education -- with about $3 billion in sales in 2000 -- now accounts for over half of Pearson’s overall revenues.

Along with NCS, three other firms dominate the K-12 test market: Harcourt Educational Measurement, CTB/McGraw-Hill, and Riverside Publishing. (Non-profit ETS is also trying to carve out a share of this market with a new for-profit subsidiary, K-12 Works.) Unlike NCS, these firms at least have long histories in educational publishing. Like NCS, however, they have been subject to the vicissitudes of multinational merger-and-acquisition mania. Harcourt General, the latest incarnation of the eminent publishing house Harcourt, Brace, Jovanovich since its 1991 acquisition by General Cinemas, was purchased by British-Dutch scientific publisher Reed Elsevier last July. Around the same time, Riverside’s parent company, Houghton Mifflin, was acquired by Vivendi Universal SA, a French media company whose website hails the purchase as an important addition to its “edutainment” business.

Risky Business

As test publishers strive to accommodate the needs of their new owners, they will no doubt be paying closer attention to the bottom line. One impact of the changes in industry structure -- both the recent mega-mergers and an earlier round of mergers and acquisitions -- is already clear: a shift in how, and how well, tests are authored. Most of the classic K-12 tests, such as the Stanford Achievement Test and the Iowa Tests of Basic Skills, were created under university auspices and written by scholars of psychometrics (the study of educational and psychological assessment) and education. Today, test authors are more typically anonymous publishing company employees. According to Professor Walter Haney, who co-authored the 1993 Boston College study, this change is probably responsible for “many more errors, in the tests themselves and in their scoring.”

Indeed, test and scoring errors have become practically routine. A New York Times analysis in the spring of 2001 found that, in 16 states, testing contractors had made significant errors in scoring or results analysis. In 1999, scoring errors by CTB/McGraw-Hill affected schools across the country: In New York City, 9,000 students were mistakenly ordered to go to summer school, and principals and district superintendents across the city -- along with Schools Chancellor Rudy Crew -- lost their jobs; in Nevada, elementary schools were mistakenly labeled “inadequate.” In the spring of 2000, thanks to scoring errors by NCS, a number of Minnesota high-school seniors had their diplomas withheld. And last year in Massachusetts, where Harcourt has run the statewide testing program since 2000, students themselves found errors in several questions on the high-stakes Massachusetts Comprehensive Assessment System (MCAS) tests.

The industry’s changing structure, along with new marketing practices, has also compromised test publishers’ commitment to complying with ethical standards in the field. The Code of Fair Testing Practices, created by major professional organizations in education and psychology, spells out the responsibilities of both those who publish standardized tests and those who use them. For instance, the code prohibits test users from misusing test results, and also requires test publishers to warn their customers against such practices. While the code does not identify specific examples, educators and psychometricians generally agree that using a single test score to make a high-stakes determination -- such as high-school graduation -- represents misuse. But as a business-based model of accountability comes to dominate education reform, a number of states are beginning to do just that.

In this new climate, test publishers are leaving their ethical responsibilities behind. In the past, when achievement tests were purchased by thousands of individual school districts, test publishers could reasonably claim that they could not police how each district was using test results. Today, testing companies are signing large, multiyear contracts with states to create customized statewide tests, and sometimes the companies are well aware from the start that a state is planning to misuse test results. But instead of refusing to participate in the process, most publishers are willing to put fair-practice standards aside. (An interesting exception occurred in 1987, when ETS -- concerned that Texas planned to fire teachers who did not pass a new recertification test -- declined to bid for the contract to develop it. Of course, other firms stepped in and bid, and Texas does fire teachers on the basis of the resulting test.)

For test publishers, the fair-testing protocol can be a sensitive public relations problem. Last May, for example, Eugene Paslov, president of Harcourt Educational Measurement, told a group of Massachusetts high-school students, “When these tests are used exclusively for graduation, I think that’s wrong.” A day later he backtracked, claiming that the high-stakes MCAS provides “multiple measures” of student achievement since students are allowed to retake it -- a statement many education experts found laughable. Likewise, Michael H. Kean, vice president for public and government relations at CTB, told Education Week in March 2000 that “high-stakes decisions should not be made on a single measure.” In interviews, both Kean and a spokesperson for NCS claimed that, in several instances, their companies have decided not to compete for particular testing contracts because of concerns about standards. However, neither was willing to give specifics. Time will tell whether upper-level managers from Pearson or Vivendi will allow psychometricians in their employ to stick to their scruples -- especially if it means missing out on lucrative contracts.

Since the testing industry’s actions can have a profound impact on millions of students and educators, it seems reasonable to expect government regulation. But for now, federal regulation is nonexistent. As the 1993 Boston College study points out, “While our society requires product warning labels on…personal deodorants and food coloring, no warning labels are federally required on test instruments that may determine whether someone gains employment or is classified as mentally retarded.”

At the state level, efforts to protect the rights of those who purchase and take standardized tests have been largely unsuccessful. Back in the 1970s, a number of states considered “truth-in-testing” legislation, which would have required companies to release test questions and provide information about test development and scoring -- but only New York passed a law. Even then, it applied only to post-secondary admissions tests, not K-12 achievement tests, and test publishers have slowed its implementation with lengthy lawsuits. Now that many states have made standardized testing the centerpiece of their education reform programs, it’s probably even less likely that any new state regulations will be passed. Ironically, the current move toward high-stakes testing has forced the industry itself (and its client states and school districts) into the courts. In several states, students who have been held back or denied high-school diplomas are suing, and CTB’s Kean emphasized that the company puts great stress on “legal defensibility” in its conversations with state education departments and other clients.

Although the “Big Four” still dominate the K-12 testing market, ballooning revenues have attracted plenty of small start-ups. Some have been created by former schoolteachers -- dissatisfied with their salaries, perhaps? -- while others are the work of MBA marketing types with no apparent ties to education. Consider Advantage Learning Systems (ALS), created in 1986 to market computerized reading comprehension tests matched to popular children’s books. (These tests aren’t marketed as achievement tests, but in today’s test-crazy climate, who knows?) So far, the plan seems to be working beautifully. Forbes gushes that ALS’s “handsome [profit] margin -- and [its] 61% annual growth rate -- makes it a favorite on Wall Street, with a stunning $1.3 billion market capitalization.”

But despite their sales successes, some of these start-ups -- like their larger competitors -- have had serious problems with quality control. Measurement Inc., a Florida-based test scoring company, guaranteed a 99% accuracy rate for scoring the essays that are now common on state achievement tests. Suspicious of that claim, Boston College’s Haney analyzed the scoring protocol and says the accuracy rate is closer to 70%, a fact that the company ultimately acknowledged.

Naturally, the current test mania has spawned a huge demand for test-prep materials. Although estimates of its size are hard to come by, this market appears to be growing too -- probably as rapidly as the stakes attached to the tests. Bookstore shelves that for years have been weighed down with volumes claiming to raise scores on the SAT, ACT, GRE, GMAT, or LSAT are now also laden with test-prep books for the younger set, down to and including five-year-olds. Other test-prep companies market their workbooks -- with titles ranging from No Stress to Buckle Down -- directly to schools. Test-prep consultants also sell their services to schools, where they plan test-week pep rallies and teach strategies such as never marking the same letter on three multiple-choice questions in a row.

As with test publishers, the scramble to boost revenues sometimes leads test-prep companies to violate ethical standards. For example, Buckle Down, Inc. sells customized test-prep workbooks -- for every test at every grade -- in states that have developed their own statewide standardized tests. That sounds great, except that the validity, such as it is, of any standardized test is compromised when students use preparation materials that are virtually the same as the test itself. For example, California Department of Education guidelines prohibit the use of test-prep materials written for a specific test. Buckle Down’s California marketing blurb acknowledges this policy, assures customers that it’s in compliance -- and then, in the same paragraph, touts how closely its materials are pegged to the state’s tests!

Who's Picking Up The Tab?

Where is the money for all of these new tests and test-related products coming from? Parents and schools are paying for the test-prep materials, sometimes at the cost of other books and supplies. One teacher in Texas complained that practically all of her school’s materials budget is now going to test-prep books.

As for the tests themselves, state governments are increasingly footing the bill. State spending on testing has grown dramatically in recent years: According to two studies cited in a 2001 report by the Education Commission of the States, the total for 2001 was around $400 million. Consider Texas, one of the first states to mandate statewide testing (thanks to Ross Perot) in 1980. In 1990, Texas shifted to a more extensive testing program called the Texas Assessment of Academic Skills (TAAS); now, all students in grades three through eight, plus grade ten, take some TAAS tests. Texas is also implementing standardized end-of-course tests for some high-school classes, including biology, algebra, and U.S. history. (Imagine how that test will limit a U.S. history teacher’s curriculum choices!) From third grade on, all second language speakers must take a reading proficiency test every year until they pass. There are alternative versions of the TAAS for special-education students. And on and on. Texas state spending on testing has risen from $19.5 million in fiscal year 1995 to $68.6 million in fiscal year 2001. (Surprisingly, Texas Education Agency officials were unable to provide figures prior to 1995.)

In Massachusetts, a state whose testing program has been widely hailed as a national model, spending on tests has also risen rapidly. The state launched a limited statewide testing program in 1990, and adopted the MCAS test in 1997. Passing the MCAS will become a high-school graduation requirement for the class of 2003, and MCAS scores are already being used to designate underperforming schools. The test was originally intended only for fourth, eighth, and tenth graders. But now, as in Texas, every student in grades three through eight, plus grade ten, takes some MCAS tests. Creating, distributing, and scoring all of these tests has a high price tag. In the early 1990s, the state was spending between $500,000 and $750,000 a year on testing. This rose to $8.3 million in fiscal year 1998, $14.8 million in fiscal year 2000, and $23.2 million (projected) in fiscal year 2002.

Even now, states are spending only a small proportion of their overall education budgets on standardized tests -- in the range of 0.5%. Just a few years ago, however, it was closer to 0.1% or 0.2%. Greater spending on assessment requires tradeoffs elsewhere in the education budget. In Texas, for example, between fiscal years 1995 and 2001, spending for adult education was cut by half (from $87.3 million to $40.4 million) and professional development dropped by nearly two-thirds (from $28 million to $9.8 million). In that same period, spending on tests more than tripled.

A number of other factors are pushing up testing costs, too. In response to longtime complaints about traditional standardized tests, which were made up entirely of multiple-choice questions, some states are moving toward “performance-based” tests. These tests include open-response questions that require students to write out an answer, whether to a math problem or an essay prompt. Performance-based tests are not necessarily more expensive to develop, but they’re far more expensive to score.

The use of test scores for high-stakes decisions is also costly, because it requires greater security measures and duplicate scoring. Even the cost of defending against test-taker lawsuits can be attributed to high stakes -- since no one would be suing over test or scoring errors if diplomas and jobs weren’t at risk.

Finally, the direct costs of developing, printing, and scoring a test are not its only costs. A number of researchers believe that, to weigh the costs and benefits of standardized tests fairly, total costs ought to include the time teachers and principals spend administering the tests, the time teachers spend conducting test-prep activities, and even the time students spend prepping for and taking the tests. Estimates of these indirect costs of standardized tests vary widely -- anywhere from 2• times to 60 times their direct costs.

The Truth About Testing

In any case, the real costs of more standardized testing, especially the high-stakes variety, may be those that are harder to quantify: effects on classroom instruction, student and teacher morale, drop-out rates, and so on. A number of studies, many of them commissioned by the Civil Rights Project at Harvard University, have documented negative impacts in all of these areas. Several studies have focused on Texas, where high-stakes testing has been in place the longest. There, drop-out rates among African-American and Latino students have risen since high-stakes testing began. There is even some evidence that students who pass the TAAS test and graduate actually demonstrate poorer writing skills when they arrive at college than did their peers a few years earlier, before high-stakes testing.

This last finding reflects the way high-stakes tests can compromise classroom instruction. Under pressure to produce rapid increases in scores, some teachers are ditching their normal curricula in favor of week after week of isolated test-prep exercises. Students certainly don’t benefit when their teachers set aside quality literature -- or even carefully prepared textbooks -- in favor of short, out-of-context reading passages followed by strings of multiple-choice questions. But the pressure to collapse instruction into mere test preparation can be intense, and it is no doubt greatest in schools with the lowest test scores, typically those serving low-income communities and communities of color. Testing advocates claim that better, performance-based tests will put a stop to the “dumbing-down” phenomenon. But these tests are expensive; in fact, a few states that tried performance-based tests have already abandoned them because of the cost. As of 1999, only fourteen states were using any performance-based testing beyond a single writing sample.

These are the effects of high-stakes standardized testing that are invisible in the hard-data, number-crunching world of the business roundtables -- including the national Business Roundtable -- that are now setting the education-reform agenda.

Of course, this is not the first time that a business model has been applied to America’s public schools. In the early twentieth century, a great faith in the logic of the assembly line led many educators and politicians to reshape the schools in its image. Schools had to become much more efficient, they reasoned, in order to assimilate millions of new immigrants, provide more years of schooling to each child, and prepare students for the world of work. To this factory model we owe today’s large schools with classes segregated by subject and students segregated by “ability.” Not coincidentally, that era also witnessed the introduction of standardized testing on a large scale.

Now it’s clear that this model has failed. Research shows that small schools are more effective and that interdisciplinary approaches to subject matter and heterogeneous grouping of students can enhance learning. However, today’s schools are stuck with a 100-year-old model of which hulking, oversized buildings are only the most visible sign. A century from now, will educators look back just as ruefully on today’s business-based reforms?

Although the momentum behind the Bush education plan appeared to slow after September 11, the proposal was back on the House floor as of this writing (mid-December). If it passes, it will require a significant expansion of standardized testing. Even in states that already have mandatory testing, most don’t test every child in third through eighth grade, as the Bush plan mandates. This means the firms that supply and score most of these tests will have more work to do, and more money to make; it also means that their activities will have an even greater impact on the educational careers and life chances of millions of people. At a minimum, the industry should be under much closer scrutiny than it is today. Better yet, the United States needs to revisit the business model that places high-stakes standardized testing at the center of education reform.

Amy Gluckman is a visiting lecturer at Salem State College and a former member of the D&S collective. To contact FairTest: The National Center for Fair & Open Testing, call (617) 864-4810.
