January 24, 2008

Surviving the Reading Assessment Paradox

By Cleveland, Laurie

THERE IS WISDOM IN THE WORDS, "WHAT GETS MEASURED GETS DONE."

-SPELLINGS (2006)

Over the last decade, assessments have dramatically increased, resulting in a paradoxical approach to literacy in which students spend significantly more time engaged in reading assessments while spending less time actually reading. Teachers are under great pressure to accommodate the regimen of assessment and documentation, fitting in more data collection and sacrificing free reading. Research suggests that this approach is flawed because the strongest indicator of reading ability is time spent reading (Allington, 2006; Krashen, 2004; McQuillan, 1998). Moreover, these new demands are squeezing library programs, notwithstanding studies on the positive impact of school library programs (Krashen, 2004; Lance, 2004; McQuillan, 1998).

School librarians face a serious challenge. They need to teach colleagues about the important role that the library can play in literacy, and they need to adapt their practice to discover new ways to play a constructive role in literacy education. Facing this challenge requires understanding the controversy dominating the reading field, as well as the political and social forces behind the controversy, especially when considering that current policy is a result of more than 20 years of political rhetoric about America's schools.


Current reading policy is the product of rhetoric about education. Reports and policies issued during the 1980s and 1990s have had a cumulative effect, contributing to mistrust in our educational system and an atmosphere in which increased testing is rationalized by notions of accountability and the virtues of science. The confused state of public perception about our schools is revealed in polling over the last decade. When asked to rate the quality of education, a majority of adults express dissatisfaction; when asked about their children's school or their local schools, they are overwhelmingly positive (Allington, 2006; Jones, 2005). The reading assessment paradox is a product of this confusion. As such, a good starting point for understanding our current dilemma is a government report issued in 1983, A Nation at Risk (National Commission on Excellence in Education, 1983).


The educational foundations of our society are presently being eroded by a rising tide of mediocrity that threatens our very future as a Nation and a people.

-National Commission on Excellence in Education (1983)

During the Reagan administration, a stunning critique was issued by the National Commission on Excellence in Education (1983). A Nation at Risk was a general indictment of education in America. With little documentation and few specific mandates, this was less a prescription for bettering our schools than it was a rhetorical assault to galvanize the public. Although the charges do not hold up to scrutiny, the report's effect at the time was enormous.

Richard Allington (2006) is among many who have responded to charges of decline. In his book What Really Matters for Struggling Readers, he argues that the state of literacy in America has been seriously misrepresented. "At all grade levels," he asserts, "children today outperform children from earlier eras of American schooling" (pp. 7-8). Allington is concerned about the persistent population of low-performing students and about meeting the increased literacy demands of the information age. But he also argues that hysteria over misinterpreted test data will not serve us well, and he questions the motives of those who incite alarm.

Another challenge to the charges of A Nation at Risk comes from the academic team of David Berliner, an educational researcher, and B. J. Biddle, a psychology professor. In their book The Manufactured Crisis (1995), they counter what they deem an unprecedented attack on public schools by the American government.

For example, the claim was made that SAT scores had undergone a "virtually unbroken decline from 1963 to 1980" (National Commission on Excellence in Education, 1983, para. 11). This appears to be a cause for concern, but Berliner and Biddle (1995) point to changes in the population of students who are taking the test. In the 1950s, the SAT was taken mostly by top students; in some parts of the country, virtually only private school students took the test. Society's great push to broaden opportunities brought annual increases in the proportion of students who took the test. When scores are disaggregated by high school class rank, they are strikingly consistent across the decades (Berliner & Biddle, 1995). Reading specialist Lucy Calkins has criticized those who have suggested declines in literacy based on SAT scores, whether they have done so "in ignorance or for reasons that must be considered suspect" (Calkins, Montgomery, & Santman, 1998, p. 37).

Charges of a rising tide of mediocrity likewise cannot be supported by the government's own educational statistics. For over 40 years, the National Assessment of Educational Progress (2005) has sampled performance of America's students at ages 9, 13, and 17. The graph for reading performance from 1971 to 2004 reveals little change. Further evidence is found in the fact that the Dale-Chall and Spache readability formulas were renormed in the 1980s to better reflect grade equivalency. In effect, kids were doing too well, and the formulas were overestimating the difficulty of texts (Allington, 2006).


Nevertheless, despite a lack of documentation, the assault on the state of literacy and the public education system continued, and the theme of decline gradually came to be viewed as established truth. The first Bush administration continued Reagan's critique of schools with an education summit that brought together the nation's governors. This group's initiative, America 2000, called for a national system of assessments based on broad standards (Vinovskis, 1998). Clinton's Goals 2000: Educate America Act continued and broadened this work, and although mandates were not yet a part of policy, these summits turned the debate to the notion of accountability and envisioned measurement and assessment as the answer to the shortcomings of the current educational system (U.S. Department of Education, 1999).


The rhetoric of crisis proved effective in motivating the public to accept new mandates. Political scientist Deborah Stone (2001) has written about the ways that symbols get manipulated, including the way that images of decline and control can be interwoven: "Stories that move us from the realm of fate to the realm of control are always hopeful, and through their hope they invoke our support" (p. 143). Consistent with this theory, many in the reading community reacted hopefully to the criticisms of our literacy standards, even while recognizing their flaws. Reading specialists found themselves in a difficult position: Whereas they knew that the rhetoric around literacy decline was specious, their response was complicated by a recognition that the literacy needs of Americans were far from being met. As expressed by International Reading Association head Cathy Roller (2005), "While reliable data suggest that reading achievement has been stable since 1970 . . . no one, including most educators and reading professionals, is satisfied with those levels of achievement" (p. 259).

A second factor in countering the rhetoric was that many reading professionals saw the increased attention to reading as a potential gain to be exploited for the good of literacy education. In this case, however, the attention led to increased control from outside the reading profession; namely, the debate among researchers and practitioners was joined by interest groups from business, media, think tanks, and the general public. What began as generic charges became an all-out assault on reading curriculum, with explicit proposals for a course of action to reverse the declared crisis.


Following the education summits of 1991 and 1994, the stage was set for the government to enter the instructional arena. Although the high-stakes standardized testing of No Child Left Behind (NCLB) has been more in the public eye, the effect of another government effort, the National Reading Panel (NRP), needs to be considered for the impact that it has had on classroom assessment practices.

The NRP was organized in 1997 under the National Institute of Child Health and Human Development (NICHD) and charged to conduct an assessment of reading instruction techniques to inform reading instruction practice in the schools. The panel's work has had significant impact. First, it extended the rhetoric of crisis and failure. Congresswoman Ann Northup, for example, who joined the congressional push to form the panel, referred to the staggering statistics of reading failure that necessitated the study (NICHD, 2003a). Second, the NRP, under director G. Reid Lyon, sought to exclude all qualitative studies, spurring notions that only experimental scientific data had validity and could thus guarantee effective instruction (NICHD, 2003b). In criticism of this approach, neurologist and educator Steven Strauss (2001) deplored such a narrow focus, particularly the potential for distortion in the popular media, where the research terms valid and reliable are thought to connote objectivity and excellence. The rejection of all descriptive correlational research created the serious limitation, according to Strauss, of taking reading out of an authentic environment. "Real reading materials," he pointed out, "bear no obvious similarity to experimental stimuli" (p. 26).

These concerns are not merely academic. The recommendations of the NRP report have been used to promote an assessment-driven reading environment. A direct line can be drawn from Lyon's work on the NRP to the Reading First initiative of the NCLB Act, touted as scientifically based reading instruction (U.S. Department of Education, 2002). The language of the NRP and NCLB invokes science as the key to success, an objective way to ensure equity and accountability. The NCLB parents' guide boasts of moving "the testing of educational practices toward the medical model used by scientists to assess the effectiveness of medications, therapies and the like" (U.S. Department of Education, 2003, p. 18). Strauss (2001), himself a scientist, strongly disagrees, stating that "it will simply not do ... to redefine reading as one of these elementary tasks" that will not require us to think broadly about "real reading" (p. 30).

The emphasis on quantifiable measurement has led to confusion about the role of silent reading in literacy education. Whereas the NRP report acknowledges the considerable body of evidence that students who read more, read better, the official stance was that silent reading had not been shown by quantitative measures to be of benefit (NICHD, 2003b). In effect, government reports and policies are powerfully altering reading instruction in America's schools, with more emphasis on quantitative methods and measurements, through mandated high-stakes tests and through a diffuse impact on classroom practice. Across the board, the heat is on to measure and document success.



Repeated formal and informal assessment will help establish a more valid and complete profile of the child's literacy knowledge and skills.

-U.S. Department of Education (2003)

The government policies and reports constitute a powerful use of rhetoric, joining the notion that schools were in critical need with appeals to faith in the power and objectivity of science. In addition, terms such as high standards, equity, and accountability have been used repeatedly (e.g., U.S. Department of Education, 2003). However, such terms are conceptual ideals, unassailable but vaguely defined goals. Few have challenged the concept of accountability, but the means to achieve it have been more controversial.


The data collected through testing offer a convenient means for oversight and, as promised, dissemination to the public. The resulting public exposure has had a powerful effect. Calkins (Calkins et al., 1998) was dismayed when the "local newspaper published a 'dishonor roll' of schools" (p. 2). She expressed concern about how this practice would affect teachers. Since that time, it has become routine for the media to provide extensive coverage of test results. Adverse effects on teachers' practices have been reported, and a striking example in a Texas school was documented in a recent study (Booher-Jennings, 2005). Intense focus on the data was shown to lead to what Booher-Jennings (2005) calls educational triage, in which teachers concentrate instructional efforts on those students thought to be most likely to have their scores rise to the recognized range, while giving up on those whose scores are "just too low" (p. 243). Accountability can clearly interfere with, rather than ensure, equity.


This era of high-stakes assessments is fraught with other problems besides the pressure on scores. The test instruments themselves have raised questions, attributed to several factors. First, by mandate, assessments must be connected to curriculum. This seems like a good thing, but it has been shown that the curriculum gradually becomes narrowed to fit the demands of the test. For example, reading standards typically address a range of reading behaviors, many of which cannot be assessed through laboratory-like quantitative methods. Basic decoding skills are easiest to test, and for this reason, one argument is that an emphasis on testing leads to "narrowing of the curriculum, overemphasis on basic skills, [and] excessive time spent in test preparation" (Buly & Valencia, 2002, p. 220).

A second problem with high-stakes assessments lies in the difficulty of balancing reliability and validity, and this is particularly acute for reading assessment. Research studies of state-mandated reading tests in Connecticut (Spear-Swerling, 2004) and Washington (Johnson, Jenkins, & Jewell, 2005) have pointed to problems related to the relationship of reading and writing and to the lack of a consensus on reading development. The linking of standards and assessment has led to an increased use of tests that combine multiple-choice and constructed-response formats because of their greater connection to demonstrated skill in constructing meaning. Yet for testing validity purposes, confounding effects of written performance on measures of reading have been documented and "may introduce changes to the construct of reading comprehension" (Johnson et al., 2005, p. 268).


Critics of high-stakes tests have called for assessments that not only reflect school and district standards but measure in ways that are familiar to students (Afflerbach, 2004; American Psychological Association, 2001; International Reading Association, 1999). Many states have chosen to develop such tests, but they cost considerably more and demand more school time to administer. Less expensive, off-the-shelf tests have been promoted by some (Hoxby, 2001), but although these tests demand less time to administer, they are a crude measure of district standards. Educators must choose how best to balance budgetary and educational restrictions against the requirements of NCLB.


Teachers are caught in an unsettling situation because high-stakes tests often conflict with the educational ideals that inform their reading programs. In explaining her decision to create a guide for teachers who are dealing with standardized reading tests, Calkins and coauthors (1998) note that

standardized reading tests are assuming an increasingly powerful role in classrooms and schools. My colleagues and I could no longer turn away from the fact that we need to do a better job of helping teachers and principals live under their shadow. (p. 1)

Responding to test scores has in fact led to dramatic changes in the way that teachers approach reading. In a study of classroom changes in response to students' poor performance on Washington State's high-stakes assessment, Buly and Valencia (2002) found that teachers were ending sustained silent reading to free more time for direct instruction. Overall, the effects on the reading curriculum are extensive.


The climate created by NCLB means not just annual tests but a general increase in quantitative assessment of student progress. Those who teach reading have an especially heavy burden in the wake of the NRP report and "unprecedented political insistence on the use of research-based, scientifically proven assessments and instructional techniques" (Invernizzi, Landrum, Howell, & Warley, 2005, p. 610). A proliferation of tools is currently being marketed to fill the demand for periodic summative assessments of students.

In their work with students, however, teachers need diagnostic tests to shape instruction to suit their students. Qualitative assessments can help provide a more accurate portrait of a child's reading behavior than what a standardized test can yield. In effect, then, teachers must choose either to give up the diagnostic tests needed for an individualized approach or to devote sufficient time not only to diagnostic tests that inform practice but also to summative tests that provide mandated data, as well as to teaching students good test-taking practices.

An alternative available to primary grade teachers is to adopt the instructional design of the government-sponsored Reading First Initiative. In a commentary in Childhood Education, Gordinier and Foster (2004) concluded that in this approach, assessment drives instruction, eliminating any time to provide assessments other than those that align with the approved scientific practices; furthermore, "'Big Brother's' ideas in Washington are dictating tests to be used for measuring progress as well as the instructional materials" (p. 94).

Similar time constraints influence classroom teachers beyond the primary grades as well. Pressured to show adequate yearly progress, teachers analyze student data, then shape instruction to achieve gains. Schools often feel pressed to create or purchase additional testing materials to document progress. Because in some cases the very companies that market the tests produce instructional materials, teachers can rapidly find themselves adapting to "standardized, highly structured curricula or following pacing schedules that dictate what page of the text each class should be on for each day of the school year" (Falk, 2000, p. 94). Thus, the entire school community can be pulled into a literacy program that is essentially assessment driven and makes demands on classroom time that conflict with time for free reading and even library visits.


All this literacy promotion may be missing the boat. Data show that America's literacy rate is high, yet many argue that there is a literacy problem. Krashen (2004) concludes that "nearly everyone in the United States can read and write. They just don't read and write well enough" (p. x). Allington (2006) identifies a challenge to educators in the fact that they "create more students who can read than who do read" (p. 10). Jim Trelease (as cited in Calkins et al., 1998) expresses the same concern that our children are just not choosing to read. He cites "a study of 25,000 literate fifth graders that shows that they spend 33 percent of their free time watching television and 1 percent of their free time reading" (p. 36). As has been shown in the research of Lance (2004), one way to encourage higher reading scores and a greater love of reading is through strong library programs.

THE VIEW FROM THE LIBRARY

Schools with higher rated school libraries have 10 to 18 percent better test scores than schools with lower rated libraries.

-Lance (2004)

Successful library programs vary because they reflect the characteristics of the schools they serve. No formula can be offered. There are, though, some broad considerations that aid in maintaining a thriving program and are thus offered here as helpful actions to combat these difficult times.


Encouraging a joy in reading is paramount and something that school librarians are ideally suited to address. Making the library a comfortable place where reading is nonjudgmental and where students get help finding what they love is important. Given the assessment-driven climate, more of this is needed.


Booktalks are appreciated by everyone and can be tailored to fit a limited time frame. If teachers cannot find the time to bring the class to the library, offer to drop by the classroom for a booktalk. One good addition to the booktalk session is a brief form for students to record books that they hear about that interest them. Keep the forms in binders in the library, and remind students that the books that they want are awaiting them.

If schedules permit it, offer to help support student reading groups, especially for the most advanced readers and for the struggling readers. Teachers find that it is difficult to serve all the needs of their students, and they will be relieved to have help. Teachers may be overly preoccupied with book levels. Although this is a disturbing trend, particularly with materials above the primary level, it is a good idea to be familiar with some of the lists because it builds credibility with the teachers. If there is a literacy coach in the school, seek him or her out and find ways to be a part of the literacy work at the school.

Curriculum frameworks need to be translated into authentic learning activities, and librarians can aid in this objective by building and promoting collections of books, media, and realia that invite teachers' use. When teachers are hard to reach, seek out specialists in the building-for example, music, art, drama, language, or physical education teachers. By joining forces, you may be able to formulate curriculum ideas that are irresistible to teachers.


Bear in mind that parents are affected by the emphasis on assessment. Their communication with teachers may be inordinately focused on measured reading level. Parent outreach can provide an antidote needed by parents who are anxious and confused about what their children need. A simple brochure with recommended read-alouds and excerpts from reading specialists such as Trelease (2006) and Allington (2006) will be welcomed. Invite parents to use the school library. Institute a special school event for parents to promote the positive role of the family in encouraging a love of reading.


Celebrate special days, special places, and special people. Consider who and what in the community can be tapped for resources, collaboration, or even a visit, especially when a curricular connection can be made. Document everything in media and share those events throughout the community.


Finally, participate in the ongoing effort to define and document the goals of the school. Whether global (such as the mission statement) or specific (such as curricular design work), such efforts are an opportunity to promote the highest ideals of literacy and information skills while giving visible support to the school community.

The controversy over reading assessment is likely to continue for some time, but by drawing on these varied areas of outreach, strong library programs can help to ease the effects of the reading assessment paradox.

Feature articles in TL are blind-refereed by members of the advisory board. This article was submitted January 2007 and was accepted July 2007.


Afflerbach, P. (2004). National Reading Conference policy brief: High stakes testing and reading assessment. Retrieved April 14, 2006, from www.nrconline.org/publications/HighStakesTestingandReadingAssessmentdoc

Allington, R. (2006). What really matters for struggling readers: Designing research-based programs (2nd ed.). New York: Longman.

American Psychological Association. (2001). Appropriate use of high-stakes testing in our nation's schools. APA Online. Retrieved April 13, 2006, from www.apa.org/pubinfo/testing.html

Berliner, D. C., & Biddle, B. J. (1995). The manufactured crisis: Myths, fraud and the attack on America's public schools. Reading, MA: Addison-Wesley.

Booher-Jennings, J. (2005). Below the bubble: "Educational triage" and the Texas accountability system. American Educational Research Journal, 42(2), 231-268.

Buly, M. R., & Valencia, S. W. (2002). Below the bar: Profiles of students who fail state reading assessments. Educational Evaluation and Policy Analysis, 24(3), 219-239.

Calkins, L., Montgomery, K., & Santman, D. (1998). A teacher's guide to standardized reading tests: Knowledge is power. Portsmouth, NH: Heinemann.

Falk, B. (2000). The heart of the matter: Using standard and assessment to learn. Portsmouth, NH: Heinemann.

Gordinier, C. L., & Foster, K. (2004). What stick is driving the Reading First hoop? Childhood Education, 81(2), 94.

Hoxby, C. M. (2001). Conversion of a standardized test skeptic. Stanford, CA: Hoover Institution. Retrieved April 4, 2006, from www.hoover.org/pubaffairs/we/current/hoxby_0601.html

International Reading Association. (1999). High-stakes assessments in reading: A position statement of the International Reading Association. Newark, DE: Author. Retrieved April 4, 2006, from www.reading.org/downloads/positions/ps1035_high_stakes.pdf

Invernizzi, M. A., Landrum, T. J., Howell, J. L., & Warley, H. P. (2005). Toward the peaceful coexistence of test developers, policymakers, and teachers in an era of accountability. Reading Teacher, 58(7), 610-618.

Johnson, E. S., Jenkins, J. R., & Jewell, M. (2005). Analyzing components of reading on performance assessments: An expanded simple view. Reading Psychology, 26(3), 267-283.

Jones, J. M. (2005, September 8). Slim majority dissatisfied with education in the U.S. The Gallup Poll. Retrieved April 14, 2006, from http://poll.gallup.com/content/default.aspx?ci=18421&pg=1

Krashen, S. D. (2004). The power of reading: Insights from the research. Portsmouth, NH: Heinemann.

Lance, K. C. (2004, Winter). Libraries and student achievement: The importance of school libraries for improving student test scores. Threshold, pp. 8-9.

McQuillan, J. (1998). The literacy crisis: False claims and real solutions. Portsmouth, NH: Heinemann.

National Assessment of Educational Progress. (2005). National trends in reading by average scale scores. Washington, DC: National Center for Education Statistics. Retrieved October 18, 2006, from http://nces.ed.gov/nationsreportcard/ltt/results2004/nat-reading-scalescore.asp

National Commission on Excellence in Education. (1983). A nation at risk. Retrieved April 4, 2006, from www.ed.gov/pubs/NatAtRisk/risk.html

National Institute of Child Health and Human Development (NICHD). (2003a). Press releases and congressional testimony. Retrieved October 12, 2006, from www.nationalreadingpanel.org/Press/press_rel_northup.htm

National Institute of Child Health and Human Development (NICHD). (2003b). Report of the National Reading Panel. Retrieved November 2, 2007, from www.nichd.nih.gov/publications/nrp/upload/report_pdf

Roller, C. (2005). The International Reading Association responds to a highly charged policy environment. In P. Shannon & J. Edmondson (Eds.), Reading education policy: A collection of articles from the International Reading Association (Chap. 14, p. 259). Newark, DE: International Reading Association.

Spear-Swerling, L. (2004). Fourth-graders' performance on a state-mandated assessment involving different measures of reading comprehension. Reading Psychology, 25(2), 121-148.

Spellings, M. (2006). Ask the White House, April 20, 2006. Retrieved April 25, 2006, from www.whitehouse.gov/ask/20060109.html

Stone, D. (2001). Policy paradox: The art of political decision making. New York: W. W. Norton.

Strauss, S. (2001, June/July). An open letter to Reid Lyon. Education Researcher, pp. 26-33.

Trelease, J. (2006). Chapter 6: Libraries-Home school public. In The read-aloud handbook. Retrieved April 20, 2006, from www.trelease-on-reading.com/rah_chpt6_p1.html

U.S. Department of Education. (1999). Progress of education in the United States of America-1990 through 1994. Retrieved April 18, 2006, from www.ed.gov/pubs/Prog95/index.html

U.S. Department of Education. (2002). Reading First. Retrieved April 4, 2006, from www.ed.gov/programs/readingfirst/index.html

U.S. Department of Education. (2003). No Child Left Behind: A parent's guide. Retrieved April 13, 2006, from www.ed.gov/parents/academic/involve/nclbguide/parentsguide.pdf

Vinovskis, M. A. (1998, November). Overseeing the nation's report card: The creation and evolution of the National Assessment Governing Board (NAGB). Washington, DC: National Assessment Governing Board. Retrieved April 18, 2006, from www.nagb.org/pubs/95222.pdf

Laurie Cleveland, a school librarian, is currently a graduate student at Boston College's School of Education. She may be reached at [email protected]

Copyright Ken Haycock & Associates Dec 2007

(c) 2007 Teacher Librarian. Provided by ProQuest Information and Learning. All rights Reserved.