October 1, 2008
Was That a Result of My Teaching?
By Misco, Thomas
Abstract: In this article, the author explores the challenges and promises of value-added assessment. As NCLB approaches the end of statistically possible achievement gains in schools, value-added assessment is being employed to longitudinally measure student learning to determine a school's effect. Yet, value-added assessment is limited in its explanatory powers because it focuses only on certain types of knowledge and needs to be used in conjunction with other estimates. As such, the author provides a variety of perspectives to help educational stakeholders explore the assessment not just as a new test but rather as a promising and potentially damaging lever of change in school cultures.
Keywords: No Child Left Behind Act, testing, value-added assessment
The enactment of the No Child Left Behind Act (NCLB) in 2002 was not an isolated attempt to reform education, and the current political climate suggesting its early demise will not undermine the interest in and proliferation of state standards and high-stakes accountability mechanisms. Interest in standards as a tool for school improvement has dramatically increased since the 1980s (Porter 1994), and as one manifestation becomes impractical or goals become mathematically improbable, the testing, standards, and accountability movement will evolve to a different form using different tools and exacting different goals. For many, NCLB has been an underwhelming instrument for accountability. Yet, the desire to measure and reach excellence and proficiency remains. As a result, there is increased interest in other forms of assessment that avoid some of the problems NCLB faces.
Value-added assessment, the next generation of accountability, asks not what the school or district achievement data are, but whether a particular school, classroom, and teacher did what they were supposed to do for the achievement growth of individual students. As more educators find themselves in this new testing and accountability paradigm, they need to examine this form of assessment through an open-minded yet critical lens. For example, 2,500 teachers in New York City are currently part of a controversial experiment that could result in teacher tenure, performance evaluations, and bonuses tied to student improvement on standardized tests (Medina 2008). In addition, value-added assessments are already in place in Tennessee, and pilot projects are unfolding in Ohio and other states. Therefore, my intention in this article is to provoke some initial dialogue and reflective consideration of the assessment among preservice and in-service teachers, as well as teacher educators, and to act as a point of departure for subsequent inquiry and nuanced consideration of the promises and challenges of value-added assessment.
Responding to the Weaknesses of NCLB
As NCLB approaches the end of statistically possible achievement gains in schools, policymakers will need to institute a new mechanism to drive further policy decisions based on testing data if accountability-related policies are to continue to strongly influence education. In many ways, value-added assessment responds to this distress tocsin by changing the unit of analysis from schools and districts to individual students. Almost all schools will eventually fail to meet NCLB targets (Hershberg, Simon, and Lea-Kruger 2004), and the data derived from most of its measures do not help refine curriculum or practice. Instead, these data are often uninformative or unreliable but attractive in their simplicity, even when injurious to students and teachers (Doran 2003). One of the early criticisms of NCLB was its focus on cohort-to-cohort analyses rather than the longitudinal data collection that would allow defensible conclusions about progress and effectiveness (Brennan 2004). In addition, many felt it resulted in a myopic focus on improving achievement among students on the cusp of proficiency and ignored gifted and talented subgroups. Moreover, comparison of state-designated proficiency levels with national external measures derived from National Assessment of Educational Progress data (McCombs et al. 2004) demonstrated significant gaps between measures of proficiency, thereby further undermining the value of the data for making any meaningful statements about student learning. Value-added assessment is, therefore, gaining credibility in a number of districts and states as a way to respond to an array of factors leading to NCLB's malaise.
The basic premise of value-added assessment is to longitudinally measure student learning to determine a school's effect. To do this, prior achievement is normalized and charted in a way that projects achievement as a trajectory for future learning. In a value-added assessment system, any actual achievement over the projected growth rate is attributed to the educational setting. Similarly, deviation below the projection, which we might call negative value-added, is also attributed to the school, the classroom, and the teacher. This form of assessment focuses on inputs and outcomes, whereby each student acts as his or her own control and past scores serve as a proxy for external variables (Doran 2003). This way, gains are not compared with those of peers but rather with the student's past performance (Schaeffer 2004), thereby addressing one of NCLB's weaknesses. Moreover, because the focus is on individual students, not just those close to proficiency, value-added assessment provides greater potential for teachers to identify weaknesses in curriculum, assist students who are slipping from projected achievement levels, and enact appropriate modifications to their classroom.
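The arithmetic behind this premise can be illustrated with a short sketch. It is a toy under stated assumptions and does not reproduce any state's actual model: the function names, the made-up scores, and especially the linear projection (last score plus the student's average yearly gain) are hypothetical choices made only for illustration. Operational systems rely on considerably more elaborate statistical estimates.

```python
from statistics import mean


def projected_score(prior_scores):
    """Project next year's score from a student's own past scores.

    ASSUMPTION: a simple linear trajectory (last score plus the average
    yearly gain). This is illustrative only, not any real system's method.
    """
    if len(prior_scores) < 2:
        raise ValueError("need at least two prior scores to estimate a trajectory")
    gains = [later - earlier
             for earlier, later in zip(prior_scores, prior_scores[1:])]
    return prior_scores[-1] + mean(gains)


def value_added(prior_scores, actual_score):
    """Actual score minus projected score; a negative result is 'negative value-added'."""
    return actual_score - projected_score(prior_scores)


def classroom_value_added(students):
    """Average the individual deviations for one classroom.

    `students` is a list of (prior_scores, actual_score) pairs.
    """
    return mean(value_added(prior, actual) for prior, actual in students)


# Hypothetical, made-up scores for three students in one classroom.
classroom = [
    ([40, 45, 50], 58),  # scored above the projected trajectory
    ([70, 72, 74], 75),  # scored below the projected trajectory
    ([55, 60, 65], 70),  # scored exactly on the projected trajectory
]

for prior, actual in classroom:
    print(prior, actual, value_added(prior, actual))
print("classroom average:", classroom_value_added(classroom))
```

Note that the comparison is made against each student's own trajectory rather than against peers, and that the whole classroom figure is only as trustworthy as the projection and the test scores that feed it.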
The idea of value added is not a new one; it first gained prominence in agriculture (Bryk and Weisberg 1976) and became commonplace in British educational institutions decades ago (Strand 1997). Recently, better techniques for collecting and analyzing individual student data have made value-added assessments possible in the United States, thereby energizing their powerful and potentially transformative effects. Ultimately, each iteration of value-added assessment is unique to the state or context in which it is employed (Schagen and Hutchison 2003), and each manifestation has varying strengths and weaknesses.
Promises and Potentialities
One of the most promising features of value-added assessment is its longitudinal component. By measuring student achievement over time, teachers and administrators can focus on quality of education rather than reputation, resources, or other variables that might lead to high school-wide or district-wide achievement scores. Rather than make reactionary changes to macro curriculum or close down schools on the basis of test data, value-added assessment can lead to improved programs, teacher responsiveness to individual student learning needs (Doran 2003; Hershberg, Simon, and Lea-Kruger 2004; McMillan 1988), and curricular changes in individual classrooms based specifically on students' achievement level data.
Achievement data and trends derived from value-added assessments are more meaningful for educational change than school-wide data, and they beget a process more equitable than simply measuring raw scores (Hoyle and Robinson 2002). For example, it is unrealistic to immediately make up long-standing deficiencies among individual students in low-performing schools. Value-added assessment recognizes this reality and instead focuses on progress toward proficiency, whereby each year of learning serves as a key step in that progression (Mahoney 2004). The focus on outcomes and individual growth avoids micromanagement of schools and reflects the assumption that both students and schools are responsible for achievement (Porter 1994). After all, individual teachers are effective indicators of students' achievement, perhaps serving as a more powerful predictor of growth than family income (Hershberg, Simon, and Lea-Kruger 2004) because teachers provide the climate, time, activities, and conditions of learning (Lasley, Siedentop, and Yinger 2006). Data derived from value-added assessment can aid parents in making informed choices about schools because higher or lower scores would no longer be equated with better or worse schools. These abstract labels gloss over student populations that can often dictate high achievement results despite a classroom lacking any sense of quality (Strand 1997).
Challenges and Weaknesses
Despite the promising attributes of value-added assessment for responding to NCLB's failures and providing meaningful data on student learning, there is a range of caveats teachers, parents, and administrators need to consider. First, there is a temptation to view classroom-specific data and immediately draw inferences and conclusions from them, moving beyond intended changes in instructional practice and into the world of teacher evaluation. Using value-added assessment to make personnel decisions is not only highly controversial (Schaeffer 2004) but also far removed from the intention of value-added assessments (Hershberg, Simon, and Lea-Kruger 2004). Primarily, this is due to the extreme difficulty involved in probabilistically connecting student achievement gains and regressions to teachers and isolating their effect on students. For example, in value-added assessment systems, weak teachers in weak schools look better than similarly weak teachers in strong schools (Kupermintz 2003), and we have yet to see an empirical study that suggests teacher effectiveness can even be isolated (Doran and Fleischman 2005). Last, the effects of teaching carry over into subsequent years (Hershberg, Simon, and Lea-Kruger 2004), which might mean that a deviation from projected performance is not the result of a particular experience with one teacher, but rather due to exposure to extraordinarily high- or low-quality teaching in a previous year. Single-point appraisals are intrinsically fallible, and there is a real fear that value-added assessment, given its perceived simplicity and narrowly tailored objective, might become the sole measure of teacher quality.
Any assessment needs to be responsive to the needs and goals of individual communities and be part of a larger grouping of multiple measures (Law 1999), including qualitatively oriented assessments that account for gains in creativity, exploration, and curiosity (Johnson 2006), as well as gains in certain dispositions (Misco 2007). The entire structure of value-added assessments rests on the assumption that test scores actually indicate educational value (Kupermintz 2003). Value-added assessment is, therefore, limited in its explanatory powers because it is only an estimate of school effect on certain kinds of knowledge; it needs to be used in conjunction with other estimates (Hoyle and Robinson 2002) so that we do not solely privilege what is on the test, which is sometimes the only source of data for the value-added calculations.
If policy decisions are made primarily on the basis of this assessment, educators need to determine whether the measure that feeds into value-added trajectories is comprehensive enough to reflect what students should know and be able to do. Are the single data points resting on shaky measures, such as state achievement tests, which have content and instructional validity issues (McMillan 1988)? Because determining an added value among students differs significantly from other dependent variables (we are looking for not only achievement growth but also skill and disposition growth), content validity is a major concern. In addition, value-added assessment can be corrosive to the school culture by further driving the apotheosis of content knowledge over any other curricular aim or goal.
There is little doubt that value-added assessment can identify progress, but the means of realizing that progress (or lack thereof) is not as clear-cut as we think (Strand 1997). One of the selling points of value-added assessment is the use of the individual student as his or her own control. As a result, socioeconomic status, number of books in the home, geographic residence, and other variables are considered controlled because there is a record of past achievement already derived from these and other factors. Therefore, it is assumed that these outside variables remain somewhat constant in terms of their effect, thereby providing an opportunity to measure teacher influence. But in some situations, prior achievement is not a good predictor of future outcomes (Hoyle and Robinson 2002). Problems associated with the inductive fallacy intersect with stages of student development that are not linear. Rather, there is tremendous complexity involved in estimating the rate of gain, and when we assume that future performance is predicated on past achievement, we ignore a wide range of influences that continue to affect achievement independent of earlier achievement, including but not limited to socioeconomic standing (Wiliam 1992).
In short, the baseline data that serve as the departure point for the student's trajectory may ignore the nuances of variables external to the classroom and subsequent assessment measures because the data are tied only to past performance and can continue to ignore these variables for years. Last, tests change yearly, and it is easy to overlook the data point that gave rise to the trajectory. Value-added assessment assumes we can measure growth in learning between measures. Instead, perhaps we should focus on contribution (similar to impact) to student learning (Doran 2003).
The causal inferences that value-added assessment ultimately suggests are, in some ways, illegitimate when other variables are influencing learning (Bryk and Weisberg 1976). Using the terminology of impact and change often presumes causation, which is perhaps the most dangerous and shortsighted leap among quick readings and interpretations of value-added assessment. For example, family background, maturation, school mobility, socioeconomic status, and an array of other variables (including actual instruction) impact student achievement growth. If we simply suggest that the school effect is what is left over when subtracting earlier attainment, we are making an inference that is too heavily based on correlations and that does not explain why certain variables are correlated. In this way, interpretation of value-added data does not account for the school's role in the larger societal milieu, and it subsumes political forces and other contextual features, thereby attributing disproportional weight to schools (Schagen and Hutchison 2003). The assumption that curriculum in subsequent years is a harder version of what came earlier, when in reality there is yearly variability and content is often qualitatively different, further problematizes this tendency. This paradigmatic curricular reality undermines the rationale for vertically scaling achievement growth in a trajectory form (Doran and Fleischman 2005).
In addition to these challenges, the composition of the student body in a school can have a dramatic influence on individual outcomes, more so than prior achievement or background (Strand 1997). As a rising tide lifts all boats, so too does learning in an environment in which most peers are high achievers. This synergy of high achievement and growth in a cohort leads to increased individual scores (Wiliam 1992), but in a value-added assessment, its presence or absence might be attributed to the teacher.
Value-added assessment has a number of potentialities and problems (Lasley, Siedentop, and Yinger 2006, 19). Although most value-added assessment systems are unique, they all demand teachers', parents', and administrators' vigilant and critical examination. There is tremendous opportunity for using this form of assessment to improve conditions for student learning, but it is essential to do so with a balanced approach to what the assessment data suggest rather than through reactionary policy decisions made without equanimity. Educational stakeholders need to apply a healthy skepticism to the assessment results and consider how all variables are influencing student achievement and growth if we expect sound conclusions (Holloway 2000), especially conclusions on which personnel decisions rely. Educators need to focus on the value part of this assessment (McMillan 1988) and thoroughly investigate the assessment as not just a new test or a new way of operating, but rather as a promising but potentially damaging change in school cultures. In addition to the hazards value-added assessment poses to teacher retention and tenure, perhaps its most prominent threat is a continuation of NCLB's focus on content knowledge above all other curricular components, thereby further undermining the central purpose and function of schooling in our democratic society.
Brennan, R. L. 2004. Revolutions and evolutions in current educational testing, 1-26. Occasional Research Papers. Des Moines: Iowa Academy of Education.
Bryk, A. S., and H. I. Weisberg. 1976. Value-added analysis: A dynamic approach to the estimation of treatment effects. Journal of Educational Statistics 1 (2): 127-55.
Doran, H. C. 2003. Adding value to accountability. Educational Leadership 61 (3): 55-59.
Doran, H. C., and S. Fleischman. 2005. Challenges of value-added assessment. Educational Leadership 63 (3): 85-87.
Hershberg, T., V. A. Simon, and B. Lea-Kruger. 2004. The revelations of value-added. The School Administrator 61 (11): 10-14.
Holloway, J. H. 2000. A value-added view of pupil performance. Educational Leadership 57 (5): 84-85.
Hoyle, R. B., and J. C. Robinson. 2002. League tables and school effectiveness: A mathematical model. The Royal Society 270:113-19.
Johnson, A. P. 2006. No child left behind: Factory models and business paradigms. The Clearing House 80 (1): 34-36.
Kupermintz, H. 2003. Teacher effects and teacher effectiveness: A validity investigation of the Tennessee Value Added Assessment System. Educational Evaluation and Policy Analysis 25 (3): 287-98.
Lasley, T. J., D. Siedentop, and R. Yinger. 2006. A systemic approach to enhancing teacher quality. Journal of Teacher Education 57 (1): 13-21.
Law, N. 1999. Value-added assessment and accountability. Thrust for Educational Leadership 28 (3): 28-31.
Mahoney, J. W. 2004. Why add value in assessment? A pilot project in Ohio gains interest in value-added measurements of individual students. School Administrator 61 (11): 16-18.
McCombs, J. S., S. N. Kirby, H. Barney, H. Darilek, and S. Magee. 2004. Achieving state and national literacy goals: A long uphill road. Santa Monica, CA: Rand.
McMillan, J. H. 1988. Beyond value-added education: Improvement alone is not enough. Journal of Higher Education 59 (5): 564-79.
Medina, J. 2008. New York measuring teachers by test scores. New York Times, January 21.
Misco, T. 2007. Did I forget about the dispositions? Preparing high school graduates for moral life. The Clearing House 80 (6): 267-70.
Porter, A. C. 1994. National standards and school improvement in the 1990s: Issues and promise. American Journal of Education 102 (4): 421-49.
Schaeffer, B. 2004. Districts pilot value-added assessment: Leaders in Ohio and Pennsylvania are making better sense of their school data. School Administrator 61 (11): 20-24.
Schagen, I., and D. Hutchison. 2003. Adding value in educational research: The marriage of data and analytical power. British Educational Research Journal 29 (5): 749-65.
Strand, S. 1997. Pupil progress during key stage 1: A value added analysis of school effects. British Educational Research Journal 23 (4): 471-87.
Wiliam, D. 1992. Value-added attacks: Technical issues in reporting national curriculum assessments. British Educational Research Journal 18 (4): 329-41.
Thomas Misco, PhD, is an assistant professor of social studies education in the Department of Teacher Education at Miami University, Oxford, Ohio.
Copyright (c) 2008 Heldref Publications
Copyright Heldref Publications Sep/Oct 2008
(c) 2008 Clearing House, The. Provided by ProQuest LLC. All rights Reserved.