Effectiveness of a Highly Rated Science Curriculum Unit for Students With Disabilities in General Education Classrooms

Posted on: Wednesday, 10 January 2007, 03:00 CST

By Lynch, Sharon; Taymans, Juliana; Watson, William A; Ochsendorf, Robert J; Et al

ABSTRACT:

This research is part of a study on scaling-up middle school science curriculum units in a large, diverse public school system. Chemistry That Applies (CTA), a guided inquiry unit based on conceptual change theory and highly rated according to the Project 2061 Curriculum Analysis, was implemented in five middle schools matched demographically with five comparison schools (N = 2,282 students). Eighth grade CTA students outscored their peers overall and when data were disaggregated, with small to medium effect sizes. Of particular interest are students with disabilities in general education science classrooms (n = 202 students with complete assessment records). Those who used CTA significantly outscored their comparison peers on the posttest, with a small to medium effect size.

Scaling Up Curriculum for Achievement, Learning, and Equity Project (SCALE-uP) is a collaborative research endeavor between researchers at the George Washington University and Montgomery County Public Schools (MCPS), Maryland. SCALE-uP is funded by the Interagency Education Research Initiative (IERI) and administered by the National Science Foundation (NSF). The goal of SCALE-uP is to introduce three different reform-based science curriculum units that have relatively high ratings on specified instructional attributes to middle school students and to study their scale-up in a large and diverse school district. Scaling-up is traditionally defined as the deliberate expansion to many settings of an externally developed intervention that has previously been used successfully in one or a small number of school settings (Coburn, 2003). The units under investigation were developed by three different reform-oriented science education organizations, rather than by SCALE-uP.

SCALE-uP's study of each science curriculum unit has two stages that take a total of 4 years to complete. In the first stage, a unit was introduced to MCPS students in five demographically diverse middle schools, and learning outcomes were compared to those of students in five matched comparison schools to determine the unit's effectiveness. The following year, the study was replicated with the same set of schools. These stage one explorations were called implementation studies; their purpose was to test the effectiveness of a curriculum unit overall, but also to determine if it had differential impact on various subgroups of students in the sample. This allowed an evaluation of the curriculum unit sensitive to the needs of diverse learners. Although each unit selected for this study had been field-tested by its developers and is in extensive use in the United States, none were ever evaluated using comparative methods that relied on experimental or quasi-experimental designs.

If a highly rated curriculum unit was found to be effective overall on outcome measures and when the data were disaggregated by gender, ethnicity, and eligibility for Free And Reduced-price Meals System (FARMS), English for Speakers of Other Languages (ESOL), and special education services, it proceeded to the second scale-up stage. In the 3rd year of the study, the unit was systematically introduced to students at 15 more schools and then scaled-up for all 37 MCPS middle schools in the 4th year of study.

This article reports on the results of a quasi-experimental study of the implementation of one of the units, Chemistry That Applies (CTA; State of Michigan, 1993), during the second year (i.e., replication with the same set of schools during stage one). In this article, we examine the results for students with disabilities, as well as other subgroups of students. We think that a focus on the impact of CTA for students with identified disabilities is especially important, given the longstanding differences of opinion about the best ways to teach science to these students (e.g., Mastropieri & Scruggs, 1992).

THE CHALLENGE OF INVESTIGATING THE EFFECTIVENESS OF SCIENCE CURRICULUM MATERIALS FOR STUDENTS WITH DISABILITIES

HOW DO STUDENTS WITH DISABILITIES LEARN SCIENCE?

Students with one or more identified physical, cognitive, emotional, or other types of disabilities compose 11.5% of the estimated K-12 student population in the United States (U.S. Department of Education, 2002). Students with disabilities generally do not perform as well in science as their peers with no identified disabilities. They receive lower grades in science (Cawley, Kahn, & Tedesco, 1989), and they score significantly lower in mathematics on the National Assessment of Educational Progress (National Center for Education Statistics, 2003).

Why do students with disabilities have difficulty with science learning? It is likely that such students have a number of different cognitive processing or memory problems that lead to trouble with reading (word recognition or comprehension) or mathematics or the ability to link ideas in chains of reasoning. Depending on the individual student and the nature of the science classroom in which he or she is asked to perform, disabilities manifest themselves in different ways. A student with reading decoding problems may have little difficulty understanding abstract science concepts once they are made available in a classroom that does not depend mainly on reading textbooks, whereas other students with disabilities may have difficulty making sense of the abstractions and connections of science concepts no matter the modality in which they are presented.

Fletcher, Coulter, Reschly, and Vaughn (2004) point out that severe underachievement in an academic subject such as science may be the best way of determining who is eligible for special services in that subject. They also note that learning disabilities, for instance, run on a continuum and are variations on normal development, with no natural demarcation points that differentiate persons with a learning disability from others. If this is so, then cognitive models that have been developed for understanding how people learn science, in general, seem applicable for students with disabilities. Thus, conceptual change models of science learning that form the basis for the instruction delivered in this study (and discussed next) should be effective for students with disabilities, as well as other students not so identified.

CONCEPTUAL CHANGE THEORY

Conceptual change theory is a contemporary model of science learning that holds considerable currency among cognitive psychologists and science educators alike. Studies of differences in how scientists and novices think about physical phenomena, organize information, and solve problems (Bransford, Brown, & Cocking, 1999) suggest strongly that science curriculum and instruction must help students make an expert-novice shift. Students' minds are not devoid of ideas and explanations about the physical world and how it works; rather, these novices hold strong views that may not be at all consistent with those of a science expert. Conceptual change occurs when students move from naive, and sometimes tenaciously held, ideas to more scientifically accurate ones. Direct experiences with physical phenomena, alongside the opportunity to make sense of conflicting ideas, reflect on their experiences, and discuss their views with other students, create fertile conditions for conceptual change (Driver, Asoko, Leach, Mortimer, & Scott, 1994; Driver, Guesne, & Tiberghien, 1985; Posner, Strike, Hewson, & Gertzog, 1982; Strike & Posner, 1992).

SCALE-uP research, which draws on conceptual change theory, is based on these premises:

* Conceptual knowledge is taught and learned in schools.

* Curriculum materials influence instruction and produce a unique effect on the learning of concepts; students' prior conceptions interact with the curriculum materials, affecting new learning or changes in conceptual knowledge.

* Although students with disabilities have a range of cognitive, physical, and emotional functioning, there is no reason to think that students with disabilities learn science differently than other students; prior assumptions about conceptual change apply to these students, as well.

Although conceptual change theory has a substantial research base (cf., chapter 15 of Benchmarks for Science Literacy, American Association for the Advancement of Science, AAAS, 1993), there is less agreement about what this implies for best practice in science curriculum and instruction in general, as well as for students with disabilities. These ideas are explored in the next sections.

HOW BEST TO TEACH SCIENCE TO STUDENTS WITH DISABILITIES

Given the array of cognitive (and perhaps physical and emotional) disabilities that may have an impact on how students learn science, as well as the range of performances that may be demanded by different science curricula in schools across the United States, it is hardly surprising that there has been a great deal of controversy about the best ways to teach students with disabilities. The most recent sets of science standards, the National Science Education Standards (National Research Council, 1996) and Benchmarks for Science Literacy (AAAS, 1993), present new challenges for students with disabilities. These documents call for science literacy for all students. Students are expected to understand and apply science concepts in real life situations, as well as to perform processes inherent to science, from measuring and estimation to sustained reasoning through scientific inquiry.

There is some research that describes the kinds of science experiences that help students with disabilities learn science concepts, as they also increase their process skills. Students with disabilities who use hands-on or activity-based curriculum materials have been shown to understand and retain science concepts more fully than peers who learn from text-based approaches (e.g., Holahan & DeLuca, 1993; Mastropieri & Scruggs, 1994; Scruggs & Mastropieri, 1993). However, Dalton, Morocco, Tivnan, and Rawson Mead (1997) suggested that simply providing access to manipulation of materials is not sufficient to bring about conceptual change. They found that students with identified learning disabilities in inquiry-centered classrooms demonstrated greater conceptual change than their peers in hands-on classrooms. They learned best when inquiry-centered curriculum materials were supported by the teacher, especially via active questioning techniques.

Palincsar, Magnusson, Collins, and Cutter (2001), using a longitudinal design experiment, found that guided inquiry curriculum materials provided opportunities for students with disabilities to experience conceptual change. These researchers also interpret the enactments of guided inquiry curriculum materials to include participation in a learning community in which expertise is distributed. Students experienced varied ways of communicating what they learned, multiple cycles of investigation, and opportunities to engage in problem solving through lab activities. In these studies, students with disabilities demonstrated substantial conceptual change in mainstream classrooms when inquiry-based curriculum materials were used to teach complex conceptual ideas such as electric circuits or sinking and floating. Palincsar et al. also found that teachers could add to the efficacy of the curriculum units when they used instructional strategies that facilitated student thinking, supported print literacy, and helped students with disabilities to operate more effectively in their lab groups. The work of Palincsar et al. and Dalton et al. (1997) suggests that guided-inquiry curriculum materials facilitate conceptual change for students with disabilities.

Few studies, however, have employed designs that can provide evidence for the effectiveness of specific interventions for students with disabilities (cf., Odom et al., 2005). There seems to be little agreement on how best to teach science to students with disabilities in general, or for subgroups with specific disabilities. This may be due to the array of possible independent variables that alone or in combination constitute an intervention. These kinds of questions may be asked of science instruction intervention studies:

* Are the students asked to learn procedural knowledge, such as how to do inquiry, or is the object of instruction declarative knowledge, such as how matter is conserved in a chemical reaction (e.g., McCleery & Tindal, 1999)?

* Is the intervention focused on curriculum materials, is it focused on instruction delivered by the teacher, or does it consist of a combination of curriculum materials and instructional moves (e.g., Cawley, Hayden, Cade, & Baker-Kroczynski, 2002)?

* Does the intervention involve hands-on learning with multimodal experiences or is it primarily text-based (e.g., Mastropieri & Scruggs, 1994)?

* Is the intervention characterized as inductive, in which some aspects of student inquiry are involved, or is it more deductive and akin to direct instruction (e.g., McCarthy, 2005)?

* Is the duration of the intervention brief such as a day or week, or does it extend over weeks, months, or years (e.g., Mastropieri & Scruggs, 1997; Palincsar et al., 2001)?

The research literature tends to use constructs that are not always well defined, leading to ambiguity about what is actually included in the intervention. Examples include terms such as hands-on, inquiry, discovery, inductive methods, explicit instruction, direct instruction, or activity-based learning. The lack of specificity in defining constructs makes comparisons across studies difficult. In addition, some studies quite clearly provide a program theory or conceptual framework that explains how the intervention is intended to work for students, whereas other studies leave this to the imagination of the reader. The instructional features of CTA, the curriculum unit investigated in this study, are more fully described in the next section.

WHAT KINDS OF CURRICULUM MATERIALS ARE MOST LIKELY TO SUPPORT CONCEPTUAL CHANGE?

The AAAS Project 2061 recognized the need to identify effective curriculum materials that would lead students to learn the science concepts contained in Benchmarks for Science Literacy (AAAS, 1993). Consequently, Project 2061 developed a Curriculum Analysis procedure (Kesidou & Roseman, 2002; Roseman, Kesidou, & Stern, 1996) to evaluate and rate a range of curriculum materials in K-12 science and mathematics. (AAAS, 2004, makes its reports on curriculum materials available at: http://www.project2061.org/publications/articles/textbook/default.html.) The Project 2061 Curriculum Analysis used teams of trained raters, who are experts in science and/or science education, to first determine whether a set of curriculum materials targeted specified standards or benchmarks (i.e., was clearly focused on a "big idea"). Second, the Analysis teams used a set of 22 criteria in seven categories to identify instructional attributes of the written curriculum materials (both teacher and student guides) that would support effective student learning. The Project 2061 Curriculum Analysis inquires if a curriculum unit starts from ideas that are familiar or interesting to children; explicitly conveys a sense of purpose; takes into account student ideas and conveys suggestions for teachers to find out what their students think about phenomena related to the benchmark; provides for first-hand experiences with phenomena; and has students represent their own ideas about phenomena and practice using the acquired knowledge and skills in varied contexts (Kesidou & Roseman; Roseman, Kesidou, & Stern). These criteria seem to be at the heart of effective instruction for diverse learners (Lynch, 2000). They imply recognition of individual differences and the need to provide students with explicit goals, hands-on experiences, and teacher modeling or scaffolding of scientific reasoning.

Figure 1 contrasts the results of Project 2061's Curriculum Analysis for the CTA curriculum materials (the focus of this study) with the results for a typical middle school science text published by Prentice-Hall, Exploring Physical Science (Maton et al., 1997). On the left side of Figure 1 are the seven categories included in the Curriculum Analysis, with the associated criteria. If the print materials for a curriculum unit provide specific evidence that these instructional characteristics are present, it is given a rating, displayed as shaded circles in Figure 1. For instance, the second category of the Curriculum Analysis, Taking Account of Student Ideas, evokes the theory of conceptual change; CTA gets a good rating, whereas Exploring Physical Science does not meet this criterion.

CTA was one of the very few curriculum materials in science to have been highly rated in the Curriculum Analysis; consequently, it was chosen for SCALE-uP research. This is crucial to understanding the broader goal of SCALE-uP. Although these investigations overtly target a curriculum unit such as CTA for study, the underlying intent is to explore whether curriculum materials that have the instructional qualities identified by Project 2061 are effective. Moreover, are they effective for all subgroups of students, including students with disabilities? In a sense, then, SCALE-uP is ultimately an inquiry into the scale-up of Project 2061's Curriculum Analysis criteria through studies of curriculum materials that reflect the criteria.

RESEARCH QUESTIONS

This article reports on the second year of implementation of CTA, a conceptual change curriculum unit highly rated according to the Project 2061 instructional criteria. (For detailed information on the results of the first year's work, see Lynch, Kuipers, Pyke, & Szesze, 2005, but note that those results did not include disaggregated data for students with disabilities.) The research questions for the implementation study of CTA are:

* Does a highly rated conceptual change curriculum unit result in higher mean scores on outcome measures for students in the treatment group compared with those in the comparison group?

* Does disaggregating outcome data (based on demographic characteristics) and testing for interactions between subgroups and the curriculum conditions reveal important patterns not captured in the reporting of mean scores of treatment and comparison groups? Specifically, do students with disabilities in general education classrooms using CTA, a highly rated conceptual change curriculum unit, have higher outcome scores than students with disabilities in comparison general education classrooms?

FIGURE 1

Criterion-Level Ratings for Chemistry That Applies According to the Project 2061 Instructional Analysis

METHODS

DESIGN

As part of the larger SCALE-uP research program, this implementation study employs a comparative design intended to test for differences in outcomes between equivalent groups. The intent is to infer that any differences that are found are the result of students' different experiences with curriculum materials. Given that this public school context could not accommodate random assignment of large numbers of individual students to different curriculum conditions for an intervention of relatively short duration (approximately 6 weeks), SCALE-uP used carefully demographically matched pairs of schools for random assignment to treatment or comparison condition. The sampling procedure is described in the next section. This article examines outcomes for students with disabilities and discusses the inferences that can be made about this subgroup.

POPULATION AND SAMPLE

Sampling Procedures. MCPS is a large Maryland school district (136,000 students) located in the Washington, DC, metropolitan area. MCPS consistently occupies a position among the top-performing school districts in the state of Maryland. Its student population is rapidly becoming more diverse culturally, linguistically, and socioeconomically. By 1999, the student population had no ethnic majority, and African American, Hispanic American, and Asian American subpopulations are growing at faster rates than the White subpopulation. About 13% of the middle school student population is eligible for special education services.

The population for this study is students in MCPS middle schools. However, in this study, SCALE-uP intentionally oversampled from among the most diverse middle schools in the district; demographic diversity included gender, ethnicity, and eligibility for FARMS, ESOL, and special education services. The goal was to produce two equivalent samples with enough students to provide enough power for tests of statistical significance on comparisons for disaggregated subgroups. Five pairs of diverse middle schools, carefully matched on demographic characteristics, were selected. Schools in each pair were randomly assigned to the treatment (CTA) or comparison condition. Choosing the most diverse schools somewhat reduces the ability to generalize this sample to the entire school district, but improving outcomes for diverse student subgroups was an important incentive for participating in this research for MCPS.
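
The pair-wise random assignment described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the procedure SCALE-uP actually ran: the school labels, the function name `assign_matched_pairs`, and the seed are all hypothetical, and the article does not say how the randomization was carried out.

```python
import random

def assign_matched_pairs(pairs, seed=None):
    """Randomly put one school of each matched pair in the CTA condition
    and the other in the comparison condition."""
    rng = random.Random(seed)
    assignment = {}
    for school_a, school_b in pairs:
        treated = rng.choice((school_a, school_b))
        comparison = school_b if treated == school_a else school_a
        assignment[treated] = "CTA"
        assignment[comparison] = "comparison"
    return assignment

# Hypothetical labels for five demographically matched pairs;
# the article does not name the schools.
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2"), ("E1", "E2")]
assignment = assign_matched_pairs(pairs, seed=0)

for school, condition in sorted(assignment.items()):
    print(school, condition)
```

Randomizing within pairs, rather than across all 10 schools at once, guarantees that each condition receives exactly one school from every matched pair, so the demographic matching carries over to the two samples.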

Students With Disabilities in the Study Sample. The centralized MCPS database codes students with disabilities according to their eligibility for special education services (Never eligible, Prior eligible, Now eligible). If eligible, other codes specify the number of hours of special education services they receive per week (Less than 15 hr, More than 15 hr). The centralized database does not include categorical data on the type of disability, information that is maintained separately in a different special education database. Nor does the centralized database contain information on the needs of students with disabilities specific to science learning or distinguish whether these students take science in self-contained classrooms or general education classrooms. This distinction is important because students with disabilities in general education classrooms receive their instruction from teachers who are licensed to teach middle school science, attached to science departments, and are expected to attend science professional development workshops provided by the district. Teachers of self-contained special education classes are special educators who in most cases are not also licensed to teach science. They affiliate with special education departments and are invited, but not obligated, to attend professional development in science.

In order to create more sensible categories for students with disabilities for the present study, first, all students who had once been eligible for special education services but who were no longer (about 250 students) were moved into the category not eligible for special education services. These students' codes became indistinguishable from students in general education classrooms who were never eligible for special education services, for the purpose of these analyses. Second, all students currently eligible for special education services were combined into one category, no matter the number of hours of service they receive. Third, a new data code was created that indicated whether students currently eligible for special education services took their science instruction in general education classrooms or self-contained special education classrooms. Fourth, students who took their science in self-contained special education classrooms (about 150 students) were eliminated from the database.
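
The four recoding steps can be expressed as a small function. The sketch below assumes a hypothetical record layout (the keys `eligibility`, `hours`, and `setting` and their values are invented for illustration); the actual MCPS database fields are not given in the article.

```python
def recode_student(record):
    """Return the analysis category for one student, or None if the
    student is dropped from the analysis database.

    `record` is a hypothetical dict with keys:
      eligibility: "never", "prior", or "now"
      hours:       "lt15", "gt15", or None (ignored after recoding)
      setting:     "general" or "self_contained"
    """
    # Step 1: prior-eligible students are treated as not eligible.
    if record["eligibility"] in ("never", "prior"):
        return "not_eligible"
    # Step 2: all currently eligible students share one category,
    # so the hours-of-service code is deliberately not consulted.
    # Steps 3 and 4: the setting code separates classroom types, and
    # students in self-contained classrooms are removed.
    if record["setting"] == "self_contained":
        return None
    return "disability_general_ed"

# Made-up records, one per case the recoding rules distinguish.
students = [
    {"id": 1, "eligibility": "never", "hours": None, "setting": "general"},
    {"id": 2, "eligibility": "prior", "hours": None, "setting": "general"},
    {"id": 3, "eligibility": "now", "hours": "lt15", "setting": "general"},
    {"id": 4, "eligibility": "now", "hours": "gt15", "setting": "general"},
    {"id": 5, "eligibility": "now", "hours": "gt15", "setting": "self_contained"},
]
analysis_sample = {
    s["id"]: recode_student(s) for s in students if recode_student(s) is not None
}
print(analysis_sample)
```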

The removal of these students from the database was done for a practical reason. Given that this was an implementation study, there was no way to ascertain that students in self-contained classrooms actually received the treatment. Many teachers of students with disabilities in self-contained classrooms did not attend the professional development associated with CTA or reported that they could not implement CTA with fidelity, given the needs of their students. Some of these special educators seemed to find the content of CTA demanding for themselves or their students. Some were hesitant about the amount of laboratory preparation required or monitoring lab safety for their students. It is fair to say that teachers with little background preparation in science might find a curriculum unit with this much hands-on chemistry very challenging, unless they had day-to-day, in-school support when implementing it. (Special education teachers who found the professional development and curriculum materials useful and accessible were encouraged to remain involved with the study.) In summary, a study of a treatment like CTA for students with disabilities in self-contained classrooms would require a different approach and scope than SCALE-uP was able to provide. On the other hand, one can be confident that the present study has validity for students with disabilities in general education classrooms. Of the students with disabilities in general education classrooms (n = 277), 202 (72%) had complete data sets (pretest and posttest). Table 1 shows that 99 were in the comparison condition and 103 had the CTA treatment.

TABLE 1

Total Number of Students With Disabilities in Each Demographic Subgroup for the Comparison and Treatment Conditions

In order to establish the approximate equivalence of students with disabilities in the treatment and comparison conditions who were taking their science in general education classrooms spread across 10 middle schools, we provide several comparisons. Table 1 shows that students with disabilities in the treatment and comparison conditions were demographically similar by gender, ethnicity, FARMS, and ESOL status.

Table 2 provides summary statistics that indicate that their mean scores on four subtests of the Comprehensive Test of Basic Skills (CTBS), administered during Grade 6, were similar. Four separate one-way ANOVAs revealed no significant differences between students with disabilities in general education classrooms for treatment and comparison conditions for all four subtests, suggesting that students with disabilities in the treatment and comparison conditions had approximately equal academic abilities and achievement prior to this study (see Table 2).
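
For readers unfamiliar with the equivalence check, a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic by hand for two groups. The scores are made up for illustration and are not the CTBS data from Table 2; the function name `one_way_anova_f` is likewise hypothetical.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up subtest scores, NOT the study's data.
comparison = [610, 625, 640, 655, 670]
treatment = [615, 630, 645, 650, 665]

f_stat, df_b, df_w = one_way_anova_f(comparison, treatment)
print(f"F({df_b}, {df_w}) = {f_stat:.4f}")
```

An F value near zero, as these invented scores produce, means the group means differ far less than individual scores vary within each group, which is the pattern a "no significant difference" result reflects. In practice one would repeat the test for each of the four subtests and compare each F with the critical value for its degrees of freedom.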

In addition, the pretest scores of these two groups on the science assessment instrument developed for this study were approximately equal (see Table 3, which will be discussed in more detail in the sections that follow). In summary, the profiles of treatment and comparison subgroups of students with disabilities in general education classrooms were similar, and inferences made about differences in outcomes can be attributed with some confidence to the CTA intervention.

CHARACTERISTICS OF THE TREATMENT INTERVENTION: CHEMISTRY THAT APPLIES (CTA)

Program Theory. The set of curriculum materials chosen for this study was Chemistry That Applies (State of Michigan, 1993). CTA's design was based on conceptual change theory (A. Anderson, personal communication, January 8, 2001), and if implemented with fidelity is congruent with the type of science instruction recommended in the National Science Education Standards (National Research Council, 1996). The unit is student-centered, hands-on, and phenomenon-based. It is a guided inquiry unit (T. Blakeslee, personal communication, October 22, 2004). Guided inquiry is characterized by a balance between student self-direction and teacher direction that helps students to learn to collect data to answer a question, form explanations about phenomena based on the data collected, and communicate connections drawn between their explanations and scientific knowledge (National Research Council, 2000).

TABLE 2

Comprehensive Test of Basic Skills (CTBS) Mean Scores and Standard Deviations for Students With Disabilities in the Comparison and Treatment Conditions

CTA was designed for students in Grades 8 through 10 and takes approximately 6 to 10 weeks to complete. It focuses on only one challenging benchmark on conservation of matter from the Benchmarks for Science Literacy (AAAS, 1993):

No matter how substances within a closed system interact with one another, or how they combine or break apart, the total weight of the system remains the same. The idea of atoms explains the conservation of matter: If the number of atoms stays the same no matter how they are rearranged, then their total mass stays the same. (p. 79)

There are two important ideas in this benchmark. The first idea (weight remains the same in a closed system in a chemical reaction) can be learned by experiences with a variety of physical and chemical phenomena, noting changes (or lack thereof) in weight in closed and open systems, before and after chemical reactions. CTA provides such experiences. The second idea addressed by the benchmark, however, requires students to visualize how atoms, particles that cannot be seen, can explain why the weight (mass) would not change in a closed system. This is an extension of a classic conservation schema, in a Piagetian sense (Ginsberg & Opper, 1979), and CTA provides students with opportunities to understand it by building physical models (using balls and sticks) that show how atoms are rearranged in chemical reactions.
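
The second idea can be made concrete with a short atom-counting sketch. The example below uses the balanced equation for burning butane, 2 C4H10 + 13 O2 -> 8 CO2 + 10 H2O, chosen here only as an illustration; the article does not list the specific equations students model. The atomic masses are approximate standard values, and the helper names are hypothetical.

```python
from collections import Counter

# Approximate standard atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def count_atoms(side):
    """Total atoms per element for one side of an equation.

    `side` is a list of (coefficient, {element: atoms per molecule}).
    """
    total = Counter()
    for coefficient, formula in side:
        for element, n in formula.items():
            total[element] += coefficient * n
    return total

def total_mass(atom_counts):
    """Mass of a collection of atoms, in g per mole of reaction."""
    return sum(ATOMIC_MASS[e] * n for e, n in atom_counts.items())

# 2 C4H10 + 13 O2 -> 8 CO2 + 10 H2O
reactants = [(2, {"C": 4, "H": 10}), (13, {"O": 2})]
products = [(8, {"C": 1, "O": 2}), (10, {"H": 2, "O": 1})]

left = count_atoms(reactants)
right = count_atoms(products)
print(dict(left), dict(right))
print(round(total_mass(left), 3), round(total_mass(right), 3))
```

Because the same atoms appear on both sides, only regrouped into different molecules, the two mass totals must agree. This is the reasoning the ball-and-stick models are meant to make visible.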

CTA's Structure. CTA is a structured curriculum unit consisting of 24 lessons divided into four clusters. (Only three clusters were taught in this study because the fourth cluster on catalysts was beyond the scope of the MCPS eighth grade science curriculum framework.) To acquire an understanding of the conservation of matter, students explore the same four chemical reactions with increasing sophistication as the unit progresses. The chemicals chosen are materials likely to be familiar to students, such as steel wool, butane lighters, and Alka-Seltzer. In lab groups, students experience the chemical phenomena first-hand qualitatively, then proceed to quantitative observations by weighing the systems before and after reactions occur. Eventually they create molecular models of the now familiar chemical reactions.

CTA's first cluster, Describing Chemical Reactions (three lessons), asks students to describe the changes that occur when various household substances are mixed together. The purpose is to eventually recognize chemical changes. The second cluster consists of eight lessons about Weight Changes in Chemical Reactions. Students make predictions about whether the weight will change in a series of physical and chemical reactions, in open and closed systems, and test predictions in lab activities. In the third cluster, Molecules and Atoms (six lessons), students build physical models of the atoms that rearrange in the now familiar chemical reactions, which helps them to "see" how the number of atoms is conserved as new molecules are formed.

TABLE 3

Conservation of Matter Assessment Pretest and Posttest Mean Scores and Standard Deviations for Each Demographic Subgroup in the Comparison and Treatment Condition

Lessons may take from 50 to 150 min to implement, so one lesson can stretch over several days. Students would need to do the 18 lessons in sequence to see CTA's logic, building from direct experiences with physical phenomena. CTA provides a Student Guide (to be copied from a master), which consists of 87 pages for Clusters 1 through 3. The pages are not densely printed, however, and are probably less daunting to students than a typical science textbook. The Guide provides directions to students for procedures and prompts for predictions, data collection, and interpretation. Explanations for the phenomena are embedded in the Guide rather than in a separate text, so students have rich context from which they can build understanding. The reading in the Student Guide is as much procedural as it is explanatory, and complicated vocabulary is kept to a minimum. The Guide asks students to record data, make tables, and write their interpretations. There are no worksheets provided, but some teachers in this study created worksheet-like materials for their students to use, based on the requirements of each lesson.

CTA also has a "wraparound" Teacher Guide where instructions for teachers are provided in the margins surrounding the corresponding student text for each lesson. The Teacher Guide not only provides answers to embedded questions for students and clarifies lab procedures, but it also provides pedagogical coaching to the teacher based on CTA's program theory of conceptual change. Teachers are sometimes encouraged not to provide answers prematurely to students, but rather to allow them to reason from the phenomena observed. The Teacher Guide also asks teachers to probe student thinking before an activity begins and to revisit the thinking later, consistent with conceptual change theory. The Teacher Guide provides detailed information about preparing the lab materials necessary for doing the lessons. Materials are primarily household chemicals and can be readily obtained. However, a teacher would need to be well organized in obtaining the substances in sufficient quantity and managing their use. Copies of two CTA lessons as they appear in the Teacher Guide are available on the SCALE-uP Web site at www.gwu.edu/~scale-up.

CTA's Lesson Structure. Every lesson in CTA uses the same, predictable format. For instance, Lesson 5, Gathering Evidence About Weight Experiments, begins by prompting the students to think about what they discussed in the previous lesson. Next is the Key Question, which is carefully set off from the rest of the text and has a key icon. The Key Question in Lesson 5 is, "How do your predictions compare to the actual changes in the weights of substances?" (State of Michigan, 1993, p. 21). A shaded box labeled What You Will Need provides a list of the materials required for each lesson. The procedures are in a section labeled Try This, set apart by a bubbling beaker icon. After conducting their experiments, students are directed to a pencil icon with a Think and Write label to interpret their data.

In Lesson 5, the student groups begin by designing a plan to test the predictions they made in the previous lesson. They will weigh materials and collect data on the weights before and after physical changes. The Student Guide encourages students to consider how their procedures might make their measurements more or less accurate. Groups of students carry out their procedures and collect data about weight changes for four different physical phenomena (steel wool balled up and spread apart, ice melting in a glass of water, sugar dissolving in water, and water boiling) that they observe (in the Try This section). Once data are collected, the Think and Write poses six questions for organizing and interpreting the data. The Teacher Guide suggests that the teacher "provide plenty of discussion time while students construct new ideas" (State of Michigan, 1993, p. 22). Students should discuss all steps in the lesson in their lab groups, with some wrap-up directed by the teacher for the entire class.

The Student Guide also provides an explanation for one experience, the weight change that occurred when water is boiled, as "scientists would explain" it. This scaffolds the understanding of scientifically accepted explanations for students, as it helps move them toward grasping the target idea for the unit. Lesson 5 also sets the stage for the chemical changes that students will explore in Lesson 6. Increasingly complex experiences with the same reactions help build a deeper understanding of conservation of matter as students work through the unit.

Another, perhaps more comprehensive, view of the CTA curriculum materials can be found in Figure 1. Figure 1 illustrates the instructional features that were located in the Student and Teacher Guides for CTA. A close look at Project 2061's criteria for instructional strategies contained in curriculum materials shows that there are many more important features of CTA that were not captured in the descriptions previously mentioned. Figure 1 also shows that a traditional science textbook does not appear to support as many effective instructional strategies as CTA.

FIDELITY OF IMPLEMENTATION

MCPS has a strong middle school science administrative structure, with department chairs responsible for teachers' follow-through on administrative guidance. Teachers in the treatment schools were directed by MCPS science supervisors to implement CTA with fidelity for this implementation study. Each teacher implementing CTA was visited by trained observers from the MCPS evaluation staff and by SCALE-uP researchers for at least one lesson. The observers used an instrument developed by SCALE-uP that prompted them to look for and rate the frequency of 20 characteristics of implementation derived from the Project 2061 Curriculum Analysis during a lesson's enactment in treatment classrooms. In addition, SCALE-uP researchers met with implementing teachers midway through the unit and after its completion to discuss various aspects of the unit. Teachers demonstrated sound knowledge of CTA. This substantiates that CTA was implemented by all teachers in the treatment group with levels of fidelity that ranged from acceptable to high.

CHARACTERISTICS OF CURRICULUM MATERIALS IN THE COMPARISON CONDITION

In the present study, schools with students who received CTA were matched with five comparison schools whose students did not receive CTA. Instead, teachers implemented curriculum materials from an extensive menu of approved options. These materials ranged from traditional eighth grade science textbooks from companies such as Prentice-Hall, to NSF-funded, reform-based, innovative curriculum units on chemistry. Teachers in MCPS middle schools tend to use combinations of materials, rather than relying on a single alternative resource. The school district does not, however, keep records of which materials are used, except for monitoring the approved list. MCPS had a Local Systemic Change grant focused on middle school science prior to SCALE-uP that exposed teachers to reform-based instructional practices and curriculum materials. However, none of these materials had been evaluated with experimental designs, so their effectiveness was unknown. Professional development meeting minutes indicated that a wide variety of materials were in use, but no one set of curriculum materials was provided to all middle schools with the exception of a textbook series by Prentice-Hall.

All eighth grade science teachers, both treatment and comparison, were required to follow the State of Maryland's curriculum framework and indicators, as well as local standards. They taught the chemistry unit during the same quarter and routinely attended professional development meetings on reform-based instructional strategies. However, for comparison teachers, there was no way of ascertaining the curriculum materials used during this study, beyond saying that they were not using CTA (see Lynch, Szesze, Pyke, & Kuipers, in press, and Lynch, 2006, for a more extensive discussion).

THE CONSERVATION OF MATTER ASSESSMENT (COMA)

Development of the Instrument. Student understanding of the target benchmark was measured by outcomes on the Conservation of Matter Assessment (COMA). A new assessment development procedure was established in collaboration with AAAS Project 2061 to design COMA and ascertain its validity (AAAS, 2003a; AAAS, 2003b; Stern & Ahlgren, 2002). The procedure relied on an assessment development team that included science content experts, science educators, science teachers, and assessment specialists. The team developed a cohesive set of assessment items that focus on the conservation of matter benchmark. The items use language and illustrations that allow the assessment to be read and understood by a maximum number of students (Pyke & Ochsendorf, 2004).

COMA was constructed so that students taking the assessment think about four different physical/chemical phenomena as they respond to multiple probes of their knowledge of the conservation of matter. The phenomena encountered on the assessment are all closely related to the target benchmark but are designed to be independent of curriculum materials. (COMA was not directly linked to lessons in CTA or comparison curriculum materials.) For example, students are asked to consider what happens when someone spills perfume and it turns from a liquid into a gas: Is the number of atoms more, less, or the same when the perfume changes from liquid to gas? If the person could collect the atoms after they turned into a gas, would the total mass be more, less, or the same as it was before the perfume was spilled? This example asks students to select one answer from possible alternatives that will indicate whether or not they understand that matter is conserved, and that atoms explain the conservation. To further gauge students' understanding of the benchmark, COMA also includes items that ask students to construct their own written response explaining why they selected a particular answer. COMA has six selected response items and four constructed response items for student explanation.

Scoring Assessment Data. During the piloting of COMA, the assessment development team used pilot data to determine categories for scoring student responses that indicate scientifically appropriate understanding or alternative conceptions. Next, a small group of science educators, science teachers, and chemistry graduate students were trained to use the categories to rate all the assessments. Approximately 2% of the assessments were scored by multiple raters for reliability purposes, and across all pairs of raters, inter-rater reliability Kappa scores ranged from .70 to .81. A Kappa greater than .7 is considered satisfactory (Howell, 1997).
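
The agreement statistic reported above can be illustrated with a minimal sketch. The article does not state which Kappa variant was computed, so the sketch assumes Cohen's kappa for a single pair of raters; the function name, the category labels, and the ten ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who categorized the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    n = len(rater_a)
    # Observed agreement: share of responses placed in the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings of ten responses (illustration only).
rater_1 = ["correct", "misconception", "correct", "blank", "correct",
           "misconception", "correct", "correct", "misconception", "blank"]
rater_2 = ["correct", "misconception", "correct", "blank", "misconception",
           "misconception", "correct", "correct", "correct", "blank"]

kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2))  # 0.68: observed agreement 0.80, chance agreement 0.38
```

In this made-up example the raters agree on 8 of 10 responses, yet kappa is only about .68 because a sizable share of that agreement would be expected by chance.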

An expert panel including science educators and an assessment expert determined the weighting scheme for combining the 10 items into a single score. According to the weighting scheme, students' understanding that matter is conserved contributes approximately 60% of the total score, and their understanding that atoms explain this phenomenon contributes approximately 40% of the total score. The weighting scheme balances the contribution of student-selected and student-generated items in the total score, transforming the items into a range of scores from 0 to 100. The expert panel that developed the weighting scheme analyzed information from sample student responses, and set cut scores to distinguish four levels of student understanding of the conservation of matter benchmark:

* Scores of 0 to 23 indicate no consistent evidence of understanding the benchmark ideas.

* Scores of 24 to 50 indicate some evidence of understanding in specific contexts.

* Scores of 51 to 70 indicate some fluency with the ideas, but also misconceptions in certain contexts.

* Scores of 71 to 100 indicate a flexible understanding of, and commitment to, the benchmark ideas, with few errors or misconceptions.
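
The four bands listed above amount to a simple lookup, sketched here under stated assumptions. The function name and the shortened level labels are hypothetical, and the handling of fractional scores that fall between two bands (for example 50.5) is an assumption, because the article defines the bands only for whole numbers.

```python
import bisect

# Inclusive upper bound of each band, taken from the cut scores listed above.
UPPER_BOUNDS = [23, 50, 70, 100]
LEVELS = [
    "no consistent evidence of understanding",
    "some evidence of understanding in specific contexts",
    "some fluency, with misconceptions in certain contexts",
    "flexible understanding with few errors or misconceptions",
]


def coma_level(score):
    """Return the understanding level for a COMA total score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("COMA scores range from 0 to 100")
    # bisect_left finds the first band whose upper bound is >= score.
    # Assumption: a fractional score between two bands goes to the higher band.
    return LEVELS[bisect.bisect_left(UPPER_BOUNDS, score)]


for example_score in (23, 24, 50, 51, 71):
    print(example_score, "->", coma_level(example_score))
```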

When the weighting scheme is applied, the 10 items in COMA reliably target the conservation of matter benchmark, with Cronbach's alpha = 0.85 for the entire sample (N = 2,282). Cronbach's alpha for the sample of students with disabilities (n = 202) was also found to be satisfactory (α = 0.77).
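
For readers unfamiliar with the reliability coefficient, the following is a minimal sketch of the standard Cronbach's alpha formula. It assumes one row of item scores per student with the weighting already applied; the item weights themselves are not given in this article, and the small data set below is hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of k item scores per student."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))  # regroup the scores by item
    sum_item_variances = sum(pvariance(column) for column in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary, so alpha is undefined")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical scores for four students on two items (illustration only).
scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(cronbach_alpha(scores))  # 0.75
```

Alpha rises toward 1 as the items vary together across students, which is the sense in which the 10 COMA items are said to target a single benchmark.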

PROCEDURES

Once the study sample was identified and schools were randomly assigned to either the treatment or comparison condition, eighth grade science teachers from all treatment schools were asked to participate in 2 days of professional development focusing on CTA's unit rationale and laboratory exercises. Teachers were provided with boxes of laboratory materials and print materials to ease implementation of the unit. They also attended follow-up meetings during the school year. Although teachers could not be required to attend these meetings, participation rates were good for general education science teachers. Teachers of comparison group students attended professional development meetings normally scheduled for middle school science teachers. Eighth grade teachers in all 10 of the study schools (both treatment and comparison) were asked to teach the same curriculum standards during the same quarter, and their students were assessed using the same assessment instrument (COMA).

ANALYSIS

Analysis of covariance (ANCOVA), with the student as the unit of statistical analysis, was used to generate statistics to address the research questions. Because schools are the sampling unit and students are the unit of statistical analysis for the study, a nesting of students within classrooms by teacher and school was created. However, ANCOVA was selected for data analysis because of the nature of the research questions, the naturalistic quasi-experimental design, and the robustness and familiarity of analysis of variance statistics. Admittedly, systematic variation at the teacher and school levels exclusively within the treatment or comparison conditions may confound the findings, and a hierarchical modeling approach to analysis of these data might provide a more precise estimate of effect sizes. However, there was considerable effort taken to educate teachers on the nature of the study and encourage practices and procedures that would reduce the risk of confounding nested effects while allowing the natural interaction of students, teachers, staff, and curriculum materials that affect results.

Research question 1 required a one-way ANCOVA and research question 2 required five two-way ANCOVAs, all with COMA pretest scores as the covariate. Assumptions for these analyses were evaluated using the guideline established in the ANCOVA literature (cf., Tabachnick & Fidell, 1989). The data did not violate any assumptions of normality, linearity, multicollinearity, or singularity, and homogeneity of variance was found to be acceptable. When significant main effects and interactions were found, exploratory follow-up contrasts were conducted to explain the effects. All ANCOVA analyses were performed using SPSS for Windows, Version 12. Effect sizes were calculated for each subgroup for all dependent variables by subtracting the adjusted mean outcome score for the comparison condition from the adjusted mean outcome score for the treatment condition and dividing the difference by the study sample standard deviation (Cohen, 1988).
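
The effect size calculation described in the last sentence can be written out as a short sketch. The function name is hypothetical. The two adjusted means are the overall values reported in the Results; the study sample standard deviation is not reported in this section, so the value used below is a placeholder chosen only to be roughly consistent with the reported overall effect size.

```python
def adjusted_effect_size(adj_mean_treatment, adj_mean_comparison, sample_sd):
    """Difference between adjusted means, in units of the study sample SD."""
    if sample_sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (adj_mean_treatment - adj_mean_comparison) / sample_sd


# Adjusted posttest means reported for the full sample.
cta_adjusted_mean = 50.20
comparison_adjusted_mean = 42.71
# Placeholder SD (an assumption, not a value taken from the article).
assumed_sample_sd = 30.0

d = adjusted_effect_size(cta_adjusted_mean, comparison_adjusted_mean, assumed_sample_sd)
print(round(d, 2))  # 0.25 under the assumed SD
```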

In addition to two-way ANCOVA analyses, gain scores for students with disabilities were compared to gain scores for students without identified disabilities in the treatment and comparison conditions. The average number of student misconceptions (as identified by the categories used by trained raters) at pre- and posttest was also computed for students with disabilities in the treatment and comparison conditions. Gain scores and misconceptions data are included to more fully describe the effects of the treatment curriculum materials for students with disabilities. Inferential statistics were not used in these analyses.

RESULTS

SUMMARY OF OVERALL AND DISAGGREGATED RESULTS

Pretest COMA Scores. For the entire sample of eighth grade students participating in the study (N = 2,282), a one-way ANOVA indicated no statistically significant differences on the COMA pretest scores between the CTA and comparison conditions. Five one-way ANOVAs revealed significant differences in mean scores among subgroups of four of the five demographic categories: ethnicity, eligibility for FARMS, eligibility for ESOL services, and eligibility for special education services. The exception was the fifth category, gender, where there were no significant differences in mean scores between males and females. For ethnicity, the mean for the White subgroup was significantly higher than the mean for all other subgroups, and the mean for the Asian American subgroup was significantly higher than the mean for the African American and Hispanic subgroups. For FARMS, the mean for the subgroup that had never been eligible for FARMS was significantly higher than the means for the subgroups that had been eligible for FARMS prior to the study (but were no longer eligible) and that were eligible for FARMS during the study. When eligibility for ESOL services was considered, the mean for the subgroup that had never been eligible for ESOL services was significantly higher than the means for the subgroups that had been eligible for ESOL services prior to the study (but were no longer eligible) and that were receiving ESOL services during the study. Finally, the pretest mean for students with no identified disabilities was significantly higher than the pretest mean for students with disabilities (i.e., eligible for special education services).

Posttest COMA Scores. A one-way ANCOVA indicated a significant difference in the COMA posttest scores between the CTA and comparison conditions, with F(1, 2282) = 44.19, p < .01, Cohen's d = .25. The covariate was the study sample pretest mean (M = 30.06). The adjusted mean score for the CTA condition (M = 50.20) was significantly higher than the adjusted mean score for the comparison condition (M = 42.71).
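
To make the notion of an adjusted mean concrete, the following is a minimal sketch of the textbook one-way ANCOVA adjustment, which assumes a common within-group slope and evaluates each group at the grand pretest mean. It is not a reproduction of the SPSS analysis; the function name and the six data points are hypothetical.

```python
from statistics import fmean


def ancova_adjusted_means(groups):
    """groups maps a condition name to a list of (pretest, posttest) pairs."""
    grand_pretest_mean = fmean([pre for pairs in groups.values() for pre, _ in pairs])
    sum_xy = sum_xx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        mean_pre = fmean([pre for pre, _ in pairs])
        mean_post = fmean([post for _, post in pairs])
        group_means[name] = (mean_pre, mean_post)
        # Pool the within-group cross-products to estimate a common slope.
        sum_xy += sum((pre - mean_pre) * (post - mean_post) for pre, post in pairs)
        sum_xx += sum((pre - mean_pre) ** 2 for pre, _ in pairs)
    slope = sum_xy / sum_xx
    # Shift each posttest mean to where it would sit at the grand pretest mean.
    return {
        name: mean_post - slope * (mean_pre - grand_pretest_mean)
        for name, (mean_pre, mean_post) in group_means.items()
    }


# Hypothetical data: same raw posttest means, different pretest means.
example = {
    "treatment": [(20, 40), (30, 50), (40, 60)],
    "comparison": [(30, 40), (40, 50), (50, 60)],
}
print(ancova_adjusted_means(example))  # {'treatment': 55.0, 'comparison': 45.0}
```

In the made-up data both groups have a raw posttest mean of 50, but the treatment group started lower, so its adjusted mean is higher once pretest differences are taken into account.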

In order to determine whether CTA had different impact among levels within demographic subgroups of students, five two-way ANCOVAs were conducted to test for treatment interactions within each of the five demographic categories, with pretest COMA scores as the covariate. Table 3 provides subgroup size, pretest scores and standard deviations, unadjusted posttest scores and standard deviations, and adjusted posttest scores for all demographic subgroups. Table 3 also includes the F statistics for tests of the interaction effects between each demographic variable and experimental condition (presented as F) and the F statistics for the effect of CTA for each demographic subgroup (presented as F_B). Finally, Table 3 includes effect sizes for CTA.

An examination of Table 3 shows that effect sizes were in the small to medium range favoring the CTA condition. The effect of CTA was significant for all demographic subgroups except the Asian American subgroup, where there was no significant difference. Although there were positive treatment effects for CTA, the main effects for ethnicity, eligibility for ESOL services, and status as a student with disabilities that existed at the pretest were maintained at the posttest. In other words, students in the CTA condition gained more than their peers in the comparison condition, but differences between subgroups at pretest were reflected in the pattern of scores at posttest. However, it is important to note that subgroups with the greatest effect sizes are among those most often underserved in science education, that is, Hispanic (Cohen's d = .34), Prior ESOL (Cohen's d = .41), and Now ESOL (Cohen's d = .38).

Moreover, there was one significant interaction: Two-way ANCOVA indicated a significant interaction between experimental condition and FARMS, with F(2, 2244) = 3.41, p < .05. This interaction is characterized by a significantly greater mean score for the Prior FARMS subgroup than for the Now FARMS subgroup in the comparison condition, and no significant difference in the means between these two subgroups in the treatment condition. CTA helps students currently eligible for FARMS to maintain their standing relative to peers who are no longer eligible for FARMS. In the comparison condition, scores of these two groups are significantly different at posttest, indicating that achievement gaps widened.

RESULTS FOR STUDENTS WITH DISABILITIES

Posttest COMA Scores. Two-way ANCOVA (treatment × eligibility for special education services) indicated no statistically significant interaction between experimental condition and eligibility for special education services, suggesting that CTA is as effective for students with disabilities as it is for students without identified disabilities (see Table 3). The adjusted mean score for students with disabilities in the CTA condition (M = 40.68) was higher than the mean score for students with disabilities in the comparison condition (M = 32.99). The main effect for eligibility for special education services that was present at the pretest is also present at the posttest, with F(1, 2239) = 29.23, p < .01. However, the effect of CTA was significant for all students, regardless of special education status. Cohen's d for students with disabilities is .26, similar to the effect size for students without identified disabilities (Cohen's d = .25) and for the entire sample (Cohen's d = .25).

Descriptive Follow-Up Analyses for Students With Disabilities. Figure 2 displays the unadjusted pre- and posttest mean scores for students with disabilities and students with no identified disabilities in the CTA and comparison conditions (see also Table 3). Visual inspection of Figure 2 indicates that the unadjusted pretest mean scores for students with disabilities in the treatment and comparison conditions were about the same, but about nine points lower than those of students with no identified disabilities. The gain in mean score from pretest to posttest for students with disabilities in the CTA condition was 15.09 points, whereas the gain in mean score for comparison students with disabilities was only 6.76 points. Note that for students with no identified disabilities, the gain in mean score for CTA was 21.75 points, but the gain in mean score for their peers in the comparison condition was only 13.72 points. It is interesting to note that, on average, students with disabilities in the treatment condition appear to gain as much from pretest to posttest as students with no identified disabilities in the comparison condition.
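
The arithmetic behind these descriptive comparisons is sketched below using only the four gain scores reported in the paragraph above. The variable names are hypothetical, and, as in the article, no inferential claim is attached to the differences.

```python
# Unadjusted pretest-to-posttest gains in mean COMA score, as reported above.
gains = {
    "students with disabilities": {"CTA": 15.09, "comparison": 6.76},
    "no identified disabilities": {"CTA": 21.75, "comparison": 13.72},
}

# Within each subgroup, how many more points did CTA students gain?
cta_advantage = {
    group: round(by_condition["CTA"] - by_condition["comparison"], 2)
    for group, by_condition in gains.items()
}
print(cta_advantage)
# {'students with disabilities': 8.33, 'no identified disabilities': 8.03}

# Cross-group comparison made in the text: CTA students with disabilities
# versus comparison students with no identified disabilities.
cross_group_difference = round(
    gains["students with disabilities"]["CTA"]
    - gains["no identified disabilities"]["comparison"],
    2,
)
print(cross_group_difference)  # 1.37
```

The CTA advantage in gain is about eight points in both subgroups, which is consistent with the absence of a significant treatment by disability interaction reported earlier.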

FIGURE 2

Pretest and Posttest Means for Students With Disabilities and Students With No Identified Disabilities in the Treatment and Comparison Conditions

Misconceptions. Student responses were not merely coded as correct or incorrect but divided into four categories: correct response, incorrect response indicating a misconception, blank response, or unintelligible response. Figure 3 shows that at pretest, the patterns of correct responses, misconceptions, and blank or unintelligible responses were about the same for students with disabilities in the treatment and comparison conditions. Figure 4 shows that at posttest, students with disabilities in the CTA condition had fewer misconceptions and more correct responses compared with their peers in the comparison condition.

DISCUSSION

In this study, SCALE-uP set about exploring whether the CTA curriculum unit was effective overall, and when data were disaggregated, with special focus on results for students with disabilities. During the first year that CTA was implemented (prior to the present study), the unit proved to be significantly better than the comparison condition. It was effective when data were disaggregated by gender, ethnicity, and eligibility for FARMS and ESOL services (Lynch et al., 2005). Students with disabilities were not included in the first year analyses.

The pattern of data for various subgroups for both the 1st and 2nd years of implementation indicated that subgroups of students entered a unit of study on conservation of matter with significantly different pretest scores (except for gender). Then, presented with the CTA curriculum materials, subgroups of students learned more about the target idea than did their matched peer comparison subgroup. Although the CTA curriculum unit did not actually close achievement gaps, some of the largest effect sizes favoring CTA were shown for students often underserved in science education, including students eligible for FARMS, students learning English, and Hispanic students.

FIGURE 3

Mean Number of COMA Pretest Responses Coded as Correct, Misconception, Incomprehensible, or No Evidence for Students With Disabilities

In the present study, a replication, we were also able to include students with disabilities in general education science classrooms in the analyses. Disaggregating the data for students with disabilities in this quasi-experimental study proved to be difficult because the organization of the school district's central database was inadequate for describing students with disabilities by specific type of disability. Rather, the database coded students by indicating their current eligibility for special education services (Never, Prior, or Now) and number of hours of services received if eligible (More or Less Than 15 hr). It was also a challenge to determine which students with disabilities received the CTA treatment or comparison condition because students with disabilities in self-contained classrooms did not always use the same curriculum materials as those in general education classrooms. In sum, the database initially had no codes that indicated students' cognitive ability in science, where such students received their science instruction (general or special education classrooms), or the preparedness of their teachers to deliver the CTA unit. Consequently, SCALE-uP had to create a new code: students currently eligible for disability services who take science instruction in general education classrooms. This coding procedure allowed the identification of students with disabilities who received the same access to curriculum materials and instruction by certified science teachers as students with no identified disabilities in this study.

FIGURE 4

Mean Number of COMA Posttest Responses Coded as Correct, Misconception, Incomprehensible, or No Evidence for Students With Disabilities

Based on our review of the literature, we anticipated that a model of science learning, conceptual change theory, would be as appropriate for students with disabilities as for other students. Although it may be taken as a given that some students with disabilities have substantial difficulties with science, there is no reason to think that the way this group of students learns science is qualitatively different from the way any other students learn science. Thus, we expected that a curriculum unit (CTA) built on conceptual change theory and employing guided inquiry would work well for students with disabilities.

Results showed that the effect sizes for students with disabilities who used CTA were basically the same as those obtained for the entire aggregate group, namely, small to medium effect sizes. This is encouraging, but not as encouraging as the results for other groups of students underserved in science education, such as current FARMS or ESOL students, for whom medium effect sizes were found. Still, these results add some support to the claims that instruction that includes guided inquiry, hands-on science, and students working in heterogeneous lab groups can be appropriate and effective for students with disabilities. Frankly, in thinking hard about the concepts that must be understood in grappling with the conservation of matter benchmark, it is difficult to conceive of how anyone could make sense of this "big idea" without direct experiences with physical phenomena that are structured and scaffolded, making the case that matter is conserved in chemical reactions. Such experiences, however, may be necessary but not sufficient. Conceptual change also involves the building of mental models of invisible atoms and molecules and the fundamental ability to conserve, in the Piagetian sense.

Although CTA is promising for eighth grade students with disabilities who are in general education classrooms, it should also be pointed out that such students did not rise to very high levels of understanding on the outcome measure, COMA. However, the same may be said for students with no identified disabilities. On average, most students in this implementation study did not master the conservation of matter benchmark, but rather scored in the low to mid ranges of understanding on COMA. This point should not be lost in the era of state curriculum frameworks, indicators, and accountability. Science standards/benchmarks slated for teaching and learning by a certain grade level may prove elusive to many students. Perhaps such concepts are too difficult to learn at a certain age level. We have a different view, however. Overall low levels of understanding of a hard science concept like conservation of matter may indicate that it requires a substantial amount of time to learn, well-sequenced instruction with adequate scaffolding, and engaging experiences in science labs that make the concept accessible to students. Conceptual change does not occur for many students all at once, but over time and in multiple contexts. Some students with disabilities may need even more time, but others may do well in the kind of instructional environment fostered by CTA. The study contributes to research that shows that students with disabilities can benefit from this instructional approach (see also Dalton et al., 1997; Palincsar et al., 2001).

There are limitations to this study. The study sample included the MCPS middle schools that were most diverse, so the results may not be applicable to schools that were more ethnically and linguistically homogeneous, or that have students from highly affluent families. We were not able to report on the specific type of disabilities for students in the sample, because of perceived restrictions related to Institutional Review Board (IRB) matters. We cannot explicitly describe the curriculum materials being used in about 100 comparison classrooms, other than to say they were from a list of approved options that contained a range of "traditional" and "nontraditional" materials. However, the size of the sample and the consistency of the data reported here alongside the data from the 1st year implementation allow some confidence in our interpretations of results.

This study provides support for the effectiveness of one highly rated curriculum unit, CTA, for students with disabilities and for all other subgroups of students included in this implementation study. Consequently, CTA was scaled-up to students in 37 MCPS middle schools in 2003. In addition, SCALE-uP research provides a model for the careful evaluation of science curriculum materials in a large and diverse school district. The disaggregated data and attention to effects for various subgroups of students, combined with the features of the CTA intervention exposed by the Project 2061 Curriculum Analysis, allow a revealing view of the possibilities of improving outcomes in science for all, including students with disabilities.

IMPLICATIONS FOR PRACTICE

WELL-DESIGNED CURRICULUM MATERIALS

CTA is an example of a well-designed curriculum unit according to the Project 2061 criteria, and also appears to be consistent with instructional principles recommended for students with disabilities (cf., Grossen, Carnine, Romance, & Vitale, 2002; Kame'enui, Carnine, Dixon, Simmons, & Coyne, 2002). CTA is centered on one important big idea in science, conservation of matter. The entire focus of this unit is on this big idea. In addition, CTA's structure corresponds closely to what is meant by judicious review. Each cluster of lessons keeps circling back to four chemical reactions at increasing levels of complexity for application of skills and understanding. CTA does not use the term "review" but the spirit of this principle is captured in the carefully planned structure of CTA. In a similar vein, CTA is a good example of a curriculum unit with strategic integration. The concepts and understandings that CTA helps students build are woven carefully and sequentially into the fabric of the unit. This allows for continuous opportunities for students to apply the concept.

Designing instructional units that reach a range of diverse learners, from students with no identified disabilities to students with one or more disabilities that are addressed through special education services, is a challenging endeavor. Too often, science instruction for students with disabilities is focused on worksheets and textbooks and held in classrooms that are not conducive to hands-on inquiry (Lynch, 2000). Teachers need to become savvy evaluators of curriculum materials. The Project 2061 Curriculum Analysis criteria or basic instructional design principles effective for diverse learners may guide decisions on promising materials to use, when an evidentiary base is absent (Kame'enui et al., 2002).

GUIDED INQUIRY

The guided inquiry used in CTA is a viable instructional strategy for students with disabilities. There is much debate about the best way to approach science education for students with disabilities and the amount of structure and guidance that should be supplied. This article will not resolve the problem, but we offer this research as proof that guided inquiry can be effective in teaching students with disabilities challenging science concepts. The CTA unit carefully structures and sequences the phenomena that students experience in each lesson, leading to an emerging understanding of conservation of matter. The lesso


Source: Exceptional Children
