An earlier Digest described the shortcomings of three methods commonly used to summarize changes in test scores. This Digest describes two less commonly used approaches for examining changes in test scores: Standardized Growth Estimates and Effect Sizes. Aspects of these two approaches are combined and applied to the Iowa Test of Basic Skills to demonstrate the usefulness of a third method, termed Expected Growth Size, for examining change in test scores. An expected growth size is more difficult to calculate than the other methods, but it offers three advantages. First, because change is expressed in relation to the standard deviation, growth rates for different tests and different grade levels can be compared directly. Second, once expected growth sizes are calculated for a given test, they can be transformed easily to more common measurement scales. Third, once expected growth sizes are transformed to a Normal Curve Equivalent scale, changes in an individual's or a group's mean score can be reported in relation to expected growth. The Digest illustrates how to calculate expected growth size. (SLD)
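As a rough illustration of what "expressing change in relation to the standard deviation" can look like, the sketch below divides a change in mean score by the test's standard deviation and rescales the result to Normal Curve Equivalent (NCE) units. The function names and the example numbers are hypothetical (they are not Iowa Test of Basic Skills values), and the Digest's own Expected Growth Size calculation may differ in detail. The only fixed quantity is the NCE scale's standard deviation of about 21.06.

```python
# Hypothetical sketch: change in standard deviation units, then in NCE units.
# This is NOT the Digest's exact procedure; names and numbers are made up.

NCE_SD = 21.06  # standard deviation of the Normal Curve Equivalent scale


def change_in_sd_units(mean_before, mean_after, sd):
    """Return the change in mean score divided by the standard deviation."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_after - mean_before) / sd


def sd_units_to_nce(change_sd):
    """Rescale a change expressed in SD units to NCE units."""
    return change_sd * NCE_SD


# Made-up example: the mean rises from 180 to 195 on a test with SD 30.
growth = change_in_sd_units(180.0, 195.0, 30.0)  # 0.5 SD
growth_nce = sd_units_to_nce(growth)             # about 10.5 NCE units
```

Because both results are expressed relative to the standard deviation, the same two functions could be applied to tests with different score ranges and the outputs compared directly.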
This Digest introduces the advantages and disadvantages of three commonly used methods of reporting test score changes: (1) change in percentile rank; (2) scale or raw score change; and (3) percent change. The change in percentile rank method focuses on the increase or decrease of the mean percentile ranking for a group of students. This method has two main problems. The first is that calculating the mean percentile rank from individuals' percentile ranks can provide an inaccurate estimate of a group's mean performance. The second is that, because the intervals separating percentile ranks are unequal, changes in percentile ranks represent different amounts of growth at each point on the scale. A second method is scale or raw score change. The main drawback to this method is that when raw scores are used to determine change, it is difficult to compare change across tests with different score ranges. A third approach, reporting percent change, causes further distortion, resulting in a statistic that is difficult to interpret and misleading. All of these methods should be avoided when summarizing change in test scores. A separate Digest suggests better ways to summarize changes. (SLD)
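The unequal-interval problem with percentile ranks can be made concrete with a small sketch. It assumes, for illustration only, that scores are normally distributed, and it converts percentile ranks to z-scores with the standard library's `NormalDist`. The helper names are hypothetical.

```python
from statistics import NormalDist

# Illustrative assumption: test scores follow a normal distribution.
STANDARD_NORMAL = NormalDist()


def percentile_to_z(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to a z-score."""
    return STANDARD_NORMAL.inv_cdf(percentile_rank / 100.0)


def z_gain(pr_from, pr_to):
    """Growth in standard deviation units implied by a percentile-rank change."""
    return percentile_to_z(pr_to) - percentile_to_z(pr_from)


# Both changes are "10 percentile points", but they differ in SD units.
middle_gain = z_gain(50, 60)  # about 0.25 SD
tail_gain = z_gain(85, 95)    # about 0.61 SD
```

Under this assumption, a 10-point gain near the top of the scale represents more than twice as much growth as the same 10-point gain near the middle, which is why percentile-rank changes are hard to compare across the scale.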
The goal of this study was to identify the effects of state-level standards-based reform on teaching and learning, paying particular attention to the state test and associated stakes. On-site interviews were conducted with 360 educators (elementary, middle, and high school teachers) in 3 states (120 in each state) that attached different stakes to the test results. In Kansas, state test results were used to determine school accreditation but had no stakes for students. In Michigan, school accreditation was determined by student participation in and performance on the state test, and students received an endorsed diploma and were eligible for college tuition credit if they scored above a certain level on the 11th grade tests. In Massachusetts, school ratings were based on the percentage of students in different performance categories, and students, starting in 2003, had to pass the 10th grade test to graduate. No clear relationship was found between the level of the stakes attached to the state test and the influence of the state standards on classroom practice. Findings suggest that other factors are at least as important, if not more so, in encouraging educators to align classroom curricula with these standards. At the same time, as the stakes attached to the test results increased, the test seemed to become the medium through which the standards were interpreted. Taken together, findings suggest that stakes are a powerful lever for effecting change, but one whose effects are uncertain. A one-size-fits-all model of standards, tests, and accountability is not likely to bring about the greatest motivation and learning for all students. Three appendixes contain a grid describing state testing programs, the interview protocol, and the methodology.