High-stakes testing was a part of education before No Child Left Behind and the Common Core State Standards. The selection process for gifted and talented education, special education, secondary schooling, and post-secondary schooling has always included a high-stakes test. Selection could be based on inheritance, chance, purchase, or achievement. All democracies have deemed achievement, or merit, the just basis for selection (Heyneman, 2004). Secondary and post-secondary schools have therefore chosen talent as the entrance criterion so that access is just. Four ways exist to assess talent: past academic achievement, oral examinations, written examinations, and essay examinations. Although past academic achievement is the best predictor of university completion, ministries of education have little control over individual teachers’ evaluative standards, and the larger the country, the larger the discrepancies between schools can be with regard to grade inflation or rigor of coursework. Therefore, most countries have chosen the standardized examination route (Heyneman, 1987).
Selection for university entrance is necessary (a) to reward the right skills and (b) to decrease biases and corruption. According to Harris and Herrington (2006), achievement “should be based on merit” and the school system should be “the great equalizer” (p. 209). Nations should weigh both of these aims when choosing which type of entrance examination to give.
The tests that best reveal talent while ruling out bias due to cultural background or English language learning have proven to be teacher-created essay examinations (Heyneman, 1987). The drawback is that essay tests are costly to administer in time, human resources, and money. On the other hand, they match high-level classroom discussion more closely than multiple-choice tests do. Small and wealthy nations can afford the accountability mechanisms needed to keep corruption out of a subjective-format test. School-based examinations are popular with teachers because they can measure what external standardized tests cannot, such as motivation and diligence, which determine graduation rates in higher education more effectively than aptitude does (Heyneman, 1987). Because of the size and diversity of the U.S., its most common high-stakes tests are standardized multiple-choice tests.
Unintended consequences of standardized high-stakes tests have made their use controversial in the U.S. Au (2007/2013) conducted a qualitative metasynthesis of 49 studies of high-stakes testing and found that such tests led to a narrowing of the curriculum, fragmented knowledge, and teacher-centered pedagogy. Au states, “As teachers negotiate high-stakes testing educational environments, the tests have the predominant effect of narrowing curricular content to those subjects included in the tests, resulting in the increased fragmentation of knowledge forms into bits and pieces learned for the sake of the tests themselves, and compelling teachers to use more lecture-based, teacher-centered pedagogies” (2007/2013, p. 246). Sleeter and Stillman (2005/2013) studied mandated curriculum standards. They also found that the standards narrowed the curriculum and led to independent skills being taught in a fragmented manner. These findings concerned Sleeter and Stillman because multicultural education practices call for teaching that connects skill learning to real-life situations (rather than fragmented skill and drill) and for students and teachers to decide what students should learn (rather than someone outside the classroom, such as test-makers or standards-creators).
Siskin (2003/2013) offers an interesting example of how a curriculum changes once it becomes part of the high-stakes testing environment. Music class, once a model for what education should look like, became a paper-and-pencil worksheet class. Because what is tested is what is valued in schools, music teachers rallied to be tested. They were afraid that their programs would be cut and wanted the tests to show how valuable those programs were. Before the tests, music classes typically had high expectations for all students, real-life assessments in the form of concerts, and accountability to the parents and the community through these performances. However, after the tests were implemented, teachers found themselves teaching the content of the test in the way the test was administered. They started giving paper-and-pencil worksheets, and they felt accountable for the numerical scores rather than for the community performances.
Further controversy surrounds the rewards and sanctions that teachers and schools face as a result of standardized high-stakes test scores. Eisner (2001/2013) points to the extrinsic motivation that such an emphasis on high-stakes testing promotes. Students become “reward junkies,” according to Eisner. The culture is one of learning for the test and not learning for learning’s sake. As educators, we know that intrinsic motivation, not extrinsic motivation, is what we hope for in our students so that they will be lifelong learners. Sanctions, like rewards, affect both students and teachers: they can induce stress for teachers (Ravitch, 2010) and for students. As Ravitch stated on The Daily Show, poverty is a predictor of academic achievement. Those who need resources are punished by being denied them, while those who do not need resources are rewarded with more. This creates a Matthew Effect, in which the rich get richer and the poor get poorer.
Value-added measures are one attempt to control for student and school characteristics when analyzing standardized test scores, but they do not produce stable, predictable ratings. Ravitch (2010) argues that such scores, with all their caveats and flaws, would drown out every other measure, and she concluded that value-added assessment “should not be used at all.” When value-added measures are used to assess teacher effectiveness, teachers who are rated highly effective one year may be rated ineffective the next (Ravitch, 2010). SAS is responsible for the value-added measurement of teachers in North Carolina. Its algorithm is hard to evaluate because the company keeps its formula secret.
Ravitch (2010) states, “Our most important public institution is under siege by people who want to privatize it, turn it into profit centers, and treat children as data points on a chart.” That was not the original intention of the accountability movement. The intention of standardized testing was to get an objective look at students’ achievement. The intention of mandated standards was to set high expectations for all of our students. The intentions were good, but the measures are not working. Eisner (2001/2013) points out the biggest problem: we do not yet have a better alternative, though researchers are working on one.
Au, W. (2007/2013). High-stakes testing and curriculum control: A qualitative metasynthesis. In D. J. Flinders & S. J. Thornton (Eds.), The curriculum studies reader (pp. 235-251).
Baker, E., Barton, P., Darling-Hammond, L., Haertel, E., Ladd, H., Linn, R., Shavelson, R., & Shepard, L. (2010). Problems with the use of student test scores to evaluate teachers (EPI Briefing Paper No. 278). Economic Policy Institute. Retrieved from https://moodle1314-courses.wolfware.ncsu.edu/pluginfile.php/391680/mod_resource/content/1/Probs_w_use_of_student_test_scores_to_evaluate_teachers.pdf
Eisner, E. W. (2001/2013). What does it mean to say a school is doing well? In D. J. Flinders & S. J. Thornton (Eds.), The curriculum studies reader (pp. 279-287).
Harris, D. N., & Herrington, C. D. (2006). Accountability, standards, and the growing achievement gap: Lessons from the past half-century. American Journal of Education, 112(2), 209-238. Retrieved from http://www.jstor.org/discover/10.1086/498995?uid=3739776&uid=2&uid=4&uid=3739256&sid=21103046056843
Heyneman, S. P. (2004). Education and corruption. International Journal of Educational Development, 24(6), 637-648. Retrieved from http://www.vanderbilt.edu/peabody/heyneman/PUBLICATIONS/Education%20%26%20Corruption.pdf
Ravitch, D. (2010). Retrieved from http://voices.washingtonpost.com/answer-sheet/diane-ravitch/ravitch-why-teachers-should-ne.html
Siskin, L. (2003/2013). Outside the core: Accountability in tested and untested subjects. In D. J. Flinders & S. J. Thornton (Eds.), The curriculum studies reader (pp. 269-278).
Sleeter, C., & Stillman, J. (2005/2013). Standardizing knowledge in a multicultural society. In D. J. Flinders & S. J. Thornton (Eds.), The curriculum studies reader (pp. 252-268).