Nonpartisan Education Review / Essays: Volume 7, Number 2
Using Student Achievement Tests to Evaluate Teachers—A Very Bad Idea
Dr. Tom Haladyna*
Recent news reports of suspicious achievement test scores in our public schools point to the failed practice of using a single student achievement test, given once a year, to determine the effectiveness of our teachers. This practice is patently invalid. The American Educational Research Association, the largest organization in the world dedicated to research in education, has long established standards that argue against one-time testing. If we want to evaluate student learning or a teacher’s effectiveness, we need many complementary and valid indicators of student learning. A single test is insufficient for this purpose.
Each state has identified or developed a test that measures its desired curriculum. We refer to the content these tests cover as content standards. Most states contract with test companies to develop their tests, and these tests meet national test standards for quality and validity. State tests generally survey the content taught at each grade level. A comprehensive test of student learning at any grade level would entail hundreds of items and several days of testing. Because these student achievement tests are generally very short, none of them is particularly useful for detailed diagnosis of strengths and weaknesses in student achievement. The problem, then, is not with the design of the test. The fault lies in what we do with test results. We try to use test scores to evaluate teachers. The pejorative term we use for this practice is test-based accountability.
The first problem with this practice is the accountability system for student achievement. In many states, the teacher stands alone. Why are teachers solely responsible for student learning when we know that many other factors heavily influence how well students learn? An honest accountability system would incorporate all relevant factors that determine student achievement. This accountability system should include students, parents, teachers, school district leaders including the superintendent, local and state school boards, the governor, the legislature, and the federal government and Congress. We need a shared responsibility for student learning. Local school boards, our state, and the federal government provide resources and often dictate policies and even curriculum. Typically, all of these parties avoid responsibility for student outcomes and continue to berate teachers to teach better and harder, as if teachers were the only ones who mattered. Few teachers in our nation would claim that they have sufficient resources to teach as effectively as they can. For example, Arizona has the fewest resources available to teachers and very low achievement scores. Research continues to show that family and home influence is one of the most powerful factors in student achievement. Accountability needs to be broadened to include all parties who are responsible for and influential in student learning, not just the teacher.
The second problem is the composition of students in any state and its schools. We have two groups of students: one high achieving and the other very low achieving. At-risk students are typically very low achievers. What do we know about these at-risk students? Poverty is one risk factor. Learning to speak English is another. Having a disability or several disabilities is another. Living in cultural isolation from mainstream America is another. The more risk factors in a student’s profile, the more likely it is that the student’s achievement will be lower. We also know that beginning teachers typically do not make at-risk schools their first choice when applying for a teaching position. If you are a highly effective teacher, you want to teach the brightest students, who also have highly supportive parents. Because the wealthier school districts have more resources, highly effective teachers will gravitate to these school districts for higher pay, better benefits, and children who want to learn. Those of us who have taught at-risk students know the challenge. Most of these students are not prepared to learn. We offer a variety of important social services for them, including breakfast and lunch, medical treatment, clothing, counseling for mental health problems, and after-school programs. We struggle mightily to motivate and teach these students. Parental support is often absent. Support from the administration, the school board, or the state is insufficient. We know from learning theory and research that the key to more effective learning is engaging students in learning. Time in school must be well spent. These students need smaller classes, longer school days, longer school years, tutors, and individual learning plans if we want to see real gains in their learning. Even then, the gains will come slowly.
A third problem is a matter of logistics. How do you measure a moving target? A classroom is not a stable unit of measurement. Students come and go. Some classes have a turnover of more than two-thirds during a school year. Teachers sometimes swap classes to teach in their stronger area. A general survey test is not going to measure a teacher’s effectiveness accurately when the class composition changes. In addition, the achievement growth of at-risk classes is usually very low. Consequently, teachers of at-risk students are often labeled unfairly as failing teachers, not because of their teaching but as a result of the composition of the classes assigned to them. Being a teacher of at-risk students puts the teacher at risk of losing a teaching job. Imagine coaching a baseball team with the worst players, and being fired because they lose games.
A fourth, long-standing problem is that when one test is used to evaluate learning and teachers, teachers and other school leaders cheat. In a state-funded study in 1989, we found about 10% of Arizona teachers confessing to cheating on the state’s test, at a time when these tests were not used to evaluate teachers. The incidence of cheating on these tests nationally is at epidemic levels. If you are a teacher who is badgered into increasing test scores for pay bonuses or to keep your job, you might do anything to get a high score, even cheat. By the way, it is very easy to cheat on these tests without being detected. Teaching to the test is a dishonest way to boost scores, because the test results are biased. The test is a sample of what is to be taught and learned, not everything that is taught and learned. When a teacher teaches to the test, only a fraction of the content to be learned is actually taught. Students are cheated of the opportunity to learn what they were supposed to learn. And the test results paint a false picture of what was really learned, which is only a fraction of the content. We also have other methods of cheating that are very subtle. Nonetheless, state after state and city after city report scandals on cheating. It is the one-test approach that teachers find appallingly unfair. Unusual test results should be closely monitored and validated. We do not need cheaters employed in our schools to create the illusion that teaching and learning are effective.
A fifth problem is that these state tests were neither designed nor validated to measure teacher effectiveness. We have extensive programs to validate a professional person’s competence in every profession, but none of these programs uses a single test designed and validated for some other purpose. The remedy for evaluating teachers is to collect a variety of indicators of teaching effectiveness. Look for agreement among indicators of effective teaching. If teaching is low quality, then an improvement plan is written and followed, with consequences for failing to improve. Most new teachers who are not well suited for teaching choose on their own to quit teaching.
We have a long road ahead in all states to improve schools and education for our students. The one-shot achievement test is not a solution. Who among us wants to be judged this way?
Citation: Haladyna, T. (2011). Using student achievement tests to evaluate teachers—A very bad idea. Nonpartisan Education Review / Essays, 7(2). Retrieved [date] from
* Professor Emeritus, Arizona State University, Phoenix, Arizona, tmh [at] asu [dot] edu