The Immorality of Tests

The Immorality of Giving Tests for Grades in Teaching
Rick Garlikov

I will try to show that it is morally wrong to give tests in order to determine grades while teaching, and I will try to show it is morally wrong to give tests in order to determine certain types of qualifications. I realize that grading by testing has been done so long and so widely that it seems acceptable in part because it is normal. It is normal, however, only in the sense of being common and pervasive, not normal in the sense of being reasonable. I also realize that grading by testing seems to be a reasonable thing to do, and that it seems to be necessary and helpful in society. I will argue that the appearance is not the reality.

I am not arguing that there are no (morally) valid uses of tests. Testing is all right in order to determine what someone seems to need yet to learn. I am arguing that for grading purposes in teaching and making certain kinds of certifications, there are no (morally) valid uses.

Let me make it clear from the outset, however, in case anyone questions the motivation of this paper, that I, myself, always did well on both school tests and standardized tests. And my children do well on school tests and on standardized tests. I had straight A's in high school in all the courses that determined class standing, and straight A's my first three semesters in college at an academically highly rated university. After I quit worrying about grades, I still made all A's and B's in college. (I did get one C in college -- during the transition from being overly concerned about grades to being far less concerned -- in a course I should never have taken, for which the material was almost impossible for me to learn; and if that grade was inaccurate, it was only because it was too high.) I also made what were considered by institutions to be commendably high scores on their standardized qualifying entrance exams. So this is not a personal harangue about the unfairness of grades based on tests because I was some sort of victim of tests; I was, in fact, virtually always the beneficiary of rewards based on tests. Yet I contend grading and bestowing rewards by testing is wrong.

The two main reasons testing for grading purposes while teaching is morally wrong are that (1) it often makes a mockery of teaching (and learning), and (2) it bestows important rewards even though (A) testing does not show what it seems to show, and though (B) even if it did show what it seems to, it would still not be the right basis on which to bestow those rewards.

The closest testing can come to measuring anything is to see who has memorized, by some particular arbitrary date, some particular bit of material or some algorithm for doing something, such as working through a math or physics problem of the sort that has been taught. But it does not necessarily do even that, since one can make an error or get confused or have a temporary memory lapse during a particular testing period even though one has learned the material. Or one can know material in the sense of being able to use and apply it in any normal or useful circumstance, but just not be able to come up with a particular way of saying it apart from any context in a merely cold recall situation. And in those cases where a student has learned something prior to a course, or in some other way than from the teacher or study in the course, the grade does not signify that learning occurred from being taught in the course. Poor test results on certain sorts of tests sometimes even result from disdain or indifference; students will often "tank" or intentionally do poorly on certain kinds of standardized tests whose results are personally unimportant to them but which supposedly reflect on the teaching quality of their schools. And students will also test poorly in those cases where they try to write what they believe (whether correctly or mistakenly) the teacher wants them to say even though they do not understand it or agree with it. Moreover, if one person learns something after a test but learns it better, can use it better, and retains it longer than another person who learns it just before the test and forgets it soon after, the latter will be the one who gets the higher grade and is thought to be more competent.

Test results do not necessarily signify whether someone knows important or useful information that they can use in any meaningful way in significant circumstances. Nor do they necessarily signify whether someone has a particular ability or talent. Tests can indicate that someone seems to know or not know (how to do) something, but the test is not, by itself, necessarily conclusive. Yet grades are based on them as if they were conclusive. A more detailed explanation of the logic of the inconclusiveness of tests is given in the essay "Understanding, Shallow Thinking, and Schools". And it is related to the arguments examined in even more detail and precision in the essay "Scientific Confirmation."

Tests tend to be merely convenient ways for institutions to make judgments that seem to work for them. However, as many institutions have found, making judgments based primarily on test results can prove to be a costly mistake. At least one medical school, for example, once it found that it had a high drop-out rate among students who had made some of the highest scores on the standardized Medical College Admission Test, began to search for a better way to determine accurately which applying students might have the best chance of successfully becoming good physicians. Conversely, many people in many different areas of life have been overlooked by someone whose "qualifying" test they failed, only to succeed enormously because someone else gave them an opportunity. Athletes who do not make the cut with one club often become stars when given the opportunity with another. Numerous music teachers are said to have refused to give the young Enrico Caruso voice lessons because they said he had no singing talent. Einstein barely received his PhD in physics and was not hired to teach anywhere upon obtaining his diploma. He took a job in a patent office, and worked during his spare time there to develop the Theory of Relativity.

While people who do well on particular tests (or in particular circumstances) may go on to success, it seems to me highly probable that potential talent is lost by casting aside those who do not do as well on those tests (or in those circumstances). I once had a student from an inner city school in a junior college philosophy class who wanted to study education. I thought he was an excellent student and would make an excellent teacher. But his high school grades, and so far his junior college grades, had been mediocre to bad. I called a local, prestigious college with a national reputation for excellence and asked them to interview him and to ignore his grades. They did, and they accepted him, and told me that he was the most impressive student they had ever met. That would never have shown up on his transcript. The lost benefits, or "opportunity costs", of essentially discarding people based on their particular test results seem to me to be potentially enormous. What is important is what people can achieve in general or when all is said and done, not just what they can achieve in a particular situation at a particular time -- especially when the particular situation is artificial, arbitrary, and contrived, and not generally a reliable indicator of potential or actual ability.

The above reasons all have to do with the potential (and frequent) fallibility of tests. If tests could be made to be, and demonstrated conclusively to be, infallible ways of determining whether someone knows something or has a particular ability, the above objections would evaporate. But that is not likely to happen. The next points have to do with the practice of rewarding good test scores and penalizing people for poor test scores -- particularly, but not only, when tests are fallible.

When students do poorly on a test, it might be because they were taught poorly, but it is they who are punished for the poor results, not the teacher. Conversely, when state departments of education want to judge teachers on the basis of student test results, that is often unfair because student performance might be due to any of a number of factors unrelated to teacher ability and performance. Often schools that score the best on standardized tests have students from stable communities, with relatively affluent, educated parents who emphasize the value of education, and those schools that do the worst have students from home and community environments that are not the most conducive to academic learning. If some, and only some, teachers consistently elevated students from disadvantaged circumstances, we would have reason to believe those were really good teachers, just as we might have some reason to believe that teachers were not very good if they did not help students from privileged environments score very well on tests. But those sorts of situations do not arise often, and even when they do, it is still open to question whether the teachers were the prime or determining factors. In social settings, it is very difficult to isolate or control the variables, and non-teaching factors play a great role in influencing students' standardized test scores. Therefore, dispensing rewards and penalties on the basis of test results is not necessarily a just or reasonable way to distribute them.

But even if tests could be made to be infallible signs of ability and potential, one still has to be careful about bestowing benefits or privations on the basis of test results, because it is not the reliability of test results alone that determines one's value or merit, but reliability along with the relevant importance of what is being tested. Let me give some obvious examples first.

It has become fashionable in the last few years for economically successful companies to hold well-publicized contests where the winners receive huge amounts of money. There are quiz shows on television and there are "tests" (in the form of contests) during the half-times of major sports championships. Usually these involve choosing someone to try to do something such as throw or hit a ball into or through a hoop or hole. The contestant gets one try and if s/he succeeds s/he receives a huge amount of money. While this is great entertainment, it seems a bizarre way to determine who in society deserves a lot of money. The fact that someone makes the shot at a particular time or misses it seems hardly the proper way to evaluate whether they are deserving of a fortune or not. The contest at the 2000 Nokia Sugar Bowl struck me as particularly odd in this regard. An ordinary person was the contestant, but his reward was based not just on his performance but also on the performance of a retired professional quarterback. The quarterback had four throws to get a football through an opening a certain distance away. Each successful throw was worth $50,000 to the contestant on whose behalf the quarterback was throwing. The contestant then was given one throw himself, which, if successful, multiplied the amount already won by 10. Since the quarterback made one of his four throws, the contestant's successful throw won him $500,000 instead of $50,000. Had the quarterback made all four of his throws, the contestant's successful throw would have netted him $2,000,000. So here was a test in which a person was rewarded a great deal even though he contributed only part of the performance, and even though the reward was hardly commensurate with the social value of the accomplishment itself -- the luck/skill of getting a football through a hole on a one-time throw.
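The contest arithmetic described above can be checked with a minimal Python sketch. The function name `contest_payout` and the constant names are hypothetical, introduced only for illustration, and the sketch assumes (as the account suggests) that a missed contestant throw simply leaves the quarterback's winnings unmultiplied.

```python
# Hypothetical names; only the dollar figures and the multiplier come from the account above.
PER_THROW_DOLLARS = 50_000    # value of each successful quarterback throw
CONTESTANT_MULTIPLIER = 10    # applied only if the contestant's own throw succeeds


def contest_payout(quarterback_hits: int, contestant_hit: bool) -> int:
    """Return the contestant's winnings in dollars under the stated rules."""
    if not 0 <= quarterback_hits <= 4:
        raise ValueError("the quarterback had exactly four throws")
    base = quarterback_hits * PER_THROW_DOLLARS
    # Assumption: a missed contestant throw leaves the base amount as it is.
    return base * CONTESTANT_MULTIPLIER if contestant_hit else base


print(contest_payout(1, True))   # what actually happened: 500000
print(contest_payout(4, True))   # the best possible case: 2000000
```

The same single throw by the contestant is worth anywhere from nothing to $1,800,000 in added winnings, depending entirely on how someone else performed.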

In these sorts of contests, basically the person "deserves" the money only in the sense that he followed the rules and achieved the result the rules required for the money to be his. But in a larger, more important sense of "deserved" -- a sense having to do with justice, proportionality, commensurate reward for contribution, etc. -- he does not deserve the money because the rules themselves by which he earned it are faulty and irrelevant to any sense of social justice or merit.

Consider next this same situation, changed by one step. Instead of giving a person only one chance to receive a valuable social benefit for a one-time athletic accomplishment, suppose we give him/her multiple opportunities, requiring some level of overall accomplishment either in total or on average. While this is a truer test of the person's ability, still it seems an odd way to bestow social benefits. Yet we do that all the time when we give students college slots and college scholarships for their athletic test performances. And, I contend, we do essentially the same thing when we award college acceptance and/or college scholarships to students on the basis of their academic grades.

Now it is easy to make the case that college is not the appropriate place or reward for some people -- those who do not wish to attend classes or study, and who would get little out of doing so -- to play a sport in a way that is essentially professional, and that instead there ought to be professional minor leagues for these people to develop their skills and earn a living. But I want to try to make the stronger and less obvious case that it is also not right to award college enrollment for academic performance based on grades determined by tests, even though college courses are academic in nature.

The recent re-emergence of huge money quiz shows on television networks gives a hint of why this is. These quiz shows are very much like the half-time sports contests described above in that they give someone an opportunity to win a lot of money based on a relatively short test or contest. In one recent program, it was worth something like $125,000 to know how many strings were on a normal violin. My daughter, who takes violin lessons, knew the answer, of course, but the contestant did not. My daughter had a fit that the person did not know, and she also had a fit because she thought her knowledge was worth $125,000 and yet she was not given the opportunity to earn it. My view is that this particular piece of knowledge ought not to be worth $125,000, both because it makes no contribution to society other than as 25 seconds' worth of entertaining suspense and because someone making $10 an hour doing something worthwhile would need 12,500 hours -- more than six years -- of labor to earn the same amount. It bestows a much higher reward on the person who has the accidental knowledge of trivial information than it does on someone who labors at harvesting crops or building highways or teaching children.
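The wage comparison above can be verified with a short Python sketch. The 2,000-hour work year (50 weeks of 40 hours) is an assumption made here for illustration; the text itself says only "more than six years". The function and constant names are likewise hypothetical.

```python
PRIZE_DOLLARS = 125_000       # quiz show prize mentioned in the text
HOURLY_WAGE_DOLLARS = 10      # wage used in the text's comparison
HOURS_PER_WORK_YEAR = 2_000   # assumption: 50 weeks x 40 hours, not from the text


def hours_to_earn(prize: float, hourly_wage: float) -> float:
    """Hours of paid labor needed to earn the prize amount."""
    return prize / hourly_wage


hours = hours_to_earn(PRIZE_DOLLARS, HOURLY_WAGE_DOLLARS)
years = hours / HOURS_PER_WORK_YEAR
print(f"{hours:,.0f} hours, about {years:.2f} work years")
# prints "12,500 hours, about 6.25 work years"
```

Under that assumed work year, one correct answer is rewarded like roughly six and a quarter years of full-time labor.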

It does not change the situation in any relevant moral way to make the requirement for the reward be not just a one-time performance, but an overall performance for a number of years. Students who learn an abundance of trivial or esoteric information that allows them to give correct answers on tests over a period of 12 or 13 years do not necessarily deserve the abundant rewards from society that college enrollment and college scholarships essentially bestow. It is true, of course, that they are working (in some cases working hard) to learn those things, but so is the athlete working (in some cases working hard) to develop his skills. The hard work is not by itself indicative of meriting a substantial reward from society. Working hard to do something which is of little benefit to anyone ought not necessarily to be financially rewarding. The reasons and mechanics are too long and complex to go into here, but it is merely an accidental, though entrenched, characteristic of our economic principles and system (coupled with what science and technology at any given point in history make it possible to mass produce) which allows activities of relatively little significant benefit or contribution to society to earn great income. But that does not need to be compounded by consciously bestowing additional rewards that do not need to be given.

Of course, if school testing success, coupled with a college education, were indicative of future accomplishment and contribution to society, it would be reasonable for society to bestow college opportunities, and the ensuing additional benefits of college degrees, on good students. But that is not necessarily the case. The best students, grade-wise, do not generally go on to the most lucrative work; and financial success is not always indicative of accomplishment or contribution anyway. There are no obvious, really good (automatic) measures of those things, but I think it is fairly easy to make the case that just because someone can do well on the academic equivalent of a decade of quiz shows, they do not necessarily deserve the benefits of a college scholarship or a college education more than someone who does not do as well.

It is, of course, not that students who test well in school are undeserving of college attendance and college scholarships, or of future lucrative job offers based on college grades determined by tests; it is that their getting good grades based on tests is not a reasonable sign of their being deserving of these things. It is not a sign of how much college will help them benefit humanity more than it will help someone with lesser test grades. Good test grades will not even signal who will learn the most or get the most out of college, let alone do something with it. This is particularly easy to see when colleges take in older ("non-traditional") students who had poor high school grades, but who became motivated to learn after they had left high school and matured. Such students often do far better, and learn far more, than their earlier grades would have indicated. Conversely, poorly motivated or emotionally immature students tend to have trouble with college even if their high school grades were good and they have just graduated. Colleges have begun to look at performance other than grades, but they need to accelerate that effort and they need to find indicators (and thereby set standards) that have more to do with a student's desire and ability to learn for the sake of knowledge and wisdom (not a grade), for the sake of being able to make a future contribution to society, and for the sake of personal, professional, and social development.

There is a case different from all the above, however, which is also ethically wrong, for a different kind of reason. There are some teachers who proudly claim to use graded tests in a way that seems particularly egregious to me -- those who boast that they give tests that teach students important things about the material while they are taking them, what they often call "teaching tests". Normally these are tests that require students to use material they have learned in such a way that they can achieve an insight into it or perspective on it that they had not realized before, and that they might not ever have realized without confronting such a situation or question as the one raised by the test. When such tests do teach what they intend to teach, that is a good thing in terms of being educational, of course. But since such tests, when they work this way, normally do so only for a few students, they are an unfair teaching tool: they penalize those students who do not learn from them for the failure of the method to work as a teaching tool. Teachers who withhold or "save" the sorts of questions which they think are "teaching" or "educational" questions until giving them to the students on a graded test are essentially not teaching all they can to students during their teaching time, particularly when the test in question is a final exam, so that those who do not attain the insight expected will not have been taught by the teacher what the teacher thinks they should learn. And that is what makes this particular approach immoral; it is not teaching, but the intentional withholding of instruction. If teachers have a question or challenge for students that they think will foster insights students ought to have, they ought to present it to the students during instructional time, when it can be discussed after the students have responded, not when it is too late for the students to benefit from it, and not when their failing to benefit from it actually penalizes them.

Finally, but not of least importance, it seems to me that grading based on testing sets the wrong, artificial, and too low standards for students, even when the testing is stringent or difficult. And it often also "burns students out" and makes them not want to do their best over the long run. Tests "teach" students to work for extrinsic reward, not for more meaningful reasons. What you want is for students to be nurtured to do their best to learn and to be creative on a daily basis most of their lives, not to be nurtured to learn to cram for certain special occasions and then to coast or be lazy the rest of the time, either because they are exhausted or because they see no need to work other than to pass tests established by someone else for which there is an extrinsic reward. Insofar as students can score well without much effort, they have little incentive to learn more than they have to. And if there is substantial work involved in getting good grades, but that work is not the best kind of work to do apart from grades, students who are motivated by grades will do the kind of work required for a good grade, rather than the overall better kind of work. For example, students will often just want to know how to work certain kinds of math or science problems, and not care to try to understand the underlying principles involved. Tests foster the "I don't need to know that for the test" syndrome, or the "That's not part of the assignment" syndrome, or the "How many words/pages does this writing assignment need to be?" mentality, instead of the notion that one needs to be as thorough and complete, or as creative or inventive or insightful about the topic, as one can. Tests not only fail to distinguish merit from non-merit; they also establish the wrong goals for which students need to strive.