More About Evaluations: Follow-Up to "The Concept of Teaching 'To' the Test"
Rick Garlikov

There is a school of thought prevalent (possibly even dominant) in education today (including college courses) that for a course to be fairly graded, students have to be told from the beginning, in ways they understand, what is expected of them to get a good grade.  On the surface, or with certain caveats, that is a reasonable idea.  However, many students do not interpret it reasonably, and it is difficult to tell whether many administrators do, because student evaluations of teachers in regard to this seem to carry some (usually unclear) weight with administrators even when those evaluations are based on unreasonable judgment.  I include this as a follow-up to the essay in the title because it seems to me to be a variation of the conceptual confusion surrounding the concept of "teaching to the test" -- as in "teaching to the course directions, expectations, or rubric".

The problem is that many students want to know so specifically what they need to learn or (be able to) do, that they are essentially asking for what the answers are or where they specifically should look to find them, and they want any questions or assignments to be worded in ways that make it easy for them to just pick out the answers if they look where they are told.  While that is fine for teaching something like engine dismantling and rebuilding, which has particular ordered steps and only those steps that need to be learned (in their entirety) correctly, it does not serve for teaching more general or abstract concepts, ideas, and principles from which to deduce particular answers.  For example, if the idea is to test to see whether students can add or subtract in general, it does no good to tell students ahead of time which numbers they will be asked and expected to add or subtract correctly on the test, because the lazy or less responsible and/or less curious students (for want of a better or more neutral description that would seem as appropriate) will simply learn those particular answers and feel aggrieved if a different set of calculations is asked or if a "word problem" is given that requires subtraction without saying subtraction is necessary or what to subtract.  And if you just tell them the test will be on "addition and subtraction," that is considered by many people to be too unspecific to be fair.  So the idea of describing student expectations for a course and "teaching to them" to make grading of students fair seems to me to be very similar to the issue of "teaching to the test."

As the subject matter for a course becomes even more general, or involves more general principles, concepts, or ideas that students are meant to study until they understand them and can apply them in an open-ended variety of situations (such as "word problems" in math, which require figuring out which calculations to use, as opposed to doing calculations already set up for the student), students seeking the wrong sorts of specific direction become lost and indignant.  Their criteria for what is fair and reasonable are themselves unreasonable and unfair.  Those criteria cannot then reasonably be the standard for what should be expected of teachers.  The pedagogical problem is that general or abstract principles, concepts, and ideas often require concrete examples in order to explain them, but the concrete examples are not themselves the concepts, principles, or ideas; the concrete examples only embody them.  Students need to think about the concrete examples in order to extract the abstract ideas from them and have the insights intended.  Then the description of the principle, concept, or idea will make sense.  Students neither inclined nor motivated to "think" in those ways then have a difficult time learning or even understanding what there is to be learned.  Higher order thinking skills are not something that can be taught by rote or recipe or spelled out in bullet points.  Students who think that is unfair may as well say that trying to discover a cure for any disease or invent a solution to any problem is unfair because God or nature doesn't tell us how to do it, where to find the answer, which questions to ask, or specifically how to figure it all out.  Higher order thinking skills can be exemplified, modeled, and abstractly explained, but not described by a set of lower order thinking skills that spell out specifically how to think or solve problems, which is what students often want and consider to be telling them what is expected of them.

This does not mean, however, that teachers have no responsibility to explain what is expected in the course, to explain the material in ways industrious, responsible students can understand, and to limit the scope of what is tested to what is reasonable for conscientious students to be able to grasp.  While there may be some disagreements in borderline cases about what constitutes fair and reasonable expectations of what students should have learned, clearly students do not need to be given directions so specific that they require almost no thinking or learning to follow in order to get high grades, and just as clearly the volume or complexity of material cannot be so overwhelming and the teaching of it so inadequate that few if any students can learn it well enough to pass the course or get a fair or reasonable grade.  It is not just an issue of the volume and complexity of the material, but also of whether the teaching is adequate and commensurate with that volume and complexity, so that the expectation of the students' learning it is fair.  And lazy or less responsible students cannot rightly be the judge; nor their results, the criteria.

That being said, it may be unreasonable to put students with different levels of responsibility outside the course into the same course with the same expectations, or to make, or allow, students with greater outside responsibilities to carry other courses at the same time (which essentially adds to their responsibilities outside of each course).  In my most recent teaching, online, many of the students are not "traditional", meaning they are not students going to college full time right after graduating from high school.  Many students have families and full time jobs and may also be taking too many courses, trying to do well in each of their roles.  It would be unreasonable to expect them to spend as much time studying as a single, non-parent student who has at most a part-time job as their only other ongoing responsibility.  But since it is reasonable to expect them to learn what is necessary to justify receiving credit for knowing what they are supposed to know to justify their degree or certification, courses, course loads, counseling, and curricula should be designed to accommodate both the time constraints of students and the academic meaningfulness or significance of passing a course or earning a degree, which should be an honest acknowledgement that a student has learned what s/he should.  For example, students with more time to study might be able or expected to take more concentrated courses, whereas those with less time to study may have to take courses where the material is spread out more in time or divided into two courses.  If they have less time per term to study, they may have to take more courses over more terms to gain the requisite knowledge and skill.

Different Purposes of Testing: Motivating Study, Weeding Out (or Grading for Ranking), and Enhancing/Facilitating Learning/Teaching

The idea of evaluating students based on whether they have met expectations that were explained to them from the beginning usually is thought to involve testing them via exams, and so, again, this is related to the idea of "teaching to the test", where part of the teaching is explaining to students what they are to learn and will be graded on, not simply teaching them material all of which they are supposed to learn and be able to use, and all of which they are responsible for being able to answer satisfactorily on an exam or any other sort of test.  In a way, the simple answer to the question "What are we responsible for learning in this course?" would be "All the material presented to you in the course."  But again, for some reason, many students today (and at least some administrators apparently) think that answer too vague, unfair, and inappropriate, though it was not always considered that way.

There are ways to evaluate student learning without giving formal exams or quizzes or assigning papers; one can do it through daily dialogue, for example, with well-crafted questions or comments.  And giving exams in a course can have different purposes, not just to evaluate student learning: 1) extrinsically motivating students to study for a high or passing grade, if they are not motivated intrinsically by interest or curiosity in (learning) the material for its own sake, 2) identifying those students not ready to go on, by failing those who have not achieved the minimal knowledge or skill level they should have, and determining which students are better or worse students in a course for some sort of credentialing purpose, presumably based on how much each has learned (often called summative testing), and/or 3) facilitating learning and teaching by helping students and teachers know any deficiencies that need to be remedied (often called formative testing).

I have written extensively about the problems of testing for ranking or even for ascertaining student knowledge or ability (www.garlikov.com), as have many other people, so I am not inclined to formally test students with exams in order to rank them or even to know whether they have gained the minimal knowledge necessary.  I am interested in using much back and forth dialogue to ascertain student knowledge and ability in order to correct any deficiencies or mistaken ideas they have.  And until recently I didn't believe in using tests or exams to motivate students extrinsically to learn.  I always believed that I had the ability to make subjects interesting and challenging enough to get anyone interested in them.  I had much success doing that for years in my classrooms and in presentations to people of all ages, from young children to senior citizens, and still do among many students.  But I have found what seem to be an increasing proportion or number of students in college these days who seem motivated to study difficult material primarily or only for a grade, and unfortunately they have often been rewarded with good grades in the past for minimal effort or learning.  So even when tests are announced and the material to be tested is described or explained, many students are still not particularly motivated to put in the kind of effort needed to learn conceptually difficult material or to try to figure out what they are supposed to do.  Part of the problem, I believe, is the notion, as stated at the beginning of this essay, "that for a course to be fairly graded, students have to be told in ways they understand, what is expected of them to get a good grade" when that expectation has to be described so concretely and/or so simply that little or no reflective effort is required on the student's part to meet it or even to know what it means.

And it is difficult to ferret out what the specific problem is with any given student who is not doing well, because students are often reticent to tell teachers what they don't know or understand, either about the subject matter or about the course expectations of them.  Often they mistakenly think everyone else understands the material and what they are supposed to do, and it would be embarrassing to admit they themselves do not.  Often they think the teacher will think them stupid if they let the teacher know they don't understand it.  Often they will think the teacher will not be interested in explaining it in a different way, or that if s/he does, it will be with disdain or still at too high a level to be intelligible.  There are many issues to try to navigate in teaching students who are reticent to say much about the subject matter itself or how they go about studying it or what they are trying to learn.  And in particular, it is difficult to tell whether students who do not do well when simply giving answers to test questions, without much elaboration, are lazy, negligent, careless, unaware of what they are supposed to be studying and learning, or just finding the material difficult for some reason the teacher needs to understand and overcome.  This is particularly true when test questions or evaluation criteria are primarily "one-way" in the sense of asking the student something but then not having back and forth dialogue to go beyond the surface of that answer.

As I wrote about elsewhere, my older daughter's third grade teacher gave a test one time where students were to tell whether a group of words constituted a sentence or not.  The teacher told me all the students missed the statement "Tom is sleeping," saying that it was not a sentence.  She then said "I don't know why they all missed that."  I said "Why don't you ask them why they said it was not a sentence?"  She didn't understand the point of that and didn't do it.  I asked some of them outside of the class why they thought it was not a sentence, and they said "because the teacher said sentences had to have 'naming' words [meaning the subject of the sentence] and 'action' words -- and Tom was not doing any action; he was just sleeping.  So there was no action word, and it was not a sentence."  There was nothing wrong with giving that group of words on the test, but there was something wrong with not following up to see why the students missed it (whether any students or all students), and there was something wrong with using it as part of the students' grades, because it was not their fault they missed it, as following up properly would have shown.  The explanation they had been given for deciding what made a group of words a sentence or not was ambiguous in a way that was accidentally brought out by this test question.  The teacher should have asked why they missed it, and even if she didn't, the students should have protested its being counted wrong, given their understanding of the teacher's explanation, saying something like "But you said sentences needed an action word, and there is no action word in this sentence.  He is just sleeping, not doing anything."  But without follow-up conversation after the test question is answered, and without an explanation given in the answer itself, students are just marked as being wrong in their answer and the course goes on without genuine teaching or learning about that particular topic.

Now, using tests properly is not always easy to do, particularly with reticent students.  I believe I made a strategic pedagogical error in the design of a test I used over the last several terms I taught.  This was for an online course, and its being online may have exacerbated the problem or at least not allowed for it to be ameliorated or prevented, because there is not the volume of feedback in an online course that there is in an onground one.  [See Online Versus Onground Teaching (Word document) (HTML Webpage) for the full explanation of that, but the main point is that online courses allow/require higher quality student responses and teacher feedback to make up for the loss of the typical higher quantity/lower quality feedback of an onground classroom.  The greater volume of feedback onground is because feedback is immediate onground in a group all there at the same time.  Online courses tend to have time-lapses between communications, and students are not in class at the same time.]  I was required to give at least one exam that was proctored.  Normally I prefer to teach without giving exams (unless students request them because they think their grade otherwise incorrect, or unless a student's performance is borderline or insufficient for me to know from class what s/he understands and knows), instead simply evaluating student performance on a daily and overall basis, responding to what they say that is correct or mistaken, and trying to foster their ultimately getting right what they are supposed to know.  In the past that pretty much worked, and no students who requested an exam ever tested higher or lower than expected.  And up until relatively recently, the only students who even disputed a grade were students who argued that I was grading them too high, and that they deserved a lower grade.  Today's students are on average not like that.  Many don't want to have to learn any more than is necessary for a grade when they are told (too) specifically ahead of time what will be tested or required in a paper or other assignment.  And they only want to have to memorize answers to be given back to the teacher on an exam, or to be told what to find and where to find it in order to state it on the exam.

Insofar as that is the case, as it seems to be, it is difficult or impossible to design a test that will both motivate students to learn more than you can test and tell them, in ways they easily understand, what you are testing.  The last few terms I designed what I mistakenly thought was a clearly described and clearly fair test which was so comprehensive that in order to answer the questions on it, students would see they had to understand all the material being covered, and would be able to tell when they had done that.  Plus, the material being covered was essential for them to know early in the course in order to do well in the course as it progressed.  The test was given after the third week of a nine-week term.  But I gave them all a copy of the exact test from the beginning of the term so that they could see exactly what they had to know.  And they were allowed to mark the answers on their copy and simply copy them onto the exam when they took it.  It was all multiple choice, multiple answer, and true/false types of questions.  Many students still did poorly, and those who did usually complained they didn't know what was expected of them in the course, and that I had not told them -- even though much of the test was on the material that explained what was expected of them and which was repeatedly stated during the discussions whenever anyone failed to do what was required.  Those were also the students who gave minimal answers to the discussion questions each week and who did not respond to challenges to their answers in order to defend or amend them.  They were also the students who did not read, as they were supposed to, their classmates' responses and my replies in order not to make the same errors already pointed out.  So basically it seemed that a number of students taking online courses did not like to read or did not do it well, and then said the course expectations were not clear and were thus unfair.  For the most part, with an odd outlier here and there, students who did well on the exam also did proportionately well on the discussion questions and received proportionately higher course grades.

So my thinking now is that 1) I was mistaken that students would study all the material in order to be able to answer correctly the questions that were given ahead of time, 2) I was mistaken that they would see they needed to study all the material, for their understanding, in order to get the right answers to those questions, and 3) I was mistaken that they would be motivated to learn all they could so they would learn more and do well grade-wise.  So this term, I am trying a different approach that involves changing the exam (which is only a portion of their course grade) to make it more motivating for fuller learning, and only moderately summatively or formatively evaluating, since I don't see how to do all three with a test -- because if you tell them too much of what they need to know, they don't learn any more than that, and if you tell them too little, they mistakenly claim it is unfair.  The questions will not be given to the students ahead of time, but the location of the body of material to be covered will be pointed out to them.  For extrinsic motivational purposes (i.e., a passing or a higher grade), they will be told they are responsible for all we cover.  But to make the exam fair summatively, and actually even more than fair, I am making the questions quite easy for students who pay any attention at all to what is going on in the course for the seven weeks prior to their taking the exam (this time to be given near the end of the course, in week 8).  Hopefully all students will do well on it, but that was the hope for the previous terms' exam procedure as well.  Since the exam is only a portion of their grade, and since what they show on a daily basis that they understand and can do is more rigorously evaluated and addressed, making the exam unexpectedly easy does not "dumb down" the course, particularly if it works to make them study harder because they won't know what will be on it.

The expectation of all students doing well, however, goes against a different purpose for testing, one common (unfortunately common, I think) in the idea of what tests are supposed to do -- serially rank and/or weed out students.  A standard view of "item analysis" on tests is that 30 to 70 (or 80, depending on whom you read) percent of students should get a test item right, or else the item is either too easy (if more get it right) or too difficult (if fewer get it right), and that questions should be able to distinguish better and worse students by their degree of difficulty -- where better and worse students are considered to be those who do well or poorly on the test.  This makes no sense to me in regard to classroom teaching, where the goal should be to make sure all students know the material as well as possible; if you have all industrious students who are competent, the material is appropriate for the course level, and the teacher teaches it well, they should all be able to get A's.  Moreover, if the students are not industrious or responsible and do not study, there is no reason to design the test so that at least 20 or 30 percent of them pass all or any items on the test.  Moreover, insofar as students learn or fail to learn what they should know, there is no point in devising a test that ranks them in a hierarchy that shows otherwise just because you can contrive questions that will do that.  And there is no point either in "dumbing down" the course material or expectations so that some students can pass the course or get a good grade in it to make it look like they are competent at a difficult task when in fact they are not.

For example, if a test were to be about the steps in taking apart and reconstructing an automobile engine, or in doing a heart or liver transplant correctly, that test will be difficult simply because the subject matter is difficult.  But if you give the test to industrious mechanics students or to Harvard medical residents in cardiac surgery, and they all pass it perfectly, that will not show the test is therefore too easy.  The test is difficult no matter what the grades on it; and if the students are all good, and all learn the material because they study and it is well taught, that does not then show that the items on the test, or the test itself, are too easy or flawed.  And if none of the students in a less industrious or less stellar group learn how to do those tasks, it would be silly to have a test that is so easy that it somehow certifies some have learned it.  Creating tests that arbitrarily rank students is a pointless goal.  While there may be a need to assess (and then, as a natural result, serially rank) student learning because not all students learn what they should, that is a different matter from making test questions meet an arbitrary standard of difficulty that circularly and artificially causes such a ranking.

But my courses are philosophy courses (normally introductory level ones), where I am trying to get students primarily to be able to reason well in general and about certain topics in particular.  That involves trying to teach them higher order thinking and communications skills.  It is not amenable to giving them an exact description of what they need to know and be able to say or repeat on an exam.  And since my goal in my courses is to teach all students what is reasonable to expect them to be able to learn with a reasonable amount of reading and reflection, I am not trying to rank students by test questions that artificially separate and order them into a hierarchy of grades.  My goal is to teach and to intrinsically motivate them all well enough for them to do the work necessary so that everyone gets a deserved A.  That doesn't come as close to happening now, online or onground, as it used to onground, where almost everyone got A's or B's, with an occasional C.  Moreover, I was told years later that in one course I taught for three hours on Monday nights, students congregated in the parking lot outside and discussed and debated with each other the issues raised for an hour more.  They did not do that for a grade.  Nowadays, however, some students get F's because they don't even participate some weeks at all, and many students get C's or D's because they don't follow even minimal directions/requirements.  The students who follow the spirit of the requirements and who do all their work in order to learn all they can still get the A's and B's.

Those are also the students who seem well-motivated, industrious, conscientious, and responsible in other ways, such as the students who often have to go out of town on business where there is no internet access, and who thus turn in assignments early, let me know they will be out of pocket, and find other ways to make up for that.  Or students in the military in a war zone who sometimes have to go on missions that require communication silence, but who still manage to get their work done well.  Poor students, by contrast, are often those who have a multitude of lame excuses for why they couldn't do their work on time.

Typically any reasonable exam will separate and correctly rank such students for the most part, but I have found that the same course in different terms will have very different distributions of quality of students: in some classes most of the students are conscientious and get A's or B's, participating extremely well and being obviously stimulated and challenged by the material, while in other classes (taught at the same time and/or in the same way) students are less motivated, less responsive, give much worse answers, and make much less progress.  Hence, I see the role of the exam in my classroom as being more about fostering motivation of lazy or less responsible students than about weeding out or ranking students.  I already know from their discussions each week what each student knows and how well they reason.  I don't like having to give such extraneous motivation, because I think the material itself should be interesting and clear enough to be stimulating, but since I am required now to give an exam, and since proportionally fewer students today seem intellectually curious or motivated, it seems to me more appropriate to make exams motivational for study than to use them merely to assess student knowledge or to rank students hierarchically based on an arbitrary degree of difficulty.  In order to make exams motivate study (for the external reward of the grade), it seems to me students need to be told less, rather than more, about what specifically will be on them.

And it seems to me students likewise need to be told less specifically, rather than more, what needs to be in the papers they write or what they should do for other assignments.  They should be given sufficient information to know what the assignment is and generally what is expected.  They should perhaps be given some good and some bad examples with explanations of what makes them good or bad.  Quality should be stressed in general without just giving formulas that supposedly will automatically bring it about.

But teachers need to be careful about trying to be too descriptive or trying to be too "objective", such as in requiring too narrow a word-count range, or other kinds of quantitative substitutes for quality.  For example, in the grading rubric for discussion questions, many teachers who seem to mistake quantity for quality (or who think requiring quantity will inspire or cause higher quality) require students to respond to two or three other students' answers.  But all that typically fosters is a plethora of perfunctory, meaningless comments, such as "I thought your answer showed good thinking and it helped me understand this question much better.  Was it difficult for you to do it?  How did you figure out what you did?  Really good job."  Such responses are a waste of time to write and they waste everyone else's time to read.  They do not serve to improve understanding of the material, either by explaining what was wrong with the first student's answer or by giving more evidence in support of it.  Nor do they expand on anything good in the original answer.  A good response should do at least one of those things.  If a response cannot do that, it is better left unposted.

So while evaluations within a course need to be fair, and expectations for a course need to be articulated, that does not mean that expectations need to be described so simplistically and tested in ways that give undeserved high or passing grades to those students unwilling to work or think for themselves to understand and apply the material.  And exams and evaluations can be fair without giving exact specifications about how to do them successfully or what the answers should be.  And in some cases, particularly evaluating higher order thinking skills or creativity, it defeats the purpose to give the students too specific a description of what they are expected to be able to know or do.  That is one of the problems with using standardized tests to evaluate students for higher order thinking and communication skills: mercenary companies, or teachers trying to artificially create higher test scores among their students, co-opt the tests and teach too closely to them, and if the tests are not changed sufficiently each time they are given (which they normally are not), students can use lower order skills to give the appearance of higher order ability.  As in teaching to any test, if you are given the question and the answer ahead of time, and you can simply remember it for the test, then no matter how complex or difficult the question would be for someone who has not seen it before, your answer does not show the same thing about your ability as theirs would, even if you both give the same answer.

I used to teach my children different kinds of things while driving them various places, and one of the things I did with math was to give them math puzzles.  One such exercise was to give sequences of numbers in which they had to detect the pattern in order to figure out the next number.  They were pretty good at that, and I chose sequences I thought were within their grasp.  One day, just to make it more of a challenge and hopefully more fun, I alternated the numbers in a sequence so that there were two patterns -- one for the first, third, fifth, etc. numbers, and one for the second, fourth, sixth, etc. numbers.  They saw through it after a little bit of thinking and got it right.  It took them a few minutes.  The next day in school, when my third-grade daughter was answering all the math questions too quickly for the teacher's taste, the teacher decided to stump her and gave such a sequence just to slow her down a bit.  It didn't.  She got the answer right away.  The teacher was astonished, amused, and exasperated at the same time and asked my daughter how she got it so quickly.  My daughter just said "My dad gave me a problem just like that in the car yesterday."  The teacher was somewhat relieved to know she wasn't having to deal with a math genius, but the point is that my daughter showed me much higher level thinking skill in the car than she showed the teacher in the classroom, though until the teacher was told how she figured out the answer so fast, it gave the appearance of demonstrating higher level thinking.  Specificity of teaching to a test or to expressed expectations can skew the results of any test or evaluation procedure.  It is unreasonable to have to spell out in specific detail what will be tested or what students should be expected to do, when such specificity defeats the purpose of the evaluation and gives artificially high, skewed results which make it look as though students have knowledge or skills they in fact do not.