Student Assessment: August 2011

Wednesday, August 31, 2011

Questions Answered.

Q: How many alternatives should a multiple-choice question have?

A: Unless you have a specific subject that demands more than five possible alternatives (say, in a unit on the solar system, having all 8 planets as alternatives might make sense -- though a matching question would probably work better!), don't do it! The more alternatives, the worse your question is likely to be. Four or five alternatives are standard on professionally designed tests (since statistically, this reduces the score students could get by pure chance to 25% or 20% respectively). Some people want to pile on alternatives to make the questions 'tougher' and to 'eliminate chance'. But here's the thing -- having 7 alternatives does reduce the chance of students getting the question right by blind guessing, but why bother? Once they've gotten less than 20%, how much more failed do they need to be? And while more alternatives theoretically reduce the impact of chance on getting the right answer, the more answers the student has to read through, the more it becomes a reading test rather than a test of the subject matter. So do we really want to eliminate chance factors by penalizing poor readers, ESL students, and so on? Most test designers agree the trade-off isn't worth it!

And in the real world, coming up with 7 or 8 credible alternatives becomes REALLY hard. Again, unless there are an obvious 8 possible choices like the 8 planets of the solar system, you will drive yourself crazy trying to come up with credible but clearly wrong answers six, seven and eight. Why do this to yourself?

Or, people will do terrible things like having alternative 5 as "A and B, but not C". No, no, no! Never do this! It becomes a test of reading and logic rather than subject knowledge. Students will hate you, with justification, since the test will not be an accurate reflection of what they know -- indeed, some research suggests that this INCREASES the importance of luck...

But here's the killer -- research in the early 1990s suggested that the overall quality of tests DECLINED with the increase in the number of alternatives per question. Everyone who has ever designed an mc test knows that coming up with the answer is easy, the first two wrong answers pretty easy, it's alternative 4 that's tough, and #5 is almost impossible -- the higher you go, the more desperate one becomes to fill the last spot. So, a test with seven alternatives will reduce the writer to grasping at straws, and they will end up accepting ridiculous alternatives that even those completely ignorant about the subject will have no trouble eliminating -- a complete waste of space and student reading. And then -- this is where human nature gets interesting -- since I've given up and accepted a stupid alternative for this question in desperation, my standards for writing the next question go down, because even though I know this is a terrible alternative, it is not as bad as the last one. Or, having accepted three weak ones, what's one more? Pretty soon, the test is garbage.

In contrast, tests with 3 alternatives (the right answer and two wrong alternatives) turn out to be easier to write, and therefore are written to a much higher standard. Students perceive them to be much tougher tests! And, research says, they really are more valid and reliable! So, I tell my students to write questions with three (good!) alternatives rather than going for four or five. Professional test designers can go for four or five because we have the time to come up with high quality 'd's and 'e's, but the reality for classroom instructors is that that is not going to happen.

It's true that with only three alternatives, students can get 33% just by blind luck, but um, so what? I don't know any course where 33% is a pass. Failed is failed. And the results of this test will more accurately reflect what students actually know than one with 7 or 8 alternatives.
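To put rough numbers on "failed is failed", here is a minimal sketch of the arithmetic. It assumes a hypothetical 40-question test and a 50% pass mark (neither figure comes from any particular course), and that a blind guesser picks each alternative with equal probability. The function names are made up for illustration.

```python
from math import ceil, comb


def chance_score(num_alternatives: int) -> float:
    """Expected fraction of questions a blind guesser gets right."""
    return 1 / num_alternatives


def prob_pass_by_guessing(num_questions: int,
                          num_alternatives: int,
                          pass_fraction: float = 0.5) -> float:
    """Binomial probability that pure guessing reaches the pass mark."""
    p = 1 / num_alternatives
    # Smallest number of correct answers that counts as a pass.
    needed = ceil(pass_fraction * num_questions)
    return sum(
        comb(num_questions, k) * p**k * (1 - p) ** (num_questions - k)
        for k in range(needed, num_questions + 1)
    )


# Assumed test length of 40 questions; pass mark defaults to 50%.
for k in (3, 4, 5, 7):
    print(k, f"{chance_score(k):.0%}", f"{prob_pass_by_guessing(40, k):.4f}")
```

The first printed column is the number of alternatives, the second is the expected score from guessing, and the third is the chance that guessing alone reaches the assumed pass mark. Under these assumptions that last figure is already small with three alternatives, which is the point: going from three alternatives to seven buys very little.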

Questions Answered.

From time to time strangers email to ask a question on test construction, which I do my best to answer. If they are the sort of questions that I get a lot, I add them to the "Frequently Asked Questions" file on the test construction site; but I think I'll highlight some of them here in the blog as well.

Q: Where is the best place to put the correct answer? For example, if I provide seven choices, does it make a difference if the correct answer is choice 'b' instead of choice 'e'?

A: Professional test designers place the answers randomly -- I mean that in the literal statistical sense of the word, not 'wherever'. They use tables of random numbers, or complicated computer programs that assign the answer randomly, to decide which spot will hold the answer to each question.

What they do NOT do is place it themselves. Research shows that left to our own devices, most people will attempt to 'hide' the correct answer somewhere in the middle of the list. (Nobody wants to put the right answer in 'a', because then the students won't even read the other alternatives you worked so hard on; and putting it in 'e', it just sort of seems to hang out there over the edge. Sticking it in the middle feels right! Even though that's wrong.) Even experienced test construction professionals will unconsciously choose 'c' (or for some individuals, it turns out to be 'b') 3/4 of the time. That's why the rule for taking an mc test is "when in doubt, choose 'c'" -- because unless one takes care to distribute correct answers to get an equal distribution of a, b, c, d, etc., there will be way more 'b's and especially 'c's than other answers, so testwise students can do quite well for themselves simply by answering 'c' to every question. That's why professionals force themselves to do it randomly by using computers or tables of random numbers. And then they'll double check at the end of the test to make sure they have roughly equal numbers of a, b, c, d, etc.
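For anyone curious what "doing it randomly" looks like without a table of random numbers, here is a small sketch using Python's standard `random` module. This is only an illustration, not the software professional test designers actually use; the function names and the 50-question, 4-alternative test are assumptions.

```python
import random
from collections import Counter

LETTERS = "abcde"


def assign_key_positions(num_questions: int, num_alternatives: int, seed=None):
    """Pick a random slot for the correct answer of each question."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    slots = LETTERS[:num_alternatives]
    return [rng.choice(slots) for _ in range(num_questions)]


def key_distribution(keys, num_alternatives: int):
    """Count how often each slot holds the correct answer."""
    counts = Counter(keys)
    return {letter: counts.get(letter, 0) for letter in LETTERS[:num_alternatives]}


# Hypothetical test: 50 questions with 4 alternatives each.
keys = assign_key_positions(50, 4, seed=2011)
print(key_distribution(keys, 4))
```

The second function is the "double check at the end": if one letter is badly over-represented, redraw or adjust a few questions by hand.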

For classroom instructors, I wouldn't bother with tables of random numbers (which are kind of a pain to work with); instead, let the answers fall where they may by pyramiding questions. To stop students from trying to figure out which answer will come next ("there have been three 'd's in a row so the next one must be something else"), you let the internal logic of the question dictate placement. Numerical answers are listed in ascending or descending order; dates in chronological order; single-word answers in alphabetical order; sentences either on the basis of some internal logic or, more usually, shortest to longest or longest to shortest. (Incidentally, this also makes the test look really pretty! People who don't pyramid their tests have really ragged-looking questions.) So if the correct answer turns out to be the longest, it places itself in the 'e' slot, not letting the designer 'hide' it in the middle 'c' spot. After the initial draft of the test is done, one quickly looks through to ensure one has equal numbers of a's, b's, c's, etc. Where there are too many of one, say 'a's, you go through and change some of the ascending questions to descending to move the 'a' to a 'd' or whatever. It works pretty well!
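Pyramiding is easy enough to do by eye, but the idea can be sketched in a few lines of Python. The `pyramid` function and the sample alternatives below are invented for illustration; the sort key defaults to length (shortest to longest), and you would pass a different key for numbers, dates, or alphabetical order.

```python
def pyramid(alternatives, correct, key=len, descending=False):
    """Order alternatives by the question's own logic.

    Returns the ordered list and the letter where the correct answer landed.
    """
    ordered = sorted(alternatives, key=key, reverse=descending)
    letter = "abcdefgh"[ordered.index(correct)]
    return ordered, letter


# Numerical answers: sort by value rather than by length.
ascending = pyramid([12, 3, 48], correct=48, key=lambda n: n)
flipped = pyramid([12, 3, 48], correct=48, key=lambda n: n, descending=True)
print(ascending)
print(flipped)

# Phrases: the default key is length, shortest to longest.
print(pyramid(["a dog", "cat", "elephants"], correct="cat"))
```

Flipping a question from ascending to descending is the rebalancing move described above: the correct answer jumps to the mirror-image slot. One thing the sketch makes obvious is that with an odd number of alternatives, an answer sitting in the exact middle stays put when you flip, so pick a different question to flip in that case.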