Student Assessment: April 2015

[This post was in response to comments on the STLHE listserv, but I got a little too long-winded for the length limits in that forum, so I posted my response here for whoever is interested.]

Quoting:

I don't think it matters to me that there is and there is not a natural bell curve, what i have tried to say that in large classes there seems to be a normal distribution that appears if the class is designed to have both failures and excellences. It is not a question of naturalness, but that there is a curve that appears where students tend to cluster around the midpoint of system of evaluation and it tends to look like a normal distribution model. Now that is in part from the design of the class, but what i also have said that there is agency of students across the whole spectrum of grades.

But um.... It is in fact a question of belief.

First, standardized tests produce a normal curve because when an item fails to produce a normal curve, it is pronounced too easy or too hard (or in some other way flawed) and removed from the test bank. When I worked for the provincial examination branch, the minister would phone down each term and say what he would like the test average and range to be that year, and I always hit within ½% of that target. There was only ever one test development specialist whose test was 20% off that target, and the next week he was literally removed from that position and placed in charge of the loading dock. Tests have normal curves because we expect that outcome and ensure that it happens (assuming the test population is large enough). It is an artifact of how we choose to assess.

It should be obvious that if we use competency-based assessment, we do NOT get a normal curve. If we are going for mastery of skills and knowledge, we get almost everybody to the same level of mastery--if we are any damn good as instructors at all. I've given out like 6 Ds in 20 years of teaching, because one really has to work at screwing up my assignments--but nobody who looks at the product of our student assessments could say that our standards are low. Rather, because we have authentic assessments tied directly to the professional skills the students actually want to learn, we get success, not a lot of failure.

Norm-referenced assessment serves the purpose of sorting students from top to bottom, not assessing their competencies. Norm referencing makes sense only if one is 'filtering' -- i.e., has adopted the 'talent hunt' model. In this model, one is seeking to identify that minority of students who may go on to become professional chemists or lawyers or engineers; the rest are considered chaff to be stripped out at the earliest opportunity. Some of us are offended at the idea that any student is 'chaff' (well, I did hand out six 'D's as hopeless cases, so not an absolute here :-).

This 'talent hunt' model then leads to modes of assessment that identify students with 'superior' knowledge and skills -- the problem being that these superior skills turn out to be based as much on social capital as on actual ability. Students who competitively succeed are those who come from the dominant cultural group; who are native to the language of instruction and that mode of thinking; whose parents have supplied them with the right computers, books and tutors to out-compete those with less capital. They are the students who don't have to work a nine-to-five job to put themselves through university; the students who don't have the family responsibilities of single parents, of elder care, or of the extended families of First Nations and immigrant populations.

To take one example: when I challenged one colleague in the hard sciences about the logic of including test items on material not actually covered in the curriculum or his lectures, he replied that he was looking for the exceptional student, the one who saw beyond mere course material. He pointed out several students who had done well on these untaught questions as proof they could be done by superior students. So I tracked down those students with him and asked them how they were able to answer the question -- two said it was because they had taken prerequisites out of order and had in fact happened to have been taught the relevant skill in another course; and the third had had a private tutor who had anticipated the out-of-course knowledge from this instructor. Where the instructor thought he was evaluating 'superior intuitive knowledge and skill', he was in fact measuring 'luck'. Realizing that he was being 'tricked' into promoting the 'wrong' students as superior, and further hearing me say that he was alienating the other students from his discipline by treating them as chaff (which might not be a good long-term strategy if he wanted an informed citizenry on his topic), he stopped doing that and started assessing what he actually taught.

If one cannot demonstrate meaningful gaps between the top and bottom students on a bell curve, then what is the point of the curve other than to justify social inequalities? "I'm sorry you don't get to go on to next term, but you were ½ a mark lower than this other guy." It is ludicrous on the face of it! Our assessment instruments are not sufficiently accurate or reliable to allow these judgements.

If, on the other hand, you can demonstrate that the differences between the top and bottom of the bell curve in your class do in fact represent significant differences in competencies, then why not move to a competency-based assessment system and forget worrying about the curve? Of course, if one uses competency as the basis of grading, the question immediately becomes, why did students admitted to the program fail to achieve the minimum required competency to go on? I'm prepared to accept that for some students, it's because student life in the bar/bed distracted them from studying to the point where they failed to live up to their potential. (My six 'D' students, e.g.) But otherwise, if they are good enough to be admitted, and then fail to acquire the skills one is hired to teach them, then something is wrong with the instruction. (NOT, he hastened to clarify, necessarily the instructor; just that somewhere in the curriculum, instruction, peer group, racism/sexism/etc. mix, something is interfering with the outcomes promised to students in the admissions process.)

If the admissions process is designed to collect fees from 30% or more of the entering population for which officials have no provision for teaching in subsequent semesters, that's pretty much just straight fraud. I personally would not be okay with being complicit in that system. Further, telling students '⅔ of you won't be here next year' (or whatever) creates a violently competitive structure that intensifies discrimination (against women, minorities and any other criterion one can invent to promote oneself over others); intensifies cheating; undermines group work (though don't get me started on what else is wrong with group work!); and ensures the absence of any collaborative learning within the class. (Ensuring students are trying to undermine rather than help each other instantly removes a significant element from the successful learning process.) Whatever the instructor is trying to do in the classroom, a competitive environment will undermine it. A competitive environment intensifies the problems that discriminate against students who don't come from the dominant population. It creates an environment that leaves so many students below their potential that one can produce a population one feels comfortable in failing. (That's called a self-fulfilling prophecy.) So one needs to understand that the failure is an artefact of the system, not a force of nature.

I don't have any hesitation handing out Ds and Cs when they are deserved, but I'd rather work with students to help them get the skills and knowledge necessary to be successful in the class. (I can't remember giving out an 'F' for course work, but I do get the occasional student self-selecting out in the first week who perhaps saw an 'F' in their future after reading the course outline.)

I'm lucky that for most of my career, entrance requirements ran so high I never saw a student who couldn't have mastered the skills I was teaching; I sympathize with those caught in a system where massive first-year intakes and low admission standards make it hard to succeed. But, um, agitate to change the system; don't buy into it as inevitable.

So, I would argue, it is indeed very important whether one "thinks it matters to me that there is and there is not a natural bell curve". It matters. It matters a lot whether it is ideology or something real.

Let us assume that the student body is varied with varied motivations and varied commitments, in fact let's just assume students have rich and complicated lives that precludes some students from always turning in their material, thus some students will not pass some of the assessments.

Here the author is assessing whether students have lives, rather than assessing their abilities. Nothing in "varied motivations and commitments" is related to gender, class, ethnicity etc., eh?
Head:Desk.
So you're saying you're okay failing the single mom who missed an assignment because her child was sick, in contrast to the white male living in his mom's basement who was able to get it in on time? That the guy deserves the better grade, will make the better graduate?

this is where the normal curve comes from.

That's what I'm arguing too, only my antecedent for 'this' is irrational biases rather than your presumption that 'this' signifies effort or talent.

you can play with a model and populate it with different assumptions, but as long as you let some students perform differentially across multiple modes of evaluation, it looks like a normal curve in large classes of 300 or more.

Or, you set up authentic assessments that represent the actual skills required and give students sufficient resources and time to achieve to that level.

so long as you allow for student agency and their capacities, you will get some variety of a normal outcome where some students will have over the course performed ostensibly always excellent, most will have performed around the C range, and some will have performed in the lower ranges.

Again, my argument is that "their capacities" may not be talent and ability in the subject, but their ability to meet the logistical framework arbitrarily imposed upon them; that their having met the entrance requirements (unless these are fraudulently low) means that by definition they have the intellectual and academic capacity; and that they are therefore being screened out by the artefact of the curve.

that's all i'm saying, in large courses with designs the way we are supposed to design,

"we are supposed to design" What you are hearing on this list is people saying, "that's not how you are supposed to design it". It is a common design, for sure, and one supported by hegemonic ideology of meritocracy, but not all of us subscribe to that 'talent hunt' model. Will have to agree to disagree that this is an inevitable, mandatory, unchangeable design.

Can you design that away, i argue no, because students have agency. can you teach it away, i'd need proof because i've not seen it.

Will have to agree to disagree. Haven't seen a normal curve in my faculty in 20 years, and our grads have a top reputation in their field.

Does this analysis work for small classes, no, it does not.

Well, we agree on that. And I certainly have to acknowledge that teaching large classes of 300 or more introduces a lot of logistical barriers to change, and that it is a bit facile for the rest of us to throw bricks.

Hey, thanks for standing up for your perspective. Feels a bit like we kind of ganged up on you there. My sincere apologies for wherever I let my enthusiasm get the better of me... I'm told I sound better in person than my in-print personality would tend to indicate.