A year ago, I argued on this site that our collective worry over grade inflation was not really warranted. It’s not that I think we should be giving A’s to all of our students. But, I wrote, if you work hard to grade your students fairly and end up with a top-heavy grade distribution, don’t sweat it. Sometimes your students really did perform well.
Rereading that piece — and the torrent of angry comments it inspired — I realized that part of my skepticism of the idea of grade-inflation-as-problem is a suspicion of grades as objective and clearly meaningful. The “case” against grade inflation assumes that a final grade offers an objective and transparent measurement of a student’s performance. For a grade to be seen as “inflated,” one has to assume that there’s a (lower) grade that the student was supposed to have earned.
Again, I think instructors need to take assessment very seriously, and should do their absolute best to evaluate students fairly. But we shouldn’t kid ourselves and believe that most faculty — especially those of us who cannot rely on multiple-choice exams — are able to exactly quantify a student’s performance over a semester. Grades are subjective by nature, influenced by all sorts of circumstances outside of the control of both student and instructor, not least the quality of other students’ work.
All of that said, insofar as I do inhabit the role of evaluator, I want to make sure that my grading is as fair as possible. Knowing that grades are indeed used to judge, reward, and punish students should make us feel duty-bound to grade them evenhandedly. To that end, I’ve often wondered how much my own unnoticed biases affect my grading. Are there steps I should be taking to make sure I’m not unconsciously favoring one kind of student over another?
In particular, I’ve long suspected that I should be grading student papers blind — with numbers replacing names — mimicking the process of peer review for journal articles. I thought I’d take a look at the research on the subject to see if such a practice is worth pursuing.
Should we all be grading blind?
The first article I looked at turned out to be the most conclusive. John Malouff, Ashley Emmerton, and Nicola Schutte — all researchers from Australia’s University of New England — set out to examine “the halo effect”: the hypothetical boost given to students who have performed well in the past. If a student’s first essay garners an A, the effect predicts, we are more likely to give a higher grade to that student’s subsequent work. In their 2013 article in Teaching of Psychology, the authors report on a study of 126 instructors who graded both a videotaped oral presentation and a short essay from a psychology student.
Half of the instructors were shown a poor presentation (one for which the student had very little time to prepare); the other half were shown a better presentation (produced after the student was given time and coaching). All of the study participants graded the same written work. The results were significant: Those who had seen the better oral presentation before marking the essay gave it a grade that was four points higher, on average, than did those who first saw the poor presentation. It makes a certain amount of sense: We categorize students, unconsciously or not, and generally expect them to perform at an established level.
However, another article by Malouff, published in 2008 in College Teaching, notes that “studies of grading bias in college and university instructors are almost nonexistent because of the difficulty of garnering actual instructors and testing them for bias without them seeing the point of the study, which could affect the way they grade.”
And indeed, the other research I found, after reading the 2013 halo effect study, was much more ambivalent about grading bias. Many studies seem to suffer from the sort of experimental design flaw noted by Malouff, while others returned mixed or insignificant results. A 2015 study by Phil Birch, John Batten, and Jo Batey — lecturers in sport and exercise science at the Universities of Chichester and Winchester — set out to examine the influence of student gender on grading but found no difference between papers ostensibly written by male or female students. Likewise, a 2013 study carried out at a Dutch university by researchers Jan Feld, Nicolás Salamanca, and Daniel S. Hamermesh also found no bias along gender lines, and showed that instructors favored students of their own nationality without discriminating against those of other nationalities.
The more time I spent with this research, the less grading bias seemed like a problem we need to worry about. Perhaps the coming years will bring us more research like the halo effect study — research that shows there are biases that no amount of trying to be fair can overcome. But as it is, with the literature decidedly mixed, I personally am not ready to give up the pedagogical benefits of knowing who has written the essays I’m grading.
If I were to grade blind, I wouldn’t be able to chart a student’s progress throughout the term, from one assignment to another, nor would I be able to tailor my grading to the specific skills each student is working on. What’s more, I would need to abandon my practice of giving in-class feedback on students’ ideas for assignments. For instance, I would spoil students’ anonymity the moment I reviewed their thesis statements in advance. And students could no longer come to my office hours to discuss their papers; once they revealed to me the nature of their problem, my attempts to grade blind would be dashed.
For me, at least, grading is as much a tool for pedagogy as it is a tool of assessment. Each assignment is an opportunity for student learning, and our personalized feedback is a crucial part of that opportunity. We should strive to grade as fairly and as objectively as possible. But to make sure we do that wisely as well, we need to keep our eyes open. Otherwise, we might lose sight of the fact that we are still teachers when we grade, and the authors of the papers are still our students.