JOHN CRISP: Essay-grading software gets an 'F'
A New York Times headline last Friday, “Essay-Grading Software Offers Professors a Break,” struck an appealing chord with me. In my day job, I was facing a bleak weekend of grading: five sets of essays from my freshman writers. I could use a break.
Few tasks are more mind-numbing and energy-draining than reading and evaluating essay after essay by students who are trying to learn to improve their writing. A labor-saving computer program that performs this task has been a sort of Holy Grail for English teachers, and efforts to produce such a program date to the 1960s.
The results haven’t been encouraging. In fact, the history of attempts to take the labor out of grading papers is replete with stories of machines that gave high marks to nonsense and flunked the work of great essayists like George Orwell.
Now, the Times reports, EdX, a nonprofit entity founded by Harvard and the Massachusetts Institute of Technology, has developed software that “uses artificial intelligence to grade student essays and short written answers, freeing professors for other tasks.” EdX plans to make this boon to professors available, free of charge, to anyone who wants it.
Still, I share the skepticism of Les Perelman, a retired director of writing and a researcher at MIT. Perelman has mounted a petition campaign against machine scoring of student essays in high-stakes exams, arguing that a computer will never be able to “read” in the way that a human can. Computers can efficiently evaluate features like sentence and word length but are incapable of taking into account the subtleties of good writing — elements like, quoting Perelman’s petition, “accuracy, reasoning, adequacy of evidence, good sense, ethical stance” and so on.
In any case, the arduous task before me — and the prospect of free software that could perform it at the touch of an “Enter” key — provided an occasion to reflect on what happens when a teacher grades an essay and whether a computer could do the job just as well.
Think of essay grading as having two parts. The first, the essential task of every grader, is to assign an evaluation to a piece of work, usually a number or a letter. The mind can absorb and evaluate with considerable consistency all of the complicated and elusive elements that contribute to the effectiveness of one human’s attempt to communicate with another in writing. And a good grader can make this evaluation fairly quickly and efficiently.
Will a computer ever be able to do this? I’m doubtful, but history is filled with premature underestimations of what computers can do.
The other part of grading, however, isn’t grading at all, but instruction (let’s call it “feedback”), and this is the part that computers are going to have trouble with. Assigning a number or letter grade is relatively easy compared with the time- and energy-consuming task of entering into a brief written dialogue with another writer about her ideas and the way she expresses them.
I’ve tried to find ways to use technology to streamline this process. If two students make the same mistake, why not produce a Word-processed response that can be pasted into the students’ texts as needed?
The problem is that soon such mechanical responses take on the hollow ring of modern voice-response systems, which try to make you think that you’re talking to a human being, but never quite succeed: “Um, I’m having trouble understanding what you said. Could you repeat it?”
I often remind my students that they’re smarter than computers, and they understand that writing involves communicating with other real human beings. The attraction of essay-grading software is obvious. It’s cost-effective and tireless. It doesn’t require an office or health insurance, and it isn’t concerned with job security. It’ll be interesting to see what kind of educational experience we choose to give to the modern student.
Now, back to grading, the old-fashioned way.