
Software Aims To Replace Human Essay Graders

April 26, 2012

Software designed to automatically read and grade student essays may be able to do just as good a job as humans, according to a new study.

The study ran more than 16,000 essays from middle school and high school tests through automated systems developed by nine companies.

The essays came from six different states and had originally been graded by humans. The researchers set out to determine whether automated scoring would be comparable to human grading.

To establish an essay grade, the systems look for features such as sentence structure, syntax, word usage and subject-verb agreement.

According to the University of Akron, the computer scoring produced “virtually identical levels of accuracy,” and in some cases the computers proved more reliable than human graders.

David Williamson, a research director for E.T.S., told The New York Times that the automated reader developed by the Educational Testing Service, e-Rater, can grade 16,000 essays in 20 seconds.

However, Les Perelman, a director of writing at the Massachusetts Institute of Technology (MIT), said that fooling the e-Rater is not difficult.

He told the Times that the software is unable to identify truth, so students would be able to make up facts in their essays, as long as the sentence structure is accurate.

Perelman said the system also prefers long essays. He wrote a 716-word essay, riddled with nonsensical sentences, that received a top score of 6, while a factual essay of 567 words scored a 5.

“Once you understand e-Rater’s biases it’s not hard to raise your test score,” Perelman told the Times.

He found that the software did not like short paragraphs or short sentences, but it did favor essays that used big words.

He said that the substance of an argument does not matter to the software, as long as the essay looks like it is arguing its subject correctly.

E.T.S. officials said that Perelman’s test prep advice is too complex, and that students who can grasp it are using higher-level thinking anyway, so they deserve the higher grade.

They claim that Perelman is setting up a false premise when he treats e-Rater as if it were supposed to substitute for human scorers. They said that on high-stakes tests, essays are also scored by a human.

Peter Foltz, vice president of for-profit education company Pearson, said that 90 percent of the time, Pearson’s Intelligent Essay Assessor is used by classroom teachers as a learning aid. He told the Times that students use the software to improve their essays before submitting them to a teacher.


Source: RedOrbit Staff & Wire Reports


