January 14, 2024:
Plagiarism accusations are being wielded like weapons right now — and the multi-headed plagiarism controversy involving Claudine Gay, Bill Ackman and his wife, and Business Insider is a particularly bizarre one.
It began with Gay, who stepped down from her position as Harvard’s president, ostensibly because critics found instances of (real) plagiarism in her work, but really because people didn’t like her congressional testimony on antisemitism at Harvard. Shortly thereafter, Business Insider published accusations of plagiarism against designer and former MIT professor Neri Oxman. Oxman is married to Bill Ackman, a major Harvard donor who vocally participated in a public campaign led by right-wing activists against Gay. Ackman, in response, announced that he would be launching his own plagiarism investigation into every person currently serving on MIT’s faculty, administration, and board.
Very few people involved in the mudslinging seem to cherish longstanding commitments to academic integrity, but they are more than willing to act as though they care about plagiarism a lot — or, alternatively, that plagiarism is no big deal — when it serves their political purposes.
As this latest battle of our never-ending culture wars rages, it’s worth taking a step back and looking at some basic principles. Why is plagiarism a big deal? What does it mean to argue about it?
What even is plagiarism, anyway?
We’ll start with a basic working definition.
“Plagiarism is the use of someone else’s words or ideas without giving them credit,” says Susan Blum, an anthropology professor at Notre Dame and the author of My Word! Plagiarism and College Culture. “But when you actually operationalize, that’s where this slipperiness comes in.”
Most people agree that it’s straightforwardly plagiarism to copy and paste someone else’s work wholesale and slap your own name on it. Most people also agree that it’s plagiarism to copy someone else’s sentences or phrases, whether we’re talking about a middle school essay, a doctoral dissertation, or a newspaper article.
But what happens if those phrases are clichés? What if they’re definitions? What if they’re widely accepted facts phrased in commonly used language? What if we’re not even talking about words but about a specific chord progression or a bit of software coding? It gets tricky fast.
“We all think we are talking about the same thing when we say the word, ‘plagiarism,’ but that isn’t necessarily the case,” writes Sarah Eaton in a blog post. Eaton is an education professor at the University of Calgary who studies academic ethics. “From my research, I can say with certainty that there is no singular or universally accepted definition of plagiarism.”
One of the biggest variations we see in how people talk about plagiarism comes from the different conventions in different disciplines within academia. Blum says that after she published My Word! in 2009, academics in quantitative fields like engineering would tell her that it was common in their areas for people to plagiarize large chunks of their literature reviews. In these disciplines, what counted was the originality of your own research, not the originality of your summary of other people’s research.
Blum found this shocking. If a substantial part of someone’s work is expository, she says, “I would expect them — especially a professor — to follow the professional forms of citation.”
The distinction Blum’s engineer is making between plagiarizing your literature review, which he says doesn’t matter, and plagiarizing your research, which he says does matter, echoes a larger distinction between how academics think about plagiarism and how many others, including journalists, think about plagiarism.
In journalism, it’s common for outlets to report on the same story, and they don’t always credit the outlet that broke it in the first place. “You can’t claim to own the news,” says Rod Hicks, the director of ethics and diversity at the Society of Professional Journalists.
Hicks argues that, for a journalist, it’s hard to prove a plagiarism claim that doesn’t involve someone using your language verbatim. For an academic, on the other hand, plagiarism claims are most serious when they involve stealing other people’s research and ideas. For what it’s worth, that’s not what either Gay or Oxman has been accused of. Everyone agrees their ideas and research were original; it’s their words that weren’t.
Meanwhile, there’s also a widespread understanding that if you do enough nonfiction writing, you’ll end up with some sort of error of attribution somewhere in your work. Ackman, who called plagiarism “very serious” when talking about the charges against Gay, seemed to change his mind after his wife was accused of similar plagiarism.
“It is a near certainty that authors will miss some quotation marks and fail to properly cite or provide attribution for another author on at least a modest percentage of the pages of their papers,” Ackman posted on X. “The plagiarism of today can be best understood by comparison to spelling mistakes prior to the advent of spellcheck.” (In Ackman’s analogy, the new spellchecks are the AI filters that can read for plagiarism.)
“I worked as a proofreader for a long time, and I have never seen something published without errors,” says Blum. “There’s almost always some kind of error, especially in the bibliography. If you’re going to reduce all of professional writing ethics to something mechanical like this, you are bound to turn up a lot of instances of error.”
The fact that a certain number of errors are unavoidable does not mean that all academics accept the level of plagiarism Gay committed as normal. In an article for the Atlantic, Ian Bogost ran his own dissertation through iThenticate, one of the new AI plagiarism filters. The filter at first told Bogost that 74 percent of his dissertation was copied, but after Bogost went through each match in his similarity score, he found that most of them came from iThenticate comparing his dissertation to a book he wrote based on it. Once Bogost had eliminated the bogus matches, his similarity score went down to zero.
“Does this imply that Gay’s record is unusual among professors? Not in and of itself,” Bogost wrote. “But it does at least refute the case that this was nothing more than academic jaywalking, or, in its purest straw-man form, that everybody does it.”
Bogost is gesturing at one of the arguments that emerged on the left after Gay was accused of plagiarism: an argument over whether what Gay did was incredibly common and hence no big deal, or whether it was straightforward plagiarism that should be taken very seriously.
The split went all the way down to the sources from whom Gay copied. One of them, Gay’s old lab mate D. Stephen Voss, compared Gay’s infraction to “driving fifty-seven miles per hour on a fifty-five-mile-per-hour highway”: technically against the rules, but nothing so egregious that it deserves outsized punishment. Meanwhile, Carol Swain, whose work was also copied by Gay, publicly called for Gay to be fired and announced she was considering her legal options. “I don’t know what to make of the scores of black and white professors who have either redefined plagiarism or stated that Gay’s misappropriation of their work is fine and dandy with them,” Swain posted on X.
The debate here speaks to the murky way that the accusations against Gay emerged. Gay certainly copied from other people. But Christopher Rufo, the conservative activist who brought the accusations to light, is the same guy who stirred up the crusade against critical race theory, and he openly did so as part of a larger conservative battle against elite colleges. Under those circumstances, for the left to join the calls for Gay to step down could feel like playing into the hands of the right. On the other hand — well, she does seem to have plagiarized, whether you consider this case to be a technicality or not. So how do you handle that?
If history is our guide, the academy should respond in earnest. Blum points to the case of historian Doris Kearns Goodwin, who in 2002 was ousted from the Pulitzer board and from her position as a regular guest on PBS NewsHour over a plagiarism scandal. Goodwin blamed the problem on her habit of transcribing quotes out longhand from other sources and then getting confused when she assembled her notes into a book.
“She was found guilty of forgetting the quotation marks around quotations,” says Blum. “Because she was not following proper citation guidelines, she was punished. I mean, she’s rehabilitated, it’s not fatal. But it was tangible.”
It seems almost accidental that Rufo and his right-wing allies went with plagiarism as their weapon of choice.
“Any activist campaign has three points of leverage: reputational, financial and political,” Rufo explained in a Wall Street Journal op-ed. “For some institutions, one point of leverage is enough, but, for a powerful one such as Harvard, the ‘squeeze’ must work across multiple angles.” The plagiarism accusations were just leverage that happened to be particularly easy to acquire.
Plagiarism accusations are easier to come by now because of the rise of AI plagiarism detectors, which make it easy to comb through decades’ worth of text and compare it to a vast library of existing work. Ironically, those detectors themselves were built by what might be considered plagiarism. (“As far as I can tell, [AI is] just stealing,” Fran Lebowitz told Vox in October.)
We know that OpenAI’s ChatGPT was trained on a vast corpus, one that apparently includes pirated texts. Multiple high-profile authors have now sued OpenAI for copyright infringement, including Jonathan Franzen and George R.R. Martin. In December, the New York Times sued OpenAI as well, arguing that ChatGPT is responsible for the “unlawful copying and use of The Times’s uniquely valuable works.”
This argument has persisted for a long time. In 2007, a group of students sued the early plagiarism detector Turnitin, alleging that it was plagiarizing their work. Turnitin, after all, works by archiving every student paper that’s uploaded to run through its filter, and then it charges schools for the use of that archive. The students argued — unsuccessfully — that Turnitin was making money from their intellectual property without their permission.
Blum says that every era has its own panic about how innovations are endangering intellectual property. “When I first started looking into plagiarism, there was a lot of stuff about how students didn’t have to go to the library anymore and copy things by hand. You could just scrape it off the internet and insert it,” she recalls. “There was a lot of discomfort about this new technology.”
Word processing and Google, a lethal combination, made language infinitely copyable and plagiarism incredibly easy to do, both intentionally and accidentally. Academia had to alter the way it thought about plagiarism to keep pace with the new tools. It developed new tools of its own, like Turnitin, and started spending more time on classroom conversations about how serious plagiarism is.
Today, one of the great innovations of AI’s large language models like ChatGPT is that they have made text into something not just copyable but synthesizable. The technology of the moment is manipulating texts in ways with which our current ethical frameworks are not built to reckon.
We don’t have precedents to tell us how to think about whether or not it is plagiarism to take every book ever written and use it to teach a neural network how to talk. We don’t have blueprints for dealing with what it means for someone to be able to go through your entire life’s work with a fine-tooth comb in a matter of days.
Our systems aren’t set up to deal with these problems, but these problems are also not going to go away. Our new tools are available to both good-faith and bad-faith actors, and that means we are at the beginning of a very messy new era indeed.