Last summer, I covered the saga of Harvard Business School’s Francesca Gino, who was credibly accused of flagrantly fabricating data in at least four of her published studies. She was caught when some data sleuths on the internet — investigating research misconduct in their free time — found discrepancies in the data for her papers and investigated further.
They eventually raised their concerns with Harvard, which investigated and ultimately requested retractions of the papers in question. (Gino filed a lawsuit against Harvard and the bloggers, accusing them of colluding to defame her.)
I kept thinking about Gino’s case as I read the uncannily similar story of a scandal at the Harvard-affiliated Dana-Farber Cancer Institute, a leading cancer research hospital in Boston.
Dana-Farber was rocked this January by a blog post by Sholto David, a molecular biologist and internet data sleuth, in which he presented evidence of widespread data manipulation in cancer research published by leading researchers including the institute’s CEO and COO. David reportedly contacted the institute with concerns about 57 papers; for 38 of those, the institute had “primary responsibility for the potential data errors.” Of those 38, the institute has requested retractions for 6 and initiated corrections for 31.
These data manipulations, to be clear, were not subtle. (David’s fairly bombastic blog post announcing the evidence calls it “pathetically amateurish and excessive.”) Many of the cases he identifies involved reusing the same images over and over in different figures, with different labels, and with the figures having been clumsily rotated or stretched in Photoshop or a similar image editor. Plots of data collection on different days are mysteriously perfectly identical. Test results are visibly copied and pasted.
This raises some questions: Assuming that there was some misconduct behind the copied-and-pasted images, how were people so emboldened to commit such blatant fraud, so publicly, for such a long time? How much grant money was secured on the basis of fabricated data, and how much was the crucial fight against cancer set back by inaccuracies promulgated in these papers?
And perhaps most importantly, is this only the tip of the iceberg?
Anatomy of a cancer data scandal
For years, biomedical researchers have been aware that the field has a problem with faked images in papers. In one 2016 paper, Dutch microbiologist Elisabeth Bik scanned more than 20,000 biomedical papers for evidence of such manipulation and found that 3.8 percent of papers had signs of it, “with at least half exhibiting features suggestive of deliberate manipulation.” Worse, the problem appears to be on the rise. “The prevalence of papers with problematic images has risen markedly during the past decade,” Bik found.
Her scale distinguishes three kinds of problematic images: cases where the same image is used twice with different labels (which could be an innocent error), cases where the same image is used twice but one copy has been deliberately cropped (which seems less likely to be an innocent error), and cases where an image has something else pasted over it (which seems very unlikely to be an innocent error).
So biomedical scientists were already well aware that the field had a problem. Some of the specific manipulations highlighted in David’s blog post were well known among scientists, having been the subject of intense debate on the paper discussion forum PubPeer. But while the concerns were well known, it appears that it took David’s post to prompt retractions and an internal investigation.
Mistakes have consequences
It is troubling that cases like Gino’s and Dana-Farber’s required external data sleuthing to come to light. Being a data sleuth is deeply unrewarding, and even risky. David is currently unemployed and doing the work of flagging data manipulation in his free time between gigs, as he told the Guardian.
Many data sleuths have been threatened with lawsuits for exposing data fraud. “A lot of important science gets done not by big institutions questioning things but by independent people like this,” defamation lawyer Ken White told me last summer. The problem is that there’s no institutional process to review papers unless someone else brings problems to light — and most scientists don’t want to endanger their own careers to do that thankless, frustrating work.
It’s also troubling that the fakery was so blatant. We’re not talking about sophisticated data manipulation here — we are talking about cases where scientists badly photoshopped pictures of their experimental results. “We only see the tiny tip of the fraud iceberg — image data duplications, the last resort of a failed scientist after every other trick failed to provide the desired result,” David wrote in his original blog post. In a culture where photoshopping experimental results happens frequently, it’s unlikely to be the only form of manipulation.
There is another common thread between the Gino fiasco and the Dana-Farber one: Harvard University. Between Gino’s case, the resignation of Harvard president Claudine Gay, and now the allegedly faked cancer research, Harvard’s reputation for academic excellence has undoubtedly taken a battering.
But the discovery of these problems at America’s best-known prestige university has also served to bring public attention to an issue that badly needs it. Maybe Harvard’s embarrassment will spark change.
A version of this story originally appeared in the Future Perfect newsletter.