Facing an urgent demand for access to the latest coronavirus research, scientists are speeding up the process of publicizing their findings, at the risk of enabling the spread of scientific misinformation.

In recent months, researchers studying the virus have increasingly opted to post their papers as soon as they are written, ahead of peer review and journal publication, to online repositories of academic articles known as “preprint servers.”

In one notorious instance, a January study by a group of scientists at the Indian Institute of Technology Delhi fueled conspiracy theories claiming that the virus is a laboratory-manufactured bioweapon. The paper purported to identify an “uncanny similarity” between the new coronavirus and HIV, the virus that causes AIDS.

Derek Lowe, author of Science magazine’s In the Pipeline blog on drug discovery, told me via email that, upon scrutiny, the paper “very quickly was found to be not able to justify its conclusions.”

“A decent journal would never have let it through,” he added.

The paper was retracted by its authors just two days after its release, but the initial report remains online and continues to fuel misinformation about the origins of the virus. Luc Montagnier, a past recipient of the Nobel Prize in Physiology or Medicine, has cited it repeatedly to support a conspiracy theory that the pathogen was manufactured by Chinese scientists developing an HIV vaccine.

The urgent need for research on the coronavirus is reflected by the sheer volume of new papers published since the outbreak of the pandemic. A combined 2,000 papers have been released to date on the two main preprint servers for biomedical sciences, bioRxiv — pronounced bio-archive — and medRxiv.

Yet while the servers have been an important resource in meeting the needs of scientists, policymakers and the public for information on the virus, their existence highlights a controversy in the academic world which predates the current pandemic.

The preprint wars

The practice of online preprint publishing sparked little debate when it first emerged some three decades ago in academic disciplines in which unvetted research presents less of a risk to public health than biology and medicine.

The first preprint server, arXiv, was created in 1991 by Paul Ginsparg, an American physicist, so that the relatively small community of physics and mathematics researchers with access to the early academic internet could share their work with one another. In 1994, a preprint server called the Social Science Research Network (SSRN) was established to offer a similar service for the social sciences and humanities.

But the biomedical sciences largely resisted the shift to preprint until the establishment in 2013 of bioRxiv, hosted at the Cold Spring Harbor Laboratory, a private research institution in New York. Even after bioRxiv was launched, biologists were initially slow to publish preprint papers, but after a few prominent scientists led the way, inspiring headlines like “Handful of Biologists Went Rogue and Published Directly to Internet,” the practice was gradually normalized.

In 2017, bioRxiv received an injection of funding from the Chan Zuckerberg Initiative, Facebook founder Mark Zuckerberg’s philanthropic organization, to pay for an expansion that reportedly involved the development of new open-source software tools and the hiring of two or three full-time staff. In 2019, the organization behind bioRxiv set up medRxiv to provide a separate space specifically for medical research.

What followed was dramatically described by one paper as “The Preprint Wars.” Between 2013 and 2018, by one count, at least 18 new preprint servers were set up across various scientific disciplines. For preprint servers to gain acceptance, major journals had to lift a longstanding rule that research which had already been publicized could not be published. That change was the subject of heated debate within the scientific community.

Those driving the argument in favor of preprint servers reflected a wide cultural shift in the sciences known as the “open access” movement, which seeks to eliminate barriers to the accessibility of scholarly information online.

But others were vocal about their concerns that, even if it was helpful in facilitating communication between researchers, open access to unpublished articles would lead to poor reporting with negative consequences for public health.

One of the most prominent voices sounding a note of caution was the Science Media Centre (SMC), a UK organization that advises both scientists and journalists on how to provide the public with accurate information about science.

In 2018, Tom Sheldon, the SMC’s senior press manager, wrote an article in the journal Nature calling for a public reckoning in the scientific community about the risks of preprint.

“As someone who has worked for years with researchers and journalists to ensure responsible coverage of science in the media, I fear that this method of publication holds substantial risks for the broader community — risks that are not being given proper consideration by the champions of preprint. Weak work that hasn’t been reviewed could get overblown in the media. Conversely, better work could be ignored,” Sheldon wrote.

Fiona Fox, the head of the SMC, told me in an interview that she and her colleagues raised concerns because they were worried by the prospect of preprint studies on subjects such as vaccines and e-cigarettes reaching the public through journalistic coverage before peer review. “A lot of misleading stories that are threatening to public health obviously come from stories about medical research,” she said, citing as an example the disproven but persistent theory of a link between the MMR vaccine and autism.

Working with journals and universities, the SMC drew up a set of guidelines and asked scientists who published on preprint servers not to promote their findings via press releases until the paper was published by a journal. By and large, Fox said, universities followed these guidelines — until they didn’t.

“So, that was all lovely, that was all happy and that was all coming together nicely, and then Covid hits, and all the rules change,” she said.

Given the servers’ new position at the forefront of the scientific response to the pandemic, the SMC could no longer realistically advocate that preprints should not be publicized. In March, the organization began soliciting commentary from third-party experts on preprint studies in their field that were gaining significant attention in the media. Previously, this practice had been limited to peer-reviewed studies.

In some cases, the scientists providing comment expressed extreme reservations. One study conducted by Chinese scientists claimed to find evidence, based on patient records in Wuhan and Shenzhen, of a higher risk of infection associated with blood type A. In response, the SMC solicited comment from Sakthi Vaiyapuri, a professor of pharmacology at the University of Reading.

By the standards of staid scientific prose, Vaiyapuri’s dismissal was stinging. “There is little evidence to substantiate any claim that there is any more than a coincidental correlation between blood group ABO and susceptibility of contracting Covid-19. There are far too many parameters that cast doubt over the credibility of their claims,” he wrote.

And yet the study has received widespread media attention. According to analysis by Altmetric, a company that tracks the online impact of published research, it was picked up by 126 news outlets and 13 blogs and mentioned in tweets by 3,958 accounts. Some reports did not mention that the research was still in the preprint stage. One U.S. TV news report used the study in a fact-checking segment to rate as “true,” based on “the preliminary evidence,” the rumor that people with blood type A were more susceptible to the coronavirus.

One issue that shapes media coverage of preprints is that the journalists covering the coronavirus are not always science reporters. Fox told me that many of the people now reporting about preprint studies have been taken off their usual beats and “have no idea what peer review is and have no idea what a preprint is, and are having to cover this because there’s no other story in town.”

This plays into another problem posed by preprint servers: they are essentially dumps of information which require scientific expertise to adjudicate or contextualize. “Everything comes out as it’s received,” Lowe, the pharmaceutical blogger, told me. “There is no way to know what might be more interesting or important, and no way to find it other than by using keyword searches. It really puts people back on using their own judgement on everything at all times, and while that should always be a part of reading the literature, not everyone is able to do it well.”

Preprint servers do generally provide some moderation, commonly described as “sanity checks” — a minimal effort to verify the authors’ credentials and filter out plagiarism and obvious spam. If a paper fails to meet these standards, it is not published, but articles are not otherwise edited.

A flawed but useful tool

For all the potential problems associated with preprints, everyone I spoke to attested to their undeniable utility. Since “lives are on the line,” Maximilian Heimstädt, a researcher at the Berlin-based Weizenbaum Institute for the Networked Society, told me in an interview, “it’s very important that scientific results end up in the hands of decision-makers quickly — governments, hospitals and so on.” Heimstädt is the author of a recent article exploring the “unintended side effects of the current interest in preprints.”

In the absence of a formal review process, a kind of de facto real-time peer review has emerged in the comment sections of preprint studies, as well as in discussions on Twitter. These are precisely the places where large numbers of scientists gathered to discuss the flaws in the Indian study on similarities between the coronavirus and HIV before it was retracted. 

Critics of preprint must contend with the fact that even peer review is hardly a foolproof system. Elisabeth Bik, a Dutch microbiologist, was widely quoted in media reports last month after she drew attention to serious methodological problems, including potential violations of medical ethics, in a widely reported French study that showed positive results for hydroxychloroquine and that U.S. President Donald Trump has publicized on Twitter. One red flag Bik highlighted was that the paper had been published in a journal just 24 hours after its appearance as a preprint on medRxiv, indicating an alarmingly rushed peer review process.

Bik, who authors the influential blog Science Integrity Digest, said in an email that “there have been some papers published after incredibly short peer review time windows, and some of those contain errors or misinformation that should have been caught during peer review.”

“If you want to do science right, you need to think and carefully perform a study, which takes a very long time,” Bik added. “The general audience and policy makers, however, want results fast, and often jump to conclusions based on very thin evidence.”

While it remains to be seen whether preprint servers have struck the crucial balance between speed and accuracy during the pandemic, Fox said researchers might look back with optimism on the use of preprints once the virus passes: “We may well celebrate preprint and say, ‘Aren’t we lucky that we had made such progress in encouraging scientists to use preprint just in time for this?’ I think there will be a lot to celebrate there.”