Citations in academic papers are intended to ground research in the work that preceded it, over time creating something of a family tree explaining the roots of ideas, protocols, and studies.
But a growing number of these citations lead to dead ends. “Fabricated” citations that do not reference real papers are spreading in the literature, polluting the public record of science, a new study published Thursday in the Lancet shows. Tools using generative AI are likely to blame, say the Columbia University researchers who authored the paper.
“One question that all of us have is: ‘Is AI making science more efficient, helping us do better work, or even the same work, but faster, or is it just creating slop?’ We have all of these papers coming out trying to track the use of [large language models] in science, but none of them really tell us anything about the quality. This is one of the first papers that’s telling us something about the quality of what’s being produced with LLMs, and it’s a signal of slop,” said Misha Teplitskiy, a sociologist of science at the University of Michigan who has studied citation practices and AI use among academics but was not involved with the paper.
It has become a common experience for academics to be alerted to new work citing their research, only to realize the new paper cites a publication of theirs that does not exist. Last year, a widely publicized “MAHA report” issued by the White House on its priorities in tackling chronic disease contained several incorrect citations that onlookers suspected were generated with artificial intelligence.
The new analysis itself was spurred by an unfortunate experience, said Maxim Topaz, a nurse and health AI researcher at Columbia who led the work. He had used an AI chatbot to help edit an editorial he was submitting to a journal. Even though he checked that the citations were accurate, after several rounds of edits an editor at the journal flagged an erroneous one.
“I was deeply embarrassed: I checked for that, and it still almost happened to me. This is how I ended up thinking about other people,” Topaz said. For the new analysis, he sifted through over 2 million papers and 97 million citations using AI tools. He identified around 4,000 fabricated citations among 2,800 papers. That number is quite low, Topaz and others said, but more worrisome is the fact that hallucinated citations are becoming more common. In 2023, 1 in 2,828 papers contained one or more fabricated references, but in 2025 that number had reached 1 in 458 — a sixfold increase in frequency. During the first seven weeks of 2026, the rate reached 1 in 277 papers.
The proliferation of fabricated citations on its own could undermine systematic reviews or clinical guidelines that attempt to build on published studies. But it also points to potentially deeper problems with the papers that contain them.
The presence of hallucinated citations “shows that there’s people who don’t even want to spend half an hour to check the references of a paper. It shows how fast they want to get published and how desperate they are to get published, which is a reflection of a flawed scholarly evaluation model that puts so much emphasis on peer reviewed publications,” said Mohammad Hosseini, a professor at Northwestern University who studies research integrity and the ethics of AI use. It also reflects a change in the culture of citations, from a genuine signal of a researcher engaging with a paper to a box-checking exercise.
“The more serious issue here is that citation practices are changing with the generative AI use. Previously you would read a paper and then make notes on it. Once you were writing a paper, you would be like, ‘Is that paper or book relevant? Is that paper not?’ It was a much more reflective process, whereas now people simply use their hunches to prompt ChatGPT or other AI tools, and then they have a bunch of citations that they can sprinkle over their papers, and that is not a healthy practice,” Hosseini said. “That means that the engagement with the literature is becoming increasingly more superficial, and that is neither good for the researcher, nor for society, nor for our publication practices.”
Fabricated citations are not spread evenly across journals. In a dashboard published alongside the paper, the researchers report that more than a third of fabricated citations come from two publishers. Topaz declined to name those publishers, but said they are large open-access publishers, which generate revenue by charging authors high prices to publish papers without a paywall.
STAT reached out to several prominent publishers of biomedical research to see how they have dealt with fabricated citations. The Science family of journals uses an automated tool to check references, and a spokesperson said they have not yet had an issue with a fabricated citation in a published paper. Spokespeople for the New England Journal of Medicine and JAMA similarly responded that their journals have tools to validate citations, and noted that authors publishing in those journals agree to be held responsible for their work. The Public Library of Science said it has seen “numerous” unverifiable references in submissions.
“We are looking to incorporate this into our publishing workflows and have been exploring offerings in this space. In piloting options thus far, we have seen a lot of false positives in which legitimate references are flagged due to erroneous, incomplete or inaccurate information in reference lists, formatting issues, the source material’s publication language, or limitations of commonly used research literature databases,” said Renee Hoch, head of publication ethics at PLOS.
