AI Hallucinations Slip Into Elite Research Papers
AI hallucinations and fake citations are no longer theoretical problems—they’ve infiltrated the most prestigious research conference in artificial intelligence. A recent investigation uncovered over 100 fabricated references in peer-reviewed papers at NeurIPS 2025, exposing critical gaps in our academic safeguards and raising urgent questions about research integrity in the AI era.
What Happened at NeurIPS 2025
GPTZero, a Canadian AI detection startup, analyzed more than 4,000 research papers accepted at NeurIPS (Neural Information Processing Systems) 2025 and discovered a troubling pattern. At least 53 papers contained AI-generated hallucinated citations that had passed through multiple rounds of peer review (ℹ️ Fortune).
The hallucinations ranged from completely fabricated authors and paper titles to subtle manipulations of real citations—adding nonexistent coauthors, paraphrasing titles just enough to make them unverifiable, or creating believable-sounding chimeras combining elements from multiple real papers.
Edward Tian, GPTZero’s CEO, emphasized the significance: these weren’t just submissions under review. These papers survived NeurIPS’s 24.52% acceptance rate, beating out some 15,000 competing submissions, and were published in the final conference proceedings (ℹ️ Fortune).
How AI Creates Convincing Fakes
AI hallucinations and fake citations emerge from how large language models generate text: the models predict plausible-sounding word sequences from patterns in their training data rather than retrieving verified records, so a reference can look perfectly formatted yet point to nothing. According to GPTZero’s investigation methodology, their tool searches academic databases and the open web to verify each citation element: authors, title, publication venue, and links.
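As a rough illustration of this kind of check, here is a minimal sketch that looks up a cited title on Crossref’s public REST API and compares author surnames. This is an assumption about the general approach, not GPTZero’s actual pipeline, which is proprietary; the function name and example citation are illustrative.

```python
import requests

def check_citation(cited_title: str, cited_surnames: set[str]) -> None:
    """Look up a cited title on Crossref and compare author surnames.

    A toy first pass: the top search hit may not be the cited work,
    so any mismatch is a prompt for human review, not proof of fraud.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": cited_title, "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        print(f"NOT FOUND: {cited_title!r} -- possible fabrication")
        return
    record = items[0]
    found_title = (record.get("title") or ["<untitled>"])[0]
    found_surnames = {a.get("family", "") for a in record.get("author", [])}
    extra = cited_surnames - found_surnames
    if extra:
        print(f"AUTHOR MISMATCH on {found_title!r}: {sorted(extra)} not on record")
    else:
        print(f"OK: {found_title!r} in {(record.get('container-title') or ['?'])[0]}")

# A real paper cited with one invented extra coauthor ("Doe"):
check_citation(
    "Deep Residual Learning for Image Recognition",
    {"He", "Zhang", "Ren", "Sun", "Doe"},
)
```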
The fabrications fell into distinct categories:
- Completely invented citations with fake authors like “John Smith and Jane Doe”
- Real papers with added fictional coauthors
- Accurate titles assigned to wrong journals with fabricated DOIs
- Subtle modifications expanding author initials into guessed first names
Alex Cui, GPTZero’s CTO, noted that sometimes AI adds five nonexistent authors to a real paper—mistakes no human would reasonably make (ℹ️ Fortune).
Why This Threatens Research Integrity
Citations serve as the foundation of scientific knowledge—they allow researchers to trace claims back to original sources, verify reproducibility, and build cumulative understanding. When citations point nowhere, the entire knowledge chain breaks.
In 2025, the main NeurIPS research track received 21,575 valid submissions, making deep scrutiny of every reference increasingly difficult (ℹ️ Fortune). The volume creates perfect conditions for fabricated citations to slip through.
The problem cascades: other researchers citing these 53 papers may propagate the hallucinations into their own work, creating chains of misinformation that undermine scientific validity.
Protecting Your Research From Hallucinations
If we value research integrity, we must adopt verification practices that match the AI tools we use. Here are essential steps:
Verify every citation manually. Even if an AI tool generated a reference that looks legitimate, check that the paper exists, the authors are correct, and the publication venue matches.
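A quick programmatic first pass can complement manual checking. Here is a minimal sketch, assuming the citation carries a DOI, that asks the public doi.org resolver whether the DOI exists at all; the example DOIs are illustrative.

```python
import requests

def doi_resolves(doi: str) -> bool:
    """Return True if doi.org recognizes the DOI.

    Only a coarse filter: a fabricated DOI usually returns 404, but a
    real DOI can still be attached to the wrong paper, so the resolved
    metadata still needs a human check.
    """
    resp = requests.head(
        f"https://doi.org/{doi}", allow_redirects=False, timeout=10
    )
    return resp.status_code != 404

print(doi_resolves("10.1109/CVPR.2016.90"))  # real DOI -> True
print(doi_resolves("10.9999/made.up.2025"))  # fabricated -> likely False
```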
Use citation verification tools. GPTZero’s hallucination checker and similar tools can automatically flag suspicious references. The company reports accuracy above 99%, with human verification confirming flagged citations.
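Such tools also have to catch the subtler “paraphrased title” pattern described earlier. A toy sketch of that idea, using only Python’s standard difflib, with thresholds that are purely illustrative:

```python
from difflib import SequenceMatcher

def title_similarity(cited: str, found: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 is an exact match."""
    return SequenceMatcher(None, cited.lower(), found.lower()).ratio()

# A paraphrased citation versus the actual record (illustrative strings):
cited = "Deep Residual Networks for Recognizing Images"
found = "Deep Residual Learning for Image Recognition"
score = title_similarity(cited, found)

# High-but-not-exact similarity suggests a paraphrased title that a
# human should verify against the real record.
if 0.60 <= score < 0.95:
    print(f"SUSPICIOUS: similarity {score:.2f} -- verify this title by hand")
```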
Budget verification time. For a paper with 50 citations, allocate 25 minutes specifically for citation checking (about 30 seconds per reference), approximately 10-15% of your writing time.
Question convenience. If generating citations feels too easy, that’s your signal to verify more carefully. AI speeds up the process but cannot replace human judgment about source accuracy.
Understand tool limitations. Large language models are pattern-matching systems, not fact-checking systems. They generate plausible-sounding text based on training data, not verified truth.
What the Scientific Community Must Do
The response from the NeurIPS board revealed a concerning level of complacency: the board stated that even if 1.1% of papers contain incorrect references due to LLMs, the content itself may not be invalid. This stance treats citations as mere details rather than fundamental components of academic trust.
Conferences need to build systematic citation checks into peer review. ICLR 2026 has already hired GPTZero to screen submissions for fabricated citations, setting a precedent other conferences should follow.
Academic institutions need clear policies on AI use in research writing, emphasizing that AI output requires the same verification as any other draft content.
Moving Forward Safely
AI hallucinations and fake citations represent a systemic challenge requiring cultural change, not just technical solutions. We must treat AI as a drafting tool that always requires human verification—never as a source of final truth.
The NeurIPS incident serves as a warning: convenience cannot justify breaking the knowledge chain that makes scientific research trustworthy. By budgeting verification time, using detection tools, and maintaining healthy skepticism toward AI-generated content, researchers can harness AI’s benefits while protecting research integrity.
The question isn’t whether to use AI in research—it’s whether we’ll build the verification practices necessary to use it responsibly.
About the Author
Nadia Chen is a specialist in AI ethics and digital safety. She focuses on helping non-technical users understand and navigate AI tools safely, with particular emphasis on protecting privacy and maintaining best practices in AI-assisted work.

