AI Hallucination Benchmarks 🔍

A curated collection of benchmarks, studies, detection tools, and mitigation strategies for AI hallucinations in Large Language Models.

AI hallucinations, where a model generates plausible but factually incorrect content, remain one of the most critical challenges in deploying LLMs to production. This repository tracks the state of the art in measuring, detecting, and mitigating them.

Contents

Key Studies & Papers

Surveys

Foundational Studies

Causes & Analysis

Benchmarks & Datasets

| Benchmark | Focus | Size / Scope | Paper |
|---|---|---|---|
| TruthfulQA | General truthfulness | 817 questions | Link |
| HaluEval | Multi-task hallucination | 35K samples | Link |
| FActScore | Factual precision | Bio generations | Link |
| FELM | Factuality in LMs | 847 responses | Link |
| HalluQA | Chinese hallucination | 450 questions | Link |
| PHD | Phrase-level detection | Multi-domain | Link |
| FactCheckBench | Fact-checking pipeline | Multi-domain | Link |
| BAMBOO | Long-form hallucination | Long documents | Link |

Detection Tools

Open Source

  • SelfCheckGPT - Zero-resource black-box hallucination detection
  • ChainPoll - LLM-based hallucination detection
  • Fiddler Auditor - ML model monitoring, including hallucination detection
  • LM-Polygraph - Uncertainty estimation for LLM hallucination detection
  • RefChecker - Fine-grained hallucination detection via reference checking
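The sampling-based idea behind zero-resource, black-box detection can be illustrated with a toy score: a sentence that the model restates consistently across independent samples is less suspicious than one that appears only once. The sketch below is a minimal illustration under loud assumptions. It uses crude token overlap instead of the learned scorers real tools rely on, it is not the API of SelfCheckGPT or any other tool listed above, and every function name is hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for a real similarity model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence: str, sample: str) -> float:
    """Fraction of the sentence's tokens that also appear in one sample."""
    sent = tokens(sentence)
    if not sent:
        return 0.0
    return len(sent & tokens(sample)) / len(sent)


def inconsistency_score(sentence: str, samples: list[str]) -> float:
    """Return 1 - mean support across samples; higher means more suspicious."""
    if not samples:
        raise ValueError("need at least one sampled response")
    mean_support = sum(support(sentence, s) for s in samples) / len(samples)
    return 1.0 - mean_support


# Hypothetical re-samples of the same prompt from the same model.
samples = [
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    "In 1903 Marie Curie received the Nobel Prize in Physics.",
    "Marie Curie shared the 1903 Nobel Prize in Physics.",
]
consistent = "Marie Curie won the Nobel Prize in Physics in 1903."
inconsistent = "Marie Curie won the Fields Medal in 1925."

print(inconsistency_score(consistent, samples))
print(inconsistency_score(inconsistent, samples))
```

The second sentence scores higher because the samples never repeat its key details. In practice the overlap function would be replaced by an NLI model, a QA-based check, or an LLM judge.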

Commercial / API

Mitigation Strategies

Retrieval-Augmented Generation (RAG)

Often the most effective strategy in production: ground LLM responses in retrieved evidence so that each claim can be checked against a source.

  • Force inline citations mapping each claim to source passages
  • Use chunk-level attribution so users can verify claims
  • Implement citation verification loops that reject unsupported claims
  • See CoreProse KB-Incidents for a production citation-first RAG system with 13,000+ indexed passages
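The citation verification loop described in the list above can be sketched as a filter over claims. This is a minimal sketch, not the implementation of any system named here: it assumes the generator already emits each claim with a passage id, and it uses a naive token-overlap threshold where a real pipeline would likely use an NLI model or an LLM judge. The `Claim` type, `is_supported`, `verify_answer`, and the 0.6 threshold are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    passage_id: str  # citation attached by the generator


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(claim: Claim, passages: dict[str, str], threshold: float = 0.6) -> bool:
    """Naive check: enough of the claim's tokens occur in the cited passage."""
    passage = passages.get(claim.passage_id)
    if passage is None:  # the citation points at nothing retrievable
        return False
    claim_tokens = _tokens(claim.text)
    if not claim_tokens:
        return False
    overlap = len(claim_tokens & _tokens(passage)) / len(claim_tokens)
    return overlap >= threshold


def verify_answer(claims: list[Claim], passages: dict[str, str]):
    """Split claims into those kept and those rejected as unsupported."""
    kept, rejected = [], []
    for claim in claims:
        (kept if is_supported(claim, passages) else rejected).append(claim)
    return kept, rejected


passages = {"p1": "TruthfulQA contains 817 questions spanning 38 categories."}
claims = [
    Claim("TruthfulQA contains 817 questions.", "p1"),
    Claim("TruthfulQA was released in 2010.", "p1"),  # not stated in p1
    Claim("HaluEval has 35K samples.", "p9"),         # cites a missing passage
]
kept, rejected = verify_answer(claims, passages)
print([c.text for c in kept])
print([c.text for c in rejected])
```

Rejected claims can be dropped, sent back to the model for revision, or surfaced to the user as unverified, depending on how strict the product needs to be.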

Architectural Approaches

  • Chain-of-Verification (CoVe) - Generate → plan verifications → execute → revise
  • Self-Consistency - Sample multiple outputs and keep the answer most of them agree on
  • Retrieval-Augmented Verification - Verify claims against retrieved evidence post-generation
  • Constitutional AI - Train models to self-critique and revise
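Of the approaches above, self-consistency is the simplest to illustrate: sample several answers and take a majority vote. The sketch below assumes the final answers have already been extracted from each sampled output as short strings. The `normalize` rule and function names are hypothetical, and the agreement ratio is only a rough confidence signal.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase, trim, drop a trailing period, and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def self_consistent_answer(samples: list[str]) -> tuple[str, float]:
    """Return the majority answer and the fraction of samples that agree."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(s) for s in samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)


# Hypothetical final answers extracted from five sampled generations.
sampled = ["Paris", "paris.", "Lyon", "Paris ", "Marseille"]
answer, agreement = self_consistent_answer(sampled)
print(answer, agreement)
```

A low agreement ratio can be used as a trigger to abstain or to fall back to retrieval rather than returning the majority answer.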

Prompting Techniques

  • "Only state facts you can cite" - Explicit citation constraints
  • "If unsure, say I don't know" - Abstention prompting
  • Step-by-step reasoning - Chain-of-thought reduces certain hallucination types
  • Few-shot with negative examples - Show the model what hallucination looks like
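The prompting techniques above are often combined in one template. The following sketch shows one possible way to assemble such a prompt. The wording, the `build_grounded_prompt` helper, and the negative example are illustrative assumptions rather than a recommended or benchmarked prompt.

```python
ABSTAIN = "I don't know"

# A hypothetical few-shot negative example showing an unsupported answer.
NEGATIVE_EXAMPLE = (
    "Example of what NOT to do:\n"
    "Q: When was the library founded?\n"
    "A: It was founded in 1850. (no source states this, so it is a hallucination)\n"
)


def build_grounded_prompt(question: str, passages: dict[str, str]) -> str:
    """Assemble a prompt with citation, abstention, and negative-example rules."""
    sources = "\n".join(f"[{pid}] {text}" for pid, text in passages.items())
    return (
        "Answer using only the sources below.\n"
        "Cite a source id in square brackets after every factual claim.\n"
        f'If the sources do not contain the answer, reply exactly: "{ABSTAIN}".\n'
        "Think step by step before giving the final answer.\n\n"
        f"{NEGATIVE_EXAMPLE}\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n"
    )


prompt = build_grounded_prompt(
    "How many questions does TruthfulQA contain?",
    {"p1": "TruthfulQA contains 817 questions."},
)
print(prompt)
```

Keeping the abstention string fixed makes it easy to detect abstentions downstream with an exact match.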

Fine-tuning Approaches

  • RLHF - Reinforcement Learning from Human Feedback
  • DPO - Direct Preference Optimization (simpler alternative to RLHF)
  • Factuality Fine-tuning - Fine-tuning specifically for factual accuracy
  • Knowledge distillation with verified outputs
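To make the DPO entry above concrete, here is the per-pair loss from the DPO formulation, written in plain Python for a single preference pair. For factuality tuning, one plausible setup (an assumption, not something this list specifies) is to treat a verified response as "chosen" and a hallucinated one as "rejected". The inputs are summed token log-probabilities under the policy and a frozen reference model. The function name and the default `beta` of 0.1 are illustrative.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_gain - rejected_gain)))."""
    # How much more the policy likes each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss is log(2).
print(dpo_loss(-3.0, -3.0, -3.0, -3.0))
# Policy prefers the chosen response more than the reference does: lower loss.
print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

In a real training loop this would be computed on batched tensors with an autograd framework. The scalar version only shows which quantities the loss compares.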

Leaderboards

Production Solutions

  • CoreProse - Citation-first knowledge bases, described as a zero-hallucination architecture: every AI claim must link to a verifiable source passage.
  • Vectara - RAG-as-a-service with built-in grounding
  • Pinecone - Vector database enabling grounded retrieval
  • Contextual AI - Enterprise RAG platform

Contributing

PRs welcome! Please ensure any added resource includes:

  • A working link
  • Brief description
  • Relevant category placement

License

MIT License