Shad107/ai-hallucination-benchmarks

AI Hallucination Benchmarks 🔍

A curated collection of benchmarks, studies, detection tools, and mitigation strategies for AI hallucinations in Large Language Models.

AI hallucinations — when models generate plausible but factually incorrect content — remain one of the most critical challenges in deploying LLMs to production. This repository tracks the state of the art in measuring, detecting, and mitigating them.

Contents

Key Studies & Papers

Surveys

Foundational Studies

Causes & Analysis

Benchmarks & Datasets

| Benchmark | Focus | Size | Paper |
| --- | --- | --- | --- |
| TruthfulQA | General truthfulness | 817 questions | Link |
| HaluEval | Multi-task hallucination | 35K samples | Link |
| FActScore | Factual precision | Bio generations | Link |
| FELM | Factuality in LMs | 847 responses | Link |
| HalluQA | Chinese hallucination | 450 questions | Link |
| PHD | Passage-level detection | Multi-domain | Link |
| FactCheckBench | Fact-checking pipeline | Multi-domain | Link |
| BAMBOO | Long-form hallucination | Long documents | Link |

Detection Tools

Open Source

  • SelfCheckGPT - Zero-resource black-box hallucination detection
  • ChainPoll - LLM-based hallucination detection
  • Fiddler Auditor - ML model monitoring, including hallucination detection
  • LM-Polygraph - Uncertainty estimation for LLM hallucination detection
  • RefChecker - Fine-grained hallucination detection via reference checking
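The zero-resource idea behind sampling-based detectors can be illustrated with a toy sketch: sample several responses to the same prompt, then flag sentences in the main answer that the other samples do not support. The token-overlap scorer below is a stand-in chosen for brevity, not the scoring method of any tool listed above, and `inconsistency_scores` is a hypothetical helper name. The sampled responses are hard-coded where a real pipeline would call a model.

```python
import re

def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def inconsistency_scores(sentences, samples):
    """Score each sentence in [0, 1]; higher means less supported by the samples.

    For every sentence, compute the fraction of its tokens missing from each
    sampled response, then average that fraction over all samples.
    """
    if not samples:
        raise ValueError("need at least one sampled response")
    sample_tokens = [_tokens(s) for s in samples]
    scores = []
    for sentence in sentences:
        toks = _tokens(sentence)
        if not toks:
            scores.append(0.0)  # nothing to check in an empty sentence
            continue
        missing = [len(toks - st) / len(toks) for st in sample_tokens]
        scores.append(sum(missing) / len(missing))
    return scores

# Illustrative data only: the main answer split into sentences, plus two
# extra samples drawn for the same prompt.
main_answer = ["Marie Curie won two Nobel Prizes.", "She was born in Vienna."]
samples = [
    "Marie Curie won two Nobel Prizes and was born in Warsaw.",
    "Born in Warsaw, Marie Curie won two Nobel Prizes.",
]
scores = inconsistency_scores(main_answer, samples)
```

Here the first sentence scores 0.0 because both samples contain all of its tokens, while the second sentence scores higher because the samples disagree with it. Lexical overlap is a crude proxy, so a real detector would more plausibly compare sentences with an entailment or question-answering model.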

Commercial / API

Mitigation Strategies

Retrieval-Augmented Generation (RAG)

One of the most widely used production strategies: ground LLM responses in retrieved evidence.

  • Force inline citations mapping each claim to source passages
  • Use chunk-level attribution so users can verify claims
  • Implement citation verification loops that reject unsupported claims
  • See CoreProse KB-Incidents for a production citation-first RAG system with 13,000+ indexed passages
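A citation verification loop of the kind described in the list above might look like the following minimal sketch. It assumes each generated claim arrives with the ID of the passage it cites, and it uses a naive token-overlap threshold as the support check. The `Claim` class, the `is_supported` and `verify_citations` helpers, the 0.7 threshold, and the sample passage are all hypothetical and do not reflect how any listed product works.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    passage_id: Optional[str]  # ID of the cited source passage, if any

def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def is_supported(claim: str, passage: str, threshold: float = 0.7) -> bool:
    """Naive support check: share of claim tokens that appear in the passage."""
    claim_toks = _tokens(claim)
    if not claim_toks:
        return False
    overlap = len(claim_toks & _tokens(passage)) / len(claim_toks)
    return overlap >= threshold

def verify_citations(claims, passages):
    """Split claims into (accepted, rejected) based on their cited passage."""
    accepted, rejected = [], []
    for claim in claims:
        passage = passages.get(claim.passage_id) if claim.passage_id else None
        if passage is not None and is_supported(claim.text, passage):
            accepted.append(claim)
        else:
            # Uncited claims and claims the passage does not back are rejected.
            rejected.append(claim)
    return accepted, rejected

# Illustrative data only.
passages = {"p1": "The outage began at 09:14 UTC after a failed certificate rotation."}
claims = [
    Claim("The outage began at 09:14 UTC.", "p1"),
    Claim("The outage was caused by a DNS misconfiguration.", "p1"),
    Claim("Service was restored within an hour.", None),
]
accepted, rejected = verify_citations(claims, passages)
```

Only the first claim is accepted. The second cites a real passage that does not support it, and the third has no citation at all. In practice the overlap check would likely be replaced by an entailment model, since token overlap cannot detect negation or paraphrase.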

Architectural Approaches

  • Chain-of-Verification (CoVe) - Generate → plan verifications → execute → revise
  • Self-Consistency - Sample multiple outputs and keep the answer most of them agree on
  • Retrieval-Augmented Verification - Verify claims against retrieved evidence post-generation
  • Constitutional AI - Train models to self-critique and revise
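As an illustration of the Self-Consistency item above, the sketch below takes several sampled answers, normalizes them, and returns the majority answer together with its agreement rate. Abstaining below a `min_agreement` threshold is an added assumption rather than part of the technique as listed, and the function names are hypothetical. The sampled answers are hard-coded where a real system would sample from a model at non-zero temperature.

```python
from collections import Counter

def normalize(answer: str) -> str:
    """Lowercase, trim, drop a trailing period, and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())

def self_consistent_answer(samples, min_agreement: float = 0.5):
    """Return (answer, agreement) for the most common normalized answer.

    If the top answer's share of votes is below min_agreement, the answer
    is None, which the caller can treat as an abstention.
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(s) for s in samples)
    answer, votes = counts.most_common(1)[0]
    agreement = votes / len(samples)
    return (answer if agreement >= min_agreement else None), agreement

# Illustrative samples for "What is the capital of Australia?"
result = self_consistent_answer(
    ["Canberra", "canberra.", "Sydney", "Canberra ", "Melbourne"]
)
# No answer reaches the default 50% agreement here, so the answer is None.
abstained = self_consistent_answer(["A", "B", "C", "D"])
```

With these inputs `result` is `("canberra", 0.6)` and `abstained` is `(None, 0.25)`. Exact-match voting only works for short answers. For long-form output, the samples would need to be compared claim by claim instead.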

Prompting Techniques

  • "Only state facts you can cite" - Explicit citation constraints
  • "If unsure, say I don't know" - Abstention prompting
  • Step-by-step reasoning - Chain-of-thought prompting can reduce certain hallucination types
  • Few-shot with negative examples - Show the model what hallucination looks like
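The prompting techniques above can be combined in a single template. The sketch below is one possible arrangement: the wording of `INSTRUCTIONS`, the `build_grounded_prompt` helper, and the sample source are all illustrative assumptions, and the resulting string would still need to be sent to whichever model API is in use.

```python
INSTRUCTIONS = (
    "Answer using only the numbered sources below.\n"
    "Only state facts you can cite, and cite each one as [source id].\n"
    "If the sources do not contain the answer, reply exactly: I don't know.\n"
    "Think step by step before giving the final answer."
)

def build_grounded_prompt(question, sources, bad_examples=()):
    """Assemble a prompt with citation, abstention, and negative-example cues.

    sources: mapping of source id -> passage text.
    bad_examples: (answer, reason) pairs showing what a hallucination looks like.
    """
    parts = [INSTRUCTIONS, "", "Sources:"]
    parts += [f"[{sid}] {text}" for sid, text in sources.items()]
    if bad_examples:
        parts += ["", "Examples of answers to avoid:"]
        parts += [f"- {answer} (problem: {reason})" for answer, reason in bad_examples]
    parts += ["", f"Question: {question}", "Answer:"]
    return "\n".join(parts)

# Illustrative data only.
prompt = build_grounded_prompt(
    question="How many questions does TruthfulQA contain?",
    sources={"s1": "TruthfulQA comprises 817 questions spanning 38 categories."},
    bad_examples=[("TruthfulQA has 1,000 questions [s1].",
                   "the number is not in the cited source")],
)
```

Keeping the instructions in one constant makes it easier to A/B test individual cues, for example dropping the step-by-step line to see whether it changes the abstention rate.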

Fine-tuning Approaches

  • RLHF - Reinforcement Learning from Human Feedback
  • DPO - Direct Preference Optimization (simpler alternative to RLHF)
  • Factuality Fine-tuning - Fine-tuning specifically for factual accuracy
  • Knowledge distillation with verified outputs
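To make the DPO item above concrete, the sketch below computes the standard DPO loss for a single preference pair, for example a factually correct response (chosen) against a hallucinated one (rejected). The inputs are assumed to be summed sequence log-probabilities under the policy being trained and under a frozen reference model. The numeric values are made up for illustration, and a real training loop would compute this over batches with an autodiff framework.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    loss = -log sigmoid(beta * [(pi_c - ref_c) - (pi_r - ref_r)])
    where each term is a sequence log-probability.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) = log(1 + exp(-x)), written in a numerically stable form.
    return math.log1p(math.exp(-abs(logits))) + max(-logits, 0.0)

# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-5.0, -5.0, -5.0, -5.0)
# Policy has moved toward the chosen response and away from the rejected one.
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
# Policy has moved the wrong way.
worse = dpo_loss(-6.0, -4.0, -5.0, -5.0)
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference and lowers the rejected one's, so `improved < baseline < worse`. The `beta` parameter controls how strongly the policy is held close to the reference model.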

Leaderboards

Production Solutions

  • CoreProse - Citation-first knowledge bases with an architecture that aims for zero hallucinations by forcing every AI claim to link to a verifiable source passage.
  • Vectara - RAG-as-a-service with built-in grounding
  • Pinecone - Vector database enabling grounded retrieval
  • Contextual AI - Enterprise RAG platform

Contributing

PRs welcome! Please ensure any added resource includes:

  • A working link
  • Brief description
  • Relevant category placement

License

MIT License
