
zchoi/Awesome-Weak-to-Strong-Generalization


📚 Awesome Weak-to-Strong Generalization


A curated collection of papers, resources, and insights on **Weak-to-Strong Generalization (W2SG)** in large models.

🧠 What is Weak-to-Strong Generalization?

Weak-to-Strong Generalization studies the phenomenon where:

A strong model trained on weak supervision can match or even surpass the performance of the weak source.
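One common way to quantify this effect is the performance gap recovered (PGR) metric used in the Weak-to-Strong Generalization paper listed in this README. The following is a minimal sketch, assuming three accuracies measured on the same held-out test set; the function and argument names are illustrative and not taken from any codebase.

```python
def performance_gap_recovered(weak_acc: float, w2s_acc: float, ceiling_acc: float) -> float:
    """Fraction of the weak-to-ceiling gap closed by weak-to-strong training.

    weak_acc:    accuracy of the weak supervisor
    w2s_acc:     accuracy of the strong model trained on the weak supervisor's labels
    ceiling_acc: accuracy of the strong model trained on ground-truth labels
    """
    gap = ceiling_acc - weak_acc
    if gap <= 0:
        raise ValueError("ceiling_acc must exceed weak_acc for PGR to be defined")
    return (w2s_acc - weak_acc) / gap


# Hypothetical numbers, for illustration only.
pgr = performance_gap_recovered(weak_acc=0.60, w2s_acc=0.75, ceiling_acc=0.80)
print(f"PGR = {pgr:.2f}")
```

A PGR of 0 means the strong model does no better than its weak supervisor, and a PGR of 1 means it fully recovers the performance of a strong model trained on ground-truth labels.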

This paradigm appears across:

  • Large Language Models (LLMs)
  • Multimodal Models (VLMs, Video Models)
  • Alignment & Preference Learning
  • Agent-based Systems

🚀 Why It Matters

  • ❌ Real-world supervision is often noisy
  • ❌ High-quality labels are expensive
  • ❌ Weak signals are cheap and scalable

👉 W2SG provides a new scaling axis:
improving performance without improving supervision quality


πŸ—‚οΈ Paper Taxonomy

πŸ“Š Representative Papers

| Paper | Venue | Setting | Weak Source | Strong Gain |
| --- | --- | --- | --- | --- |
| Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision | ICML 2024 | Alignment | Weak reward model | ✔ |
| Self-Improving Language Models | ICML 2023 | Self-training | Model itself | ✔ |
| RLAIF: Learning from AI Feedback | NeurIPS 2023 | Alignment | AI feedback | ✔ |
| Direct Preference Optimization (DPO) | NeurIPS 2023 | Alignment | Weak preferences | ✔ |
| Constitutional AI | arXiv 2022 | Alignment | Rule-based feedback | ✔ |
| Distilling Step-by-Step Reasoning | NeurIPS 2022 | Reasoning | Weak CoT | ✔ |
| Teaching Small Models to Reason | arXiv | Reasoning | Large model CoT | ✔ |
| STaR: Bootstrapping Reasoning | NeurIPS 2022 | Self-training | Self-generated CoT | ✔ |
| Self-Consistency Improves CoT | ICLR 2023 | Reasoning | Multiple weak paths | ✔ |
| Noisy Student Training | CVPR 2020 | Vision | Noisy pseudo-labels | ✔ |
| Pseudo-Labeling for Semi-Supervised Learning | ICML | SSL | Weak labels | ✔ |
| FixMatch | NeurIPS 2020 | SSL | Augmented weak labels | ✔ |
| Knowledge Distillation | NeurIPS 2015 | Distillation | Teacher logits | ✖ |
| Born-Again Neural Networks | ICML 2018 | Distillation | Same architecture | ✔ |
| When Does Student Surpass Teacher? | ICML | Distillation | Weak teacher | ✔ |
| VideoCoCa / Flamingo-style works | NeurIPS | Multimodal | Weak alignment | ✔ |
| BLIP / BLIP-2 | ICML 2023 | Multimodal | Noisy captions | ✔ |
| LLaVA | NeurIPS 2023 | Multimodal | GPT-generated data | ✔ |
| Voyager (Minecraft Agent) | NeurIPS 2023 | Agent | Weak exploration | ✔ |
| ReAct | ICLR 2023 | Agent | Prompted reasoning | ✔ |
| Reflexion | NeurIPS 2023 | Agent | Self-feedback | ✔ |

We organize the literature into the following categories:


1. 🔹 Weak-to-Strong Alignment

  • Learning from weak preference signals
  • Reward modeling with imperfect annotators
  • DPO / RLHF under weak supervision

2. 🔹 Self-Training & Bootstrapping

  • Pseudo-labeling
  • Iterative self-improvement
  • Teacher-student refinement loops
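The pseudo-labeling step in this category can be sketched as a confidence filter over a weak model's predictions. This is a minimal illustration, assuming class probabilities are already available as plain lists; `select_pseudo_labels` and the `0.9` threshold are hypothetical choices, not taken from any paper listed here.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (example_index, predicted_class) pairs whose top probability meets the threshold."""
    selected = []
    for i, p in enumerate(probs):
        # Index of the most probable class for this example.
        label = max(range(len(p)), key=p.__getitem__)
        if p[label] >= threshold:
            selected.append((i, label))
    return selected


# Hypothetical weak-model outputs over three unlabeled examples and two classes.
weak_probs = [[0.95, 0.05], [0.55, 0.45], [0.08, 0.92]]
print(select_pseudo_labels(weak_probs))  # [(0, 0), (2, 1)]
```

In an iterative self-improvement loop, the selected pairs would be added to the training set, the model retrained, and the filter applied again to the remaining unlabeled examples.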

3. 🔹 Distillation Beyond Teacher

  • When student > teacher
  • Capacity vs supervision mismatch
  • Knowledge reconstruction

4. 🔹 Weak CoT → Strong Reasoning

  • Learning reasoning from weak CoT
  • Implicit reasoning recovery
  • Latent structure induction

5. 🔹 Multimodal Weak Supervision

  • Weak labels for video understanding
  • Noisy grounding signals
  • Cross-modal alignment

6. 🔹 Agent & Decision-Making

  • Weak planners → strong policies
  • Learning from suboptimal trajectories
  • Preference learning in interactive systems

📄 Paper List

⭐ Key Papers

  • Weak-to-Strong Generalization: Eliciting Strong Capabilities with Weak Supervision
  • Self-Improving Language Models
  • Distilling Step-by-Step Reasoning
  • RLAIF / DPO related works

📚 Full Collection

(Continuously updated; PRs welcome)

2025

  • Paper A
  • Paper B

2024

  • Paper C
  • Paper D

🧩 Key Insights

  • Strong models rely on pretrained priors
  • Weak supervision contains partial structure
  • Iterative refinement is critical
  • Overfitting to weak signals is a major failure mode

📊 Open Problems

  • ❗ When does weak-to-strong fail?
  • ❗ How to measure beyond-teacher generalization?
  • ❗ Robustness under biased weak signals
  • ❗ Scaling laws for weak supervision

πŸ› οΈ Related Resources

  • Benchmarks
  • Codebases
  • Datasets

🤝 Contributing

We welcome:

  • 📄 New papers
  • 🧩 Taxonomy improvements
  • 📊 Benchmarks / repos

Please submit a PR!


⭐ Star History

If you find this repo useful, consider giving it a star ⭐


📜 License

MIT


🔥 Maintainer

  • Haonan Zhang

💡 Philosophy

Weak supervision is not just noisy; it is compressed knowledge.
