mmaazkhanhere/debate-agents

AI Debate Arena: A Multi-Agent, Web-Grounded Debate Simulation System

AI Debate Arena is a full-stack, multi-agent system that simulates structured debates between AI agents representing real-world personas. The system orchestrates autonomous agents, grounds their reasoning in live web search results, and evaluates outcomes with a structured three-judge panel.

The project is designed as a production-grade demonstration of:

  • Multi-agent orchestration with CrewAI
  • Web-grounded reasoning via DuckDuckGo
  • Real-time event streaming with Server-Sent Events (SSE)
  • Structured debate evaluation
  • A card-based strategic UI abstraction
  • Full Dockerized reproducibility

What This Project Demonstrates

This repository showcases a complete production-style AI system, including:

  • Autonomous AI agents debating sequentially over a fixed number of rounds
  • Mandatory web grounding to incorporate current information
  • Redis-backed streaming and concurrency control
  • Celery-powered asynchronous execution
  • SQLite persistence for debate history and analytics
  • A modern Next.js frontend visualizing debates as tactical argument cards
  • A three-agent judge panel scoring debates using structured rubrics
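As an illustration of how three judges' rubric scores could be combined into one verdict, here is a minimal sketch. It is not the repository's scoring code: the function name `panel_verdict`, the scorecard shape, and the majority-vote rule are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean


def panel_verdict(scorecards):
    """Combine per-judge rubric scores into a verdict.

    scorecards: one dict per judge, mapping debater name to
    {criterion: score}. This shape is hypothetical.
    """
    totals = {}        # debater -> list of per-judge totals
    votes = Counter()  # debater -> number of judges who ranked them first
    for card in scorecards:
        judge_totals = {name: sum(rubric.values()) for name, rubric in card.items()}
        for name, total in judge_totals.items():
            totals.setdefault(name, []).append(total)
        best = max(judge_totals.values())
        leaders = [name for name, total in judge_totals.items() if total == best]
        if len(leaders) == 1:  # a judge who scores a tie casts no vote
            votes[leaders[0]] += 1
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        winner = None  # no votes cast, or the panel is split evenly
    else:
        winner = ranked[0][0]
    return {
        "winner": winner,
        "votes": dict(votes),
        "average_scores": {name: mean(values) for name, values in totals.items()},
    }
```

With three judges and two debaters, a majority always exists unless a judge scores the debaters equally, which is why the sketch lets such a judge abstain.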

The entire stack runs locally using Docker Compose with minimal setup.


Card-Based Debate Visualization

Instead of rendering debates as simple chat messages, the frontend abstracts each agent response into a structured argument card.

Each move includes:

  • A Move Type (Attack, Defense, Refute, etc.)
  • A Power Level representing argumentative strength
  • The Generated Argument
  • Web-grounded supporting context
  • Structured judge evaluation

This approach turns a debate into a turn-based strategic exchange rather than a basic chatbot conversation. It makes agent reasoning more interpretable, comparable, and engaging.
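A card like this could be modeled as a small data structure. The sketch below is only an illustration: the class and field names, the 0 to 100 power scale, and the omission of the judge evaluation field are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class MoveType(Enum):
    # Move names taken from the list above; the real set may be larger.
    ATTACK = "attack"
    DEFENSE = "defense"
    REFUTE = "refute"


@dataclass
class ArgumentCard:
    """One debate turn rendered as a card (field names are hypothetical)."""

    debater: str
    move_type: MoveType
    power_level: int  # assumed 0-100 scale
    argument: str
    sources: list[str] = field(default_factory=list)  # web-grounded context

    def __post_init__(self) -> None:
        if not 0 <= self.power_level <= 100:
            raise ValueError("power_level must be between 0 and 100")

    def to_dict(self) -> dict:
        """Return a JSON-serializable payload for the frontend."""
        return {
            "debater": self.debater,
            "move_type": self.move_type.value,
            "power_level": self.power_level,
            "argument": self.argument,
            "sources": list(self.sources),
        }
```

Validating the power level at construction time keeps malformed model output from reaching the UI as a card.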


How the System Works

Debates follow a structured lifecycle:

  1. A user selects two debaters and a topic.
  2. The backend creates a debate session.
  3. Agents take sequential turns for a fixed number of rounds.
  4. Each turn is grounded using live DuckDuckGo search.
  5. Debate events are streamed in real time to the frontend.
  6. After final arguments, three independent judge agents evaluate performance.
  7. Results are stored and made available for analytics.

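All agents use the same temperature configuration. Debates are time-limited and deterministic in structure: the turn order and the number of rounds are fixed, even though the generated arguments themselves may vary between runs.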
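The fixed turn structure described above can be sketched as a generator that alternates speakers and yields one event per turn. This is a simplified stand-in, not the CrewAI flow used in the backend: `run_debate`, the event fields, and the `generate_turn` callback are hypothetical, and web search and judging are left out.

```python
from typing import Callable, Iterator


def run_debate(
    debaters: tuple[str, str],
    rounds: int,
    generate_turn: Callable[[str, int, list[dict]], str],
) -> Iterator[dict]:
    """Yield one event per turn, alternating debaters for a fixed number of rounds.

    generate_turn stands in for the LLM call; it receives the speaker,
    the round number, and the history of earlier turn events.
    """
    history: list[dict] = []
    for round_no in range(1, rounds + 1):
        for speaker in debaters:
            text = generate_turn(speaker, round_no, history)
            event = {"type": "turn", "round": round_no, "speaker": speaker, "text": text}
            history.append(event)
            yield event
    # Final marker so a consumer knows the judge phase can begin.
    yield {"type": "debate_finished", "turns": len(history)}
```

Because the function is a generator, each event can be forwarded to a stream as soon as it is produced instead of waiting for the whole debate to finish.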


Architecture Overview

The system consists of the following components:

Frontend

  • Next.js App Router
  • Real-time SSE event consumption
  • XState orchestration
  • Card-based debate UI

Backend

  • FastAPI API server
  • CrewAI debate flow orchestration
  • Celery worker for asynchronous execution
  • Redis for broker, cache, locks, and streaming
  • SQLite for persistence
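For readers unfamiliar with SSE, the wire format is plain text: optional `id:` and `event:` fields, a `data:` field, and a blank line that ends the message. The helper below is a minimal sketch of that framing; the name `format_sse` and the event names are assumptions, not taken from the backend.

```python
import json
from typing import Optional


def format_sse(event: str, data: dict, event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload fits on a single data line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

In a FastAPI app, strings like these would typically be yielded from a generator wrapped in a `StreamingResponse` with the `text/event-stream` media type; how this project wires that up is not shown here.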

Infrastructure

  • Docker + Docker Compose
  • Separate dev and production configurations

Quick Start

The entire stack can be started locally with Docker.

git clone <repository-url>
cd <repository-root>
cp .env.example .env

Add required API keys to .env (for example, GROQ_API_KEY).
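If you want to confirm the file is filled in before building, a small check like the following can help. It is an optional sketch, not part of the repository: `parse_env` is a deliberately simple parser, and `REQUIRED_KEYS` lists only the key named above, so the real `.env.example` may require more.

```python
from typing import Iterable, Mapping

# Only GROQ_API_KEY is named in this README; extend as needed.
REQUIRED_KEYS = ("GROQ_API_KEY",)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key, "").strip()]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when every required key has a value.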

Then run:

docker compose up --build

Access the application at:

This will start:

  • FastAPI backend
  • Celery worker
  • Celery beat scheduler
  • Redis
  • Next.js frontend

No additional setup is required.

Example Debate

You can try:

Watch Elon Musk debate Donald Trump about the performance of DOGE (Department of Government Efficiency) and see a neutral three-judge AI panel score them.

This example demonstrates:

  • Multi-agent coordination
  • Real-time streaming
  • Web-grounded reasoning
  • Structured judge scoring

Repository Structure

backend/        # FastAPI + CrewAI orchestration
frontend/       # Next.js UI
docker-compose.yml
docker-compose.prod.yml

Each directory contains its own detailed README explaining service-specific configuration and development workflows.

Analytics & Persistence

The system tracks:

  • Debate duration
  • Token usage
  • Cost estimation
  • Judge scoring
  • Historical sessions

SQLite is used for local persistence. For production environments, a full relational database is recommended.
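Cost estimation from token usage is usually simple arithmetic over per-token prices. The sketch below shows the idea only: the `Pricing` type, the function name, and any prices you pass in are assumptions, since this README does not state which rates the project uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical price table, in USD per one million tokens."""

    input_per_million: float
    output_per_million: float


def estimate_cost(prompt_tokens: int, completion_tokens: int, pricing: Pricing) -> float:
    """Estimate the USD cost of one debate from its token counts."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        prompt_tokens * pricing.input_per_million
        + completion_tokens * pricing.output_per_million
    )
    return total / 1_000_000
```

Keeping prices in a separate object makes it easy to store one table per model and recompute historical costs when rates change.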

Security and Responsible Use

This system generates AI-simulated debates involving real public figures.

Important considerations:

  • Outputs are generated by large language models.
  • Content may contain inaccuracies or bias.
  • Statements do not represent real individuals.
  • The project is not affiliated with or endorsed by any real person referenced.
  • The system is intended for educational and demonstration purposes only.

API keys should never be committed to the repository. Production deployments should use secure secret management systems.

Production Considerations

For production-grade deployment, consider:

  • Replacing SQLite with PostgreSQL
  • Using managed Redis with persistence policies
  • Adding TLS termination and reverse proxy
  • Implementing proper authentication and authorization
  • Adding structured logging and monitoring
  • Implementing CI/CD pipelines

Contributing

Contributions are welcome.

If you would like to improve the system:

  • Create a feature branch.
  • Make focused commits.
  • Ensure Docker builds successfully.
  • Open a pull request with a clear description of changes.

Please maintain architectural consistency and avoid breaking changes to core orchestration flows without discussion.

About

Multi-agent AI debate simulation system built with CrewAI, FastAPI, and Next.js. Agents debate real-world topics using live web grounding and are evaluated by a structured AI judge panel, visualized through a strategic card-based UI.
