Treg Research Assistant: A Multi-Agent Concierge for Immunotherapy

Problem Statement

Immunology research, particularly in the field of Regulatory T cell (Treg) therapies, moves at a breakneck pace. Bioinformaticians and wet-lab scientists face a "data silo" problem:

  1. Information Overload: Hundreds of new papers are published weekly on PubMed, making it impossible to stay current manually.
  2. Disconnected Data: Experimental designs found in literature often need to be cross-referenced with active clinical trials to ensure translational relevance, but these exist in completely different databases.
  3. Context Switching: Scientists constantly switch between reading papers, searching trial registries, and writing Python code to analyze their own data.

I wanted to reduce the friction of "scientific synthesis": the time-consuming process of gathering disparate facts and turning them into an actionable experimental plan.

Why agents?

Standard LLM chatbots are insufficient for rigorous scientific work because they hallucinate, lack agency, and handle mixed tasks poorly. Agents address these gaps in three ways:

  1. Grounding: A standard LLM might invent a plausible-sounding cytokine cocktail. An agent with retrieval tools can search PubMed to find actual protocols used in recent papers.
  2. Multi-step Reasoning: Scientific inquiry is rarely a single question. It is a workflow: "Find the leading protocol (Search), check if it's in trials (Search), and calculate the reagent costs for my sample size (Code Execution)."
  3. Specialization: A single prompt often fails to handle both creative reasoning and strict data retrieval. By using a multi-agent system, I can assign a "Researcher" to be strict and factual, while the "Orchestrator" handles the high-level reasoning and user communication.
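As an illustration of the Code Execution step in the workflow above, here is a minimal sketch of the kind of arithmetic an agent might run. The function name, the culture volume, and the price per milligram are hypothetical; the only fixed input is rapamycin's molar mass (about 914.17 g/mol).

```python
RAPAMYCIN_MOLAR_MASS_G_PER_MOL = 914.17  # sirolimus, C51H79NO13


def rapamycin_cost(conc_nm: float, volume_ml: float, price_per_mg: float) -> tuple[float, float]:
    """Return (mass in micrograms, cost) for a target concentration and total volume.

    price_per_mg is a hypothetical input; real prices depend on the supplier.
    """
    moles = (conc_nm * 1e-9) * (volume_ml * 1e-3)            # mol/L * L
    mass_ug = moles * RAPAMYCIN_MOLAR_MASS_G_PER_MOL * 1e6   # grams -> micrograms
    cost = (mass_ug / 1000.0) * price_per_mg                 # micrograms -> mg, then price
    return mass_ug, cost


if __name__ == "__main__":
    # Hypothetical run: 100 nM across 12 flasks of 50 mL, at 80 currency units per mg.
    mass_ug, cost = rapamycin_cost(conc_nm=100, volume_ml=12 * 50, price_per_mg=80.0)
    print(f"{mass_ug:.2f} ug rapamycin, estimated cost {cost:.2f}")
```

The calculation is trivial on its own. The point is that an agent with code execution computes it instead of estimating it in free text.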

What I created

I built the Treg Research Assistant, a Multi-Agent System (MAS) orchestrated by Google Gemini 1.5 Pro.

Architecture:

  • Orchestrator Agent (Gemini 1.5 Pro): The central brain. It receives the user's complex scientific query, breaks it down into a plan, and delegates tasks. It maintains the session history and synthesizes the final report.
  • Researcher Agent (Gemini 1.5 Flash): The specialist for information retrieval. It is equipped with custom tools to query the PubMed E-utilities API and ClinicalTrials.gov API. It returns raw data and summaries to the Orchestrator.
  • Analyst Agent (Gemini 1.5 Flash): The specialist for computation. It runs in a sandboxed environment with a Python Code Execution tool, allowing it to perform statistical analysis or generate data visualizations on the fly.
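The division of labor above can be sketched as plain Python classes. This is a minimal illustration, not the project's code: the Task dataclass, the plan and handle methods, and the fixed two-step plan are assumptions, and the stub agents return canned strings where the real ones would call Gemini and their tools.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    agent: str        # "researcher" or "analyst"
    instruction: str


class ResearcherAgent:
    """Stub: the real agent would call a model plus retrieval tools."""

    def run(self, instruction: str) -> str:
        return f"[researcher] results for: {instruction}"


class AnalystAgent:
    """Stub: the real agent would generate and execute Python code."""

    def run(self, instruction: str) -> str:
        return f"[analyst] computed: {instruction}"


@dataclass
class OrchestratorAgent:
    researcher: ResearcherAgent
    analyst: AnalystAgent
    history: list = field(default_factory=list)  # session history

    def plan(self, query: str) -> list[Task]:
        # Hypothetical fixed plan; a real orchestrator would ask the model for it.
        return [
            Task("researcher", f"literature search: {query}"),
            Task("researcher", f"trial registry search: {query}"),
        ]

    def handle(self, query: str) -> str:
        self.history.append(("user", query))
        agents = {"researcher": self.researcher, "analyst": self.analyst}
        findings = [agents[t.agent].run(t.instruction) for t in self.plan(query)]
        report = "\n".join(findings)  # stand-in for model-written synthesis
        self.history.append(("assistant", report))
        return report


if __name__ == "__main__":
    orchestrator = OrchestratorAgent(ResearcherAgent(), AnalystAgent())
    print(orchestrator.handle("rapamycin Treg expansion"))
```

Keeping the plan as data (a list of Task objects) makes each delegation step easy to log and inspect before the final synthesis.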

Demo

[Insert Link to Video Demo or Screenshots Here]

Walkthrough Scenario:

  1. User asks: "What are the current best practices for ex vivo Treg expansion using Rapamycin, and are there active Phase 2 trials?"
  2. Orchestrator: Parses the request into two distinct needs: literature search for "Rapamycin Treg expansion protocols" and a registry search for "Phase 2 Treg Rapamycin trials."
  3. Step 1 (Researcher): Calls the search_pubmed tool. Retrieves 5 recent abstracts and identifies a consensus on Rapamycin concentration (e.g., 100 nM).
  4. Step 2 (Researcher): Calls the search_clinical_trials tool. Finds 3 active Phase 2 trials matching the criteria.
  5. Synthesis: The Orchestrator combines these findings into a response: "Recent literature suggests 100 nM Rapamycin is standard (Ref: Smith et al., 2024). This aligns with active Phase 2 trials NCT12345 and NCT67890, which are currently recruiting."
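The two searches in this walkthrough might translate into HTTP requests roughly as follows. This is a sketch under assumptions, not the project's search_pubmed or search_clinical_trials code: the helper names are hypothetical, and the Essie-style AREA[Phase] expression reflects my understanding of the ClinicalTrials.gov v2 query syntax and should be checked against the API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
CTGOV_URL = "https://clinicaltrials.gov/api/v2/studies"


def pubmed_search_url(term: str, max_results: int = 5, last_n_days: int = 365) -> str:
    """Build an ESearch URL for recent PubMed records matching `term`."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": max_results,
        "retmode": "json",
        "datetype": "pdat",      # filter on publication date
        "reldate": last_n_days,  # only records from the last n days
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def trials_search_url(intervention: str, terms: str, phase: str = "PHASE2",
                      page_size: int = 10) -> str:
    """Build a ClinicalTrials.gov v2 URL for recruiting studies."""
    params = {
        "query.intr": intervention,
        # Assumption: phase filtering expressed as an Essie AREA[...] expression.
        "query.term": f"{terms} AND AREA[Phase]{phase}",
        "filter.overallStatus": "RECRUITING",
        "pageSize": page_size,
    }
    return f"{CTGOV_URL}?{urlencode(params)}"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Perform the GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(pubmed_search_url("rapamycin Treg expansion"))
    print(trials_search_url("rapamycin", "regulatory T cell"))
```

With network access, fetch_json on the first URL should return a payload whose esearchresult.idlist field holds the matching PubMed IDs. The second should return a studies array in which each entry carries its NCT identifier under protocolSection.identificationModule.nctId.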

The Build

I built this application following Google Agent Development Kit (ADK) patterns to structure the agent interactions.

  • AI Engine:
    • Gemini 1.5 Pro: Used for the Orchestrator for its superior reasoning and instruction-following capabilities.
    • Gemini 1.5 Flash: Used for sub-agents to ensure low latency and cost-efficiency during iterative tool calling.
  • Backend: Python with FastAPI. I implemented the agents as Python classes (OrchestratorAgent, ResearcherAgent, AnalystAgent) that wrap the Google Gen AI SDK.
  • Tools:
    • backend/tools/retrieval.py: Custom Python functions wrapping external APIs (PubMed, ClinicalTrials.gov).
    • backend/tools/analysis.py: A safe execution environment for Python code generated by the Analyst agent.
  • Frontend: A modern React application (Vite + TailwindCSS) that provides a chat interface and renders markdown responses.
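One common way to approximate a code-execution tool like the one described for backend/tools/analysis.py is to run generated code in a separate interpreter process with a time limit. The sketch below is an assumption about the general technique, not the project's implementation, and run_python is a hypothetical name. It contains crashes and runaway loops but is not a security sandbox; real isolation would need containers or similar.

```python
import subprocess
import sys


def run_python(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a child interpreter and capture its output.

    Isolates crashes and enforces a time limit; does NOT restrict file or
    network access, so it is not a substitute for a real sandbox.
    """
    try:
        proc = subprocess.run(
            # -I: isolated mode (ignores PYTHON* env vars and the user site dir)
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout_s} s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    print(run_python("import statistics; print(statistics.mean([1, 2, 3, 4]))"))
    print(run_python("while True: pass", timeout_s=0.5))
```

Returning a structured result instead of raising lets the calling agent read stderr and retry with corrected code.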

If I had more time, this is what I'd do

  1. Vertex AI Reasoning Engine: I would migrate the local Python agent runtime to the managed Vertex AI Reasoning Engine for better scalability and deployment management.
  2. Private Data Integration: I would add a RAG pipeline connected to a Vector Store (like Vertex AI Vector Search) containing private lab notebooks and proprietary experimental data, allowing the agent to synthesize public knowledge with internal findings.
  3. Multimodal Analysis: I would empower the Analyst agent to accept image uploads (e.g., Flow Cytometry plots or histology slides) and use Gemini 1.5 Pro's vision capabilities to interpret them alongside the text data.