Production-grade multi-agent AI system for luxury chauffeur booking, powered by Semantic Kernel, Azure AI Search (RAG), and real-time SQL-based availability.
Bravado Solutions: an enterprise software development company building scalable AI systems, SaaS platforms, and cloud-native applications.
Most AI systems fail because they stop at conversation. This project demonstrates a real-world agentic AI system that goes beyond chat to:
- Understand user intent
- Retrieve enterprise knowledge (RAG)
- Check real-time availability
- Execute bookings with transactional safety
- Maintain conversational memory
- Operate as a scalable API service
An AI Concierge capable of:
- Answering fleet, pricing, and service queries (RAG).
- Checking real-time vehicle availability.
- Booking chauffeur rides with atomic transactions.
- Maintaining session-based conversations.
- Persisting memory for context-aware responses.
This system follows an enterprise agentic architecture:
graph TD
%% User Layer
User((User)) -->|Booking / Info Request| API[FastAPI Orchestrator]
subgraph "The Brain: Agentic Core"
API --> SK[Semantic Kernel]
SK <-->|Reasoning Loop| GPT[Azure OpenAI GPT-4o]
SK <-->|Context Retrieval| Mem[Persistent Memory Store]
end
subgraph "The Hands: Plugin Layer"
SK --> Plugins{Function Dispatcher}
Plugins --> KP[Knowledge Plugin]
Plugins --> BP[Booking Plugin]
Plugins --> AP[Availability Plugin]
end
subgraph "Data & Knowledge"
KP -->|Semantic Search| AIS[Azure AI Search]
BP -->|SQL Transactions| DB[(SQLite Fleet DB)]
AIS ---|1M+ Docs| Docs[Fleet, Pricing, Events]
end
%% Response Flow
Plugins -->|Executed Action| SK
SK -->|Final Answer| API
API -->|Confirmation| User
%% Styling
style SK fill:#0078d4,stroke:#005a9e,color:#fff
style GPT fill:#107c10,stroke:#094a09,color:#fff
style Mem fill:#5c2d91,stroke:#3a1c5c,color:#fff
style DB fill:#f29111,stroke:#b36b08,color:#fff
- User Request → Initiated via FastAPI or the local CLI.
- Orchestrator (Semantic Kernel) → Analyzes intent, plans the response, and invokes tools.
- Plugins (Tool Layer) → Knowledge (Azure AI Search), Availability (Fleet DB), and Booking logic.
- Memory Layer → Stores and retrieves past interactions for long-term context.
- Execution → The action is performed and the final response is generated.
- Central Reasoning Engine: Powered by Semantic Kernel to coordinate the model and tools.
- Task Planning: Handles intent recognition and dynamic tool selection.
- Execution Loop: Manages the flow between the LLM and plugin results.
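
The execution loop described in the three bullets above can be illustrated without any framework. The following is a minimal sketch only: it does not use Semantic Kernel, the `decide` callable stands in for the GPT-4o call, and the names `run_agent_loop`, `scripted_decide`, `TOOLS`, `check_availability`, and `book_ride` are hypothetical rather than taken from this repository.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool type: a callable that takes keyword arguments and returns text.
Tool = Callable[..., str]
Action = Optional[Tuple[str, dict]]


def run_agent_loop(
    user_message: str,
    decide: Callable[[List[str]], Action],
    tools: Dict[str, Tool],
    max_steps: int = 5,
) -> List[str]:
    """Alternate between a decision step and tool execution until done."""
    transcript = [f"user: {user_message}"]
    for _ in range(max_steps):
        action = decide(transcript)  # stand-in for the LLM call
        if action is None:           # nothing left to call, so stop
            break
        name, kwargs = action
        if name not in tools:
            transcript.append(f"error: unknown tool {name}")
            continue
        result = tools[name](**kwargs)
        transcript.append(f"tool:{name}: {result}")
    return transcript


def scripted_decide(transcript: List[str]) -> Action:
    """Toy policy: check availability once, then book, then stop."""
    called = [line.split(":")[1] for line in transcript if line.startswith("tool:")]
    if "check_availability" not in called:
        return ("check_availability", {"vehicle_class": "sedan"})
    if "book_ride" not in called:
        return ("book_ride", {"vehicle_class": "sedan"})
    return None


# Hypothetical tools standing in for the Availability and Booking plugins.
TOOLS: Dict[str, Tool] = {
    "check_availability": lambda vehicle_class: f"{vehicle_class} available",
    "book_ride": lambda vehicle_class: f"{vehicle_class} booked",
}

transcript = run_agent_loop("Book a sedan", scripted_decide, TOOLS)
```

The `max_steps` cap is a design choice in this sketch: it bounds the loop if the decision step keeps requesting tools.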
- Knowledge Plugin: High-speed RAG via Azure AI Search for fleet and policy queries.
- Availability Plugin: Real-time queries to the Fleet Database for vehicle stock.
- Booking Plugin: Transactional logic to secure rides and update SQL state.
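
One way the availability check and the transactional booking described above could fit together is sketched below with Python's built-in `sqlite3` module. The schema (`vehicles`, `bookings`), the seed rows, and the function names are assumptions made for illustration; the actual schema lives in `scripts/init_fleet_db.py` and may differ.

```python
import sqlite3
from typing import List


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical fleet schema and seed two vehicles."""
    conn.executescript(
        """
        CREATE TABLE vehicles (
            id INTEGER PRIMARY KEY,
            model TEXT NOT NULL,
            available INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE bookings (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            vehicle_id INTEGER NOT NULL REFERENCES vehicles(id),
            customer TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO vehicles (id, model) VALUES (?, ?)",
        [(1, "S-Class"), (2, "Escalade")],
    )
    conn.commit()


def available_vehicles(conn: sqlite3.Connection) -> List[str]:
    """Return the models that are currently free."""
    rows = conn.execute(
        "SELECT model FROM vehicles WHERE available = 1 ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


def book_vehicle(conn: sqlite3.Connection, vehicle_id: int, customer: str) -> bool:
    """Reserve a vehicle and record the booking in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE vehicles SET available = 0 WHERE id = ? AND available = 1",
            (vehicle_id,),
        )
        if cur.rowcount != 1:  # vehicle missing or already taken
            return False
        conn.execute(
            "INSERT INTO bookings (vehicle_id, customer) VALUES (?, ?)",
            (vehicle_id, customer),
        )
    return True
```

The conditional `UPDATE ... AND available = 1` is what prevents a double booking in this sketch: a second request for the same vehicle updates zero rows and returns `False` without writing a booking row.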
- Persistent Storage: Interaction history stored via Azure AI Search Vector Store.
- Context Retention: Maintains user preferences across multiple sessions.
- Extensible: Supports Redis, Pinecone, or other vector providers.
- Framework: Built with FastAPI for high-concurrency async performance.
- Infrastructure: Includes session management, Redis rate limiting, and structured logging.
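
The README does not say which algorithm the Redis rate limiter uses. As one possibility, the sketch below shows a fixed-window counter, which is the pattern usually built in Redis from `INCR` plus `EXPIRE`. Here a plain dictionary stands in for Redis so the example runs on its own; `FixedWindowLimiter` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window_seconds` window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        # key -> (window start time, requests counted in that window)
        self._counters: Dict[str, Tuple[float, int]] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        start, count = self._counters.get(key, (now, 0))
        if now - start >= self.window:  # window expired, start a new one
            start, count = now, 0
        if count >= self.limit:
            self._counters[key] = (start, count)
            return False
        self._counters[key] = (start, count + 1)
        return True
```

In an API handler, the key would typically be a session ID or client IP, and a `False` result would map to an HTTP 429 response. The injectable `clock` exists only to make the limiter easy to test.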
chauffeur-agentic-rag/
│
├── main.py                     # CLI entry point for local testing
├── app.py                      # FastAPI application entry point
├── .env.example                # Template for environment variables
├── requirements.txt            # Python dependencies
├── Dockerfile                  # API container configuration
├── docker-compose.yml          # Multi-container orchestration (API + Redis)
│
├── scripts/
│   └── init_fleet_db.py        # Database schema and seed data setup
│
├── kernel/
│   └── builder.py              # Semantic Kernel initialization & configuration
│
├── plugins/
│   ├── knowledge_plugin.py     # RAG & Azure AI Search logic
│   ├── booking_plugin.py       # Transactional ride booking operations
│   └── availability_plugin.py  # Real-time fleet SQL queries
│
├── services/
│   └── orchestrator.py         # Core agentic reasoning & planning logic
│
├── memory/
│   └── vector_store.py         # Persistent context & vector search implementation
│
└── utils/
    └── pii_utils.py            # Data privacy and PII masking utilities
---
- Production-Ready: Focused on real-world workflows, not just chat-based demos.
- Modular Architecture: Pluggable tools and agents using Semantic Kernel.
- Scalable: API-first design containerized with Docker and Redis.
- Secure by Design: Strict environment isolation and PII masking for memory.
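
To make the PII masking point concrete, here is a minimal sketch of masking applied before text is written to memory. It is not the contents of `utils/pii_utils.py`: the function name `mask_pii`, the two regular expressions, and the placeholder tokens are assumptions, and these simple patterns will miss many real-world formats.

```python
import re

# Illustrative patterns only; production masking usually needs a broader rule set.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


masked = mask_pii("Reach me at jane.doe@example.com or +1 (555) 010-2030.")
```

Emails are masked first so that digits inside an address are never picked up by the phone pattern.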
- LLM: Azure OpenAI (GPT-4o)
- RAG: Azure AI Search
- Orchestration: Semantic Kernel v1.x
- API: FastAPI / Uvicorn
- Throttling: Redis
- Database: SQLite (Persistent via Docker Volumes)
We help enterprises move from AI experimentation to production-grade intelligent systems. Our team specializes in Agentic RAG, Cloud-Native SaaS, and Enterprise AI Orchestration.
- Portfolio Case Study: Detailed Agentic AI System - Empire Limousine
🌐 bravadosolutions.com
📧 contact@bravadosolutions.com
Built with ❤️ by Bravado Solutions.