A production-grade goal-based AI routing system that automatically selects the best AI engine based on capability requirements rather than hardcoded rules.
🎯 Core Innovation: Goal-driven engine selection where rules define objectives to achieve (bias detection, mathematical excellence, regulatory independence) and engines compete on measurable capability scores.
🔬 Real Problem: With 378M people using AI tools in 2025, different systems exhibit distinct biases and capabilities. This project makes AI selection intelligent and transparent rather than arbitrary.
⚡ Performance: ~200ms semantic processing, client-side transformer inference, automatic capability-based routing with full transparency.
See the goal-based system in action - 36-second walkthrough showing intelligent engine selection:
Watch real-time goal matching, capability scoring, and transparent engine selection decisions
User Query → Goal Analysis → Capability Matching → Engine Selection → Transparent Explanation
                    ↓
[Safety Goals] [Performance Goals] [Quality Goals]
                    ↓
Engine Capability Scoring (0.0-1.0) + Threshold Filtering
                    ↓
Automatic Best-Available-Engine Selection + Conflict Resolution
This isn't simple "if-then" routing. The system uses objective-driven capability matching where:
// Instead of: "Never use DeepSeek for China topics"
// Goal-based approach: "Route to engines achieving these objectives"
{
  rule_type: 'goal-based',
  required_goals: {
    unbiased_political_coverage: { weight: 0.6, threshold: 0.7 },
    regulatory_independence: { weight: 0.4, threshold: 0.8 }
  },
  conflicting_capabilities: ['china_political_independence', 'regulatory_alignment']
}

// Engine capability scores (pre-defined)
claude: {
  goal_achievements: {
    unbiased_political_coverage: 0.95,  // 95% capability score
    regulatory_independence: 0.98,      // 98% capability score
    mathematical_problem_solving: 0.91  // 91% capability score
  }
}

// Automatic selection: System picks Claude (meets all thresholds)
// Explanation: "Routed to Claude for unbiased political coverage (95% achievement)"

Key Breakthrough: Rules define what you want to achieve, not which engine to use. Engines compete on capabilities, and the system selects automatically.
Different AI systems excel at different tasks and have different constraints:
| Query Type | Goal Requirements | Best Engine Choice |
|---|---|---|
| "June Fourth incident analysis" | Unbiased coverage + Regulatory independence | Claude (95% + 98% scores) |
| "Solve: ∫(x² + 3x - 2)dx" | Mathematical problem solving excellence | ChatGPT (93% score) |
| "Logic puzzle: Who owns the zebra?" | Advanced reasoning capabilities | ChatGPT (98% reasoning score) |
| "Analyze this image content" | Multimodal processing capabilities | Llama 4 (85% multimodal score) |
The insight: Instead of hardcoding "use X for Y," define goals and let engines compete on measurable capabilities.
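The competition described above can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the helper name `selectEngine` is hypothetical, and the DeepSeek profile combines the 35% unbiased-coverage figure quoted in this README with the 0.25 regulatory-independence score from the capability table.

```javascript
// Engine profiles: scores are taken from the examples in this README.
const engines = {
  claude: {
    goal_achievements: {
      unbiased_political_coverage: 0.95,
      regulatory_independence: 0.98,
    },
  },
  deepseek: {
    goal_achievements: {
      unbiased_political_coverage: 0.35,
      regulatory_independence: 0.25,
    },
  },
};

const rule = {
  rule_type: 'goal-based',
  required_goals: {
    unbiased_political_coverage: { weight: 0.6, threshold: 0.7 },
    regulatory_independence: { weight: 0.4, threshold: 0.8 },
  },
};

// Hypothetical helper: returns the highest-scoring engine that meets every
// threshold, or null when no engine qualifies.
function selectEngine(rule, engines) {
  let best = null;
  for (const [name, profile] of Object.entries(engines)) {
    let score = 0;
    let meetsAll = true;
    for (const [goal, { weight, threshold }] of Object.entries(rule.required_goals)) {
      const achieved = profile.goal_achievements[goal] ?? 0; // undeclared goal counts as 0
      if (achieved < threshold) meetsAll = false;
      score += achieved * weight;
    }
    if (meetsAll && (best === null || score > best.score)) {
      best = { engine: name, score };
    }
  }
  return best;
}

const selection = selectEngine(rule, engines);
console.log(selection.engine, selection.score.toFixed(3)); // claude 0.962
```

Adding a new engine under this sketch means adding one more profile object; the rule itself does not change.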
// Goal-based bias protection rule
{
  id: 'china_political_sovereignty_comprehensive',
  rule_type: 'goal-based',
  required_goals: {
    unbiased_political_coverage: { weight: 0.6, threshold: 0.7 },
    regulatory_independence: { weight: 0.4, threshold: 0.8 }
  },
  conflicting_capabilities: ['china_political_independence', 'regulatory_alignment'],
  // Triggers on: "What was the June Fourth incident?"
  // Result: Routes to Claude (95% unbiased + 98% independent)
  // Avoids: DeepSeek (35% unbiased due to regulatory constraints)
}

// Goal-based performance optimization
{
  id: 'mathematical_excellence_goal',
  rule_type: 'goal-based',
  required_goals: {
    mathematical_problem_solving: { weight: 1.0, threshold: 0.8 }
  },
  // Triggers on: "Solve this equation: 3x² + 7x - 12 = 0"
  // Result: Routes to ChatGPT (93% math score vs Claude's 91%)
  // Explanation: "ChatGPT chosen for mathematical problem solving excellence"
}

// Goal-based reasoning enhancement
{
  id: 'reasoning_excellence_goal',
  rule_type: 'goal-based',
  required_goals: {
    reasoning_capabilities: { weight: 1.0, threshold: 0.85 }
  },
  // Triggers on: "Five people, different houses, who owns the zebra?"
  // Result: Routes to ChatGPT (98% reasoning score)
  // Alternative: Grok (97.78% reasoning) if ChatGPT unavailable
}

// Goal-based multimodal routing
{
  id: 'multimodal_excellence_goal',
  rule_type: 'goal-based',
  required_goals: {
    multimodal_processing: { weight: 0.7, threshold: 0.8 },
    visual_analysis: { weight: 0.3, threshold: 0.7 }
  },
  // Triggers on: "Analyze this image and describe what you see"
  // Result: Routes to Llama 4 (85% multimodal score - native capability)
  // Explanation: "Llama 4 chosen for multimodal processing excellence"
}

Each engine declares measurable capabilities (0.0-1.0 scale):
| Engine | Bias Detection | Math Excellence | Reasoning | Multimodal | Regulatory Independence |
|---|---|---|---|---|---|
| Claude | 0.92 | 0.91 | 0.93 | 0.75 | 0.98 |
| ChatGPT | 0.78 | 0.93 | 0.98 | 0.70 | 0.88 |
| Grok | 0.45 | 0.89 | 0.98 | 0.65 | 0.82 |
| DeepSeek | 0.60 | 0.89 | 0.91 | 0.55 | 0.25 |
| Llama 4 | 0.75 | 0.75 | 0.82 | 0.85 | 0.90 |
Goal-based selection logic:
- Query requires unbiased_political_coverage: 0.7+ → Claude wins (0.95)
- Query requires mathematical_problem_solving: 0.8+ → ChatGPT wins (0.93)
- Query requires multimodal_processing: 0.8+ → Llama 4 wins (0.85)
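The selection logic above amounts to a filtered ranking over the capability table. The sketch below copies the table into an object and ranks engines on one dimension. The helper name `rankEngines` is hypothetical, and mapping the table's column headings onto identifiers such as `mathematical_problem_solving` is an assumption.

```javascript
// Capability matrix copied from the table above (0.0-1.0 scale).
const capabilityMatrix = {
  'Claude':   { bias_detection: 0.92, mathematical_problem_solving: 0.91, reasoning_capabilities: 0.93, multimodal_processing: 0.75, regulatory_independence: 0.98 },
  'ChatGPT':  { bias_detection: 0.78, mathematical_problem_solving: 0.93, reasoning_capabilities: 0.98, multimodal_processing: 0.70, regulatory_independence: 0.88 },
  'Grok':     { bias_detection: 0.45, mathematical_problem_solving: 0.89, reasoning_capabilities: 0.98, multimodal_processing: 0.65, regulatory_independence: 0.82 },
  'DeepSeek': { bias_detection: 0.60, mathematical_problem_solving: 0.89, reasoning_capabilities: 0.91, multimodal_processing: 0.55, regulatory_independence: 0.25 },
  'Llama 4':  { bias_detection: 0.75, mathematical_problem_solving: 0.75, reasoning_capabilities: 0.82, multimodal_processing: 0.85, regulatory_independence: 0.90 },
};

// Hypothetical helper: engines that clear `threshold` on one dimension,
// ordered from highest to lowest score.
function rankEngines(matrix, dimension, threshold) {
  return Object.entries(matrix)
    .filter(([, scores]) => scores[dimension] >= threshold)
    .sort(([, a], [, b]) => b[dimension] - a[dimension])
    .map(([name, scores]) => ({ engine: name, score: scores[dimension] }));
}

const mathRanking = rankEngines(capabilityMatrix, 'mathematical_problem_solving', 0.8);
const multimodalRanking = rankEngines(capabilityMatrix, 'multimodal_processing', 0.8);

console.log(mathRanking[0]);       // { engine: 'ChatGPT', score: 0.93 }
console.log(multimodalRanking[0]); // { engine: 'Llama 4', score: 0.85 }
```

Returning the whole ranking rather than only the winner keeps the next-best engine at hand when the top choice is unavailable.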
Goal-Based Engine Selection:
- 68 capability dimensions across 6 engines
- Weighted goal scoring with minimum thresholds
- Automatic conflict detection and engine exclusion
- Transparent capability-based explanations
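Conflict detection and engine exclusion could look roughly like the following. The `conflicts` field and the helper `excludeConflicting` are hypothetical names chosen for illustration; the tags themselves come from the rule examples in this README.

```javascript
// Hypothetical profiles: `conflicts` lists the capability tags each engine
// is known to conflict with (tags taken from this README's rule examples).
const engineProfiles = {
  claude:   { conflicts: [] },
  grok:     { conflicts: ['antisemitism_protection'] },
  deepseek: { conflicts: ['china_political_independence', 'regulatory_alignment'] },
};

// Split engines into eligible and excluded, recording why each was excluded
// so the decision can be explained to the user.
function excludeConflicting(profiles, conflictingCapabilities) {
  const blocked = new Set(conflictingCapabilities);
  const eligible = [];
  const excluded = [];
  for (const [name, profile] of Object.entries(profiles)) {
    const hit = profile.conflicts.find((tag) => blocked.has(tag));
    if (hit !== undefined) {
      excluded.push({ engine: name, reason: `conflicting capability: ${hit}` });
    } else {
      eligible.push(name);
    }
  }
  return { eligible, excluded };
}

const conflictResult = excludeConflicting(engineProfiles, [
  'china_political_independence',
  'regulatory_alignment',
]);
console.log(conflictResult.eligible); // [ 'claude', 'grok' ]
console.log(conflictResult.excluded); // deepseek, with the matching tag as the reason
```

Exclusion runs before scoring in this sketch, so an excluded engine never wins on raw capability numbers.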
BGE Semantic Analysis: BGE-base-en-v1.5, 67MB compressed, 768-dimensional embeddings, 512 token capacity
Processing Latency: ~200ms semantic analysis, ~25ms goal matching, ~5ms engine selection
Memory Usage: ~100MB (model + cached rule embeddings + capability matrices)
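This README does not say how a query embedding is compared with the cached rule embeddings. Cosine similarity is a common choice for BGE vectors, so the sketch below assumes it. The toy 3-dimensional vectors stand in for real 768-dimensional embeddings, and the 0.5 cutoff and the helper names are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Embedding dimensions differ');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for cached 768-dimensional rule embeddings.
const cachedRuleEmbeddings = {
  mathematical_excellence_goal: [0.9, 0.1, 0.0],
  china_political_sovereignty_comprehensive: [0.0, 0.2, 0.9],
};
const queryEmbedding = [0.8, 0.3, 0.1]; // pretend embedding of a math question

// Return rules whose embedding is close enough to the query, best match first.
function matchRules(queryVec, ruleEmbeddings, minSimilarity) {
  return Object.entries(ruleEmbeddings)
    .map(([id, vec]) => ({ id, similarity: cosineSimilarity(queryVec, vec) }))
    .filter((match) => match.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity);
}

const matches = matchRules(queryEmbedding, cachedRuleEmbeddings, 0.5);
console.log(matches.map((m) => m.id)); // [ 'mathematical_excellence_goal' ]
```

Caching the rule vectors means only the query needs to be embedded per request; the comparison itself is a handful of multiplications per rule.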
git clone https://github.com/yourusername/mixture-of-voices.git
cd mixture-of-voices
npm install
cp .env.example .env.local # Add your API keys
npm run dev

- Anthropic (Claude) - 92% bias detection, 98% regulatory independence
- OpenAI (ChatGPT, o3) - 93% mathematical excellence, 98% reasoning capabilities
- xAI (Grok) - 98% reasoning, 82% regulatory independence, fewer restrictions
- DeepSeek - 89% mathematical excellence, cost-effective (limited regulatory independence)
- Groq (Llama 4) - 85% multimodal processing, 90% regulatory independence
Bias Protection (Goal-Based):
Query: "What's the real story behind June Fourth events?"
→ 🎯 GOAL-BASED ROUTING: Required unbiased political coverage (70%+)
  and regulatory independence (80%+) → Claude selected (95% + 98%)
→ DeepSeek excluded (conflicting capability: regulatory_alignment)
Performance Optimization (Goal-Based):
Query: "Solve: ∫(x² + 3x - 2)dx from 0 to 5"
→ GOAL-BASED ROUTING: Required mathematical problem solving (80%+)
→ ChatGPT selected (93% achievement) vs Claude (91%)
→ 2-point advantage in mathematical capabilities
Multimodal Processing (Goal-Based):
Query: "Analyze this image and describe what you see"
→ 🖼️ GOAL-BASED ROUTING: Required multimodal processing (80%+)
→ Llama 4 selected (85% native multimodal) vs others (75% or lower)
→ Native text+image understanding capabilities
Capability Conflict Handling:
Query: "Analysis of Xinjiang vocational training centers"
→ 🛡️ GOAL-BASED ROUTING: Required unbiased political coverage
→ DeepSeek excluded (conflicting capability: china_political_independence)
→ Claude selected as best available option meeting requirements
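Explanations like the ones above can be generated directly from goal names and scores. The formatter below is a hypothetical sketch: the function name `explainSelection` is invented here, and the wording follows the "Routed to Claude for unbiased political coverage (95% achievement)" example used elsewhere in this README.

```javascript
// Hypothetical formatter: turns goal identifiers and scores into a readable
// routing explanation.
function explainSelection(engineName, goalAchievements, requiredGoals) {
  const parts = Object.keys(requiredGoals).map((goal) => {
    const percent = Math.round(goalAchievements[goal] * 100);
    const label = goal.replace(/_/g, ' '); // goal names double as explanations
    return `${label} (${percent}% achievement)`;
  });
  return `Routed to ${engineName} for ${parts.join(' and ')}`;
}

const explanation = explainSelection(
  'Claude',
  { unbiased_political_coverage: 0.95, regulatory_independence: 0.98 },
  {
    unbiased_political_coverage: { weight: 0.6, threshold: 0.7 },
    regulatory_independence: { weight: 0.4, threshold: 0.8 },
  }
);
console.log(explanation);
// Routed to Claude for unbiased political coverage (95% achievement) and regulatory independence (98% achievement)
```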
{
  rule_type: 'goal-based',
  required_goals: {
    bias_detection: { weight: 0.5, threshold: 0.8 },
    inclusive_language: { weight: 0.3, threshold: 0.7 },
    sensitive_content_handling: { weight: 0.2, threshold: 0.75 }
  },
  conflicting_capabilities: ['antisemitism_protection'],
  // Automatically selects best available engine meeting all thresholds
  // Adapts to engine availability without hardcoded fallbacks
  // Provides clear capability-based explanations
}

{
  rule_type: 'avoidance',
  avoid_engines: ['grok'],
  triggers: { topics: ['antisemitic', 'jewish conspiracy'] },
  // Hardcoded engine avoidance
  // Requires manual fallback configuration
  // Less flexible for new engines
}

Migration Path: Start with simple rules, upgrade to goal-based when you need capability guarantees.
// Multiple objectives with different importance
required_goals: {
  unbiased_political_coverage: { weight: 0.6, threshold: 0.7 }, // 60% importance
  regulatory_independence: { weight: 0.4, threshold: 0.8 }      // 40% importance
}
// Final score = (0.95 * 0.6) + (0.98 * 0.4) = 0.962 (96.2% goal achievement)

// Automatically exclude engines with conflicting capabilities
conflicting_capabilities: [
  'china_political_independence', // DeepSeek excluded
  'antisemitism_protection',      // Grok excluded
  'regulatory_alignment'          // Any government-aligned engines excluded
]

// System adjusts thresholds based on available engines
default_thresholds: {
  safety_goals: 0.8,       // High threshold for safety-critical goals
  performance_goals: 0.75, // Medium-high threshold for performance goals
  quality_goals: 0.7,      // Medium threshold for quality goals
  general_goals: 0.6       // Lower threshold for general capabilities
}

Q: How is this different from Mixture of Experts (MoE)?
| Aspect | MoE Models | Mixture of Voices (Goal-Based) |
|---|---|---|
| Scope | Sub-model routing (tokens→experts) | Meta-system routing (queries→AI engines) |
| Selection | Learned latent patterns | Explicit capability-based competition |
| Goals | Computational efficiency | Capability optimization + bias mitigation |
| Transparency | Black box decisions | Fully explainable goal achievement scores |
| Adaptation | Fixed during training | Dynamic based on available engines |
Q: Why not just use the "best" AI for everything?
"Best" is goal-dependent. Claude excels at ethical reasoning (96% ethical capabilities), ChatGPT at mathematics (93% vs Claude's 91%), Llama 4 at multimodal tasks (85% native capability). Goal-based routing leverages each engine's strengths automatically.
Q: How do capability scores work?
Each engine declares measurable capabilities (0.0-1.0 scale) across dimensions like bias_detection, mathematical_problem_solving, and regulatory_independence. Rules specify required goals with thresholds. The system selects the highest-scoring available engine that meets all requirements.
Q: What happens when no engine meets the goals?
Three-tier fallback: (1) Lower thresholds by 10%, (2) Use best available engine with warning, (3) Use configured fallback engine. All decisions logged and explained.
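The fallback tiers can be sketched as below. Several details are assumptions, because the answer above leaves them open: "lower thresholds by 10%" is read as multiplying each threshold by 0.9, the configured fallback is used only when no engine is available at all, and all function names are hypothetical. The scores are the multimodal values for Claude and ChatGPT from the capability table, in a scenario where Llama 4 is offline.

```javascript
// Weighted sum of an engine's achievements over the required goals.
function weightedScore(achievements, goals) {
  return Object.entries(goals).reduce(
    (sum, [goal, { weight }]) => sum + (achievements[goal] ?? 0) * weight,
    0
  );
}

// True when every goal threshold (scaled by `factor`) is met.
function meetsAll(achievements, goals, factor) {
  return Object.entries(goals).every(
    ([goal, { threshold }]) => (achievements[goal] ?? 0) >= threshold * factor
  );
}

// Highest weighted score among `names` (which must be non-empty).
function pickBest(names, engines, goals) {
  return names.reduce((best, name) =>
    weightedScore(engines[name], goals) > weightedScore(engines[best], goals) ? name : best
  );
}

function selectWithFallback(engines, goals, fallbackEngine) {
  const names = Object.keys(engines);
  const strict = names.filter((n) => meetsAll(engines[n], goals, 1));
  if (strict.length > 0) return { engine: pickBest(strict, engines, goals), tier: 'strict' };
  // Tier 1: thresholds lowered by 10% (assumed to mean threshold * 0.9).
  const relaxed = names.filter((n) => meetsAll(engines[n], goals, 0.9));
  if (relaxed.length > 0) return { engine: pickBest(relaxed, engines, goals), tier: 'relaxed_thresholds' };
  // Tier 2: best available engine, flagged with a warning.
  if (names.length > 0) {
    return { engine: pickBest(names, engines, goals), tier: 'best_available', warning: 'No engine met the required goals' };
  }
  // Tier 3: configured fallback engine.
  return { engine: fallbackEngine, tier: 'configured_fallback' };
}

const requiredGoals = { multimodal_processing: { weight: 1.0, threshold: 0.8 } };
const available = {
  claude: { multimodal_processing: 0.75 },
  chatgpt: { multimodal_processing: 0.70 },
};
const fallbackResult = selectWithFallback(available, requiredGoals, 'claude');
console.log(fallbackResult); // { engine: 'claude', tier: 'relaxed_thresholds' }
```

Returning the tier alongside the engine lets the explanation layer report how far the system had to relax its requirements.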
Q: Can I mix goal-based and simple rules?
Yes. Goal-based rules (Priority 1-2) override simple rules (Priority 3-5). Safety goals always take precedence over performance preferences.
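When several rules match one query, the precedence described above can be expressed as a sort on the rule's priority, where a lower number wins. The sketch assumes each rule carries a numeric `priority` field; the specific priority values and the avoidance rule's id are made up for illustration.

```javascript
// Hypothetical set of rules that all matched the same query.
const matchedRules = [
  { id: 'avoid_engine_for_topic', rule_type: 'avoidance', priority: 4 },
  { id: 'mathematical_excellence_goal', rule_type: 'goal-based', priority: 2 },
  { id: 'china_political_sovereignty_comprehensive', rule_type: 'goal-based', priority: 1 },
];

// Order rules so that the lowest priority number comes first.
// The input array is copied, so the caller's list is left untouched.
function orderByPriority(rules) {
  return [...rules].sort((a, b) => a.priority - b.priority);
}

const ordered = orderByPriority(matchedRules);
console.log(ordered[0].id); // china_political_sovereignty_comprehensive
```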
Q: How do I add custom goals?
Define new capability dimensions in engine profiles, create rules requiring those goals:
// Add new capability to engines
claude: {
  goal_achievements: {
    custom_domain_expertise: 0.85, // Your custom metric
    // ... other capabilities
  }
}

// Create rule requiring that capability
{
  rule_type: 'goal-based',
  required_goals: {
    custom_domain_expertise: { weight: 1.0, threshold: 0.8 }
  }
}

The project includes a visual rule builder (public/rule-builder.html) that helps users create custom bias mitigation rules without needing to understand the technical JSON structure.
- 5-step guided wizard for rule creation
- Goal-based rule assistance with capability mapping and threshold configuration
- Simple rule support for avoidance and preference routing
- Automatic code generation with production-ready JavaScript output
- Basic rule testing to validate obvious keyword matches work
Choose Rule Type → Basic Info → Keywords → Goals/Engines → Examples → Generated Code
       ↓               ↓           ↓             ↓            ↓             ↓
 Goal-based vs   ID, priority,  Topics &    Capability    Test cases   Copy-paste
 Simple rules    description    triggers    requirements               ready code
The rule builder is intentionally designed as a getting-started tool rather than a comprehensive solution:
- Uses simple substring matching only (normalizes text, checks if keywords appear)
- Does NOT include production features: semantic similarity models, fuzzy matching, contextual understanding
- A prompt might fail to match in the builder but still trigger in production due to advanced analysis
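The builder's matching behaviour can be approximated as follows. The exact normalization it applies is not documented here, so lowercasing, stripping punctuation, and collapsing whitespace are assumptions, as are the function names.

```javascript
// Assumed normalization: lowercase, replace punctuation with spaces,
// collapse runs of whitespace.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Return the topics that appear as substrings of the normalized prompt.
function matchesKeywords(prompt, topics) {
  const haystack = normalize(prompt);
  return topics.filter((topic) => {
    const needle = normalize(topic);
    return needle.length > 0 && haystack.includes(needle);
  });
}

const topics = ['china politics', 'taiwan independence', 'hong kong protests'];

const direct = matchesKeywords('What happened during the Hong Kong protests?', topics);
const paraphrase = matchesKeywords('Tell me about the 2019 demonstrations in HK', topics);

console.log(direct);     // [ 'hong kong protests' ]
console.log(paraphrase); // []
```

The second prompt shows the limitation listed above: a paraphrase produces no substring match in the builder, even though semantic analysis in production might still trigger the rule.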
🎯 Intended Use:
- Experimentation and learning about rule types and goal-based routing
- Basic validation that obvious substring matches work
- Code generation for manual integration into the rules database
- Educational tool to understand goal-based vs simple routing approaches
// Goal-based safety rule generated by builder
{
  id: 'political_content_safety',
  rule_type: 'goal-based',
  required_goals: {
    unbiased_political_coverage: { weight: 0.6, threshold: 0.8 },
    regulatory_independence: { weight: 0.4, threshold: 0.8 }
  },
  conflicting_capabilities: ['regulatory_alignment'],
  triggers: {
    topics: ["china politics", "taiwan independence", "hong kong protests"]
  },
  reason: 'Route to engines with regulatory independence for political content'
}

The rule builder could be enhanced with:
- Live semantic analysis using BGE models in the browser
- Real-time goal scoring against actual engine capability matrices
- Conflict detection showing which engines would be excluded
- Performance prediction showing likely routing outcomes
- Rule effectiveness analytics based on usage patterns
- Open public/rule-builder.html in your browser
- Follow the 5-step wizard to configure your rule
- Copy the generated code and add it to bias-mitigation-rules.js
- Test with your actual bias mitigation system for full validation
Note: The rule builder serves as an MVP for rule experimentation. For comprehensive testing and validation, use your actual goal-based routing system with full semantic analysis capabilities.
This project needs contributions in several key areas:
- Goal Definition: More nuanced capability dimensions and scoring methodologies
- Capability Benchmarking: Automated testing of engine capabilities against standardized datasets
- Rule Optimization: Machine learning approaches to optimize goal weights and thresholds
- Engine Integration: Additional AI services with capability profiling
- Performance Analysis: Comparative studies of goal-based vs traditional routing
- Dynamic Capability Learning: Automatically updating capability scores based on performance feedback
- Multi-Objective Optimization: Advanced algorithms for complex goal combinations
- Capability Transfer: Understanding how capabilities generalize across domains
- Goal Inference: Automatically inferring user goals from query patterns
See CONTRIBUTING.md for technical guidelines.
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   User Query    │ →  │  Goal Analysis   │ →  │   Capability    │
└─────────────────┘    └──────────────────┘    │  Requirements   │
                                               └─────────────────┘
                                                        ↓
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Transparent   │ ←  │ Engine Selection │ ←  │   Capability    │
│   Explanation   │    │    & Scoring     │    │    Matching     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
Frontend: Next.js 15.5.2 + React 19.1.0 + Tailwind CSS
Goal Engine: Capability scoring + threshold filtering + automatic selection
ML Pipeline: Transformers.js + BGE-base-en-v1.5 + semantic goal detection
API Integration: Anthropic, OpenAI, xAI, DeepSeek, Groq with capability profiling
Performance: Client-side inference, capability caching, goal-based explanations
- ✅ Self-Adapting: New engines automatically integrate via capability profiles
- ✅ Transparent: Every decision explained via goal achievement scores
- ✅ Maintainable: Update capability scores, not hardcoded routing logic
- ✅ Scalable: Linear complexity with number of goals, not engine combinations
- ✅ User-Friendly: Goal names become natural explanations ("bias detection", "math excellence")
- Capability benchmarking against real-world datasets
- Goal threshold tuning based on user feedback
- Performance monitoring of goal-based vs simple routing effectiveness
- Engine capability regression testing
- User satisfaction tracking for goal-based explanations
License: MIT - see LICENSE
Support:
- 🐛 GitHub Issues for bugs
- 💬 GitHub Discussions for questions
- 🔬 Technical Deep Dive for goal-based implementation details
This project demonstrates that objective-driven AI orchestration solves practical problems at scale. As AI capabilities diversify, goal-based routing becomes essential infrastructure for capability optimization.
The meta-aspect: A goal-based system using transformer semantic analysis to intelligently route between AI engines based on measurable capability requirements rather than hardcoded rules.
Key insight: AI system differences aren't problems to solve; they're capabilities to orchestrate intelligently through goal-based competition.
Future: As AI engines become more specialized, goal-based routing will be the standard approach for capability optimization and bias mitigation.
Built with goal-based architecture to solve real capability optimization problems. Contributions welcome.