This file is loaded automatically at the start of every conversation. All rules here are mandatory when generating dashboards, charts, tables, or any metric output from MCP data.
These are the single source of truth. Never deviate without explicit user instruction.
| Name | Formula | When to use | UI label |
|---|---|---|---|
| Session CVR | conversions / sessions |
Cross-channel funnel analysis; GA4 data | CVR (session) |
| Click CVR | conversions / clicks |
Within-platform campaign comparison (Google Ads, Meta Ads) | CVR (click) |
Rule: Never mix session CVR and click CVR on the same chart or table without explicit labelling of each. Session CVR ≈ 2–4% for e-commerce. Click CVR ≈ 5–20%. Values are not comparable across definitions.
| Source | Formula | Notes |
|---|---|---|
| Meta Ads (platform) | Use roas field directly from MCP response |
Platform-reported; last-click, 7-day window |
| Google Ads (estimated) | (conversions × AOV) / spend where AOV = $100 |
Google MCP does not return revenue; use $100 AOV consistently |
| Blended ROAS | total_attributed_revenue / total_spend |
Linear attribution; must label as "Linear · 90d" |
Rule: Never use a hardcoded spend multiplier (e.g., spend × 3.5 or spend × 0.8) to estimate ROAS.
Always use conversions × $100 for Google. Always use platform roas field for Meta.
clicks / impressions — always expressed as a percentage.
| Stage | Definition | Source |
|---|---|---|
| Sessions | GA4 session events | GA4 MCP |
| Conversions | GA4 conversion events | GA4 MCP |
| Deals Won | CRM closed-won deals | HubSpot / Salesforce MCP |
| CRM Contacts | All contacts in CRM (all-time) | HubSpot MCP |
Critical rule: CRM Contacts is an all-time, lifetime count. It must never appear as a funnel step after GA4 Conversions. A funnel step must always be ≤ the step above it. If CRM Contacts > GA4 Conversions, display CRM Contacts as a separate KPI card with label (HubSpot all-time), not as a funnel stage.
closed_won_deals / (closed_won_deals + closed_lost_deals)
| Metric | Window | Model | Label to show in UI |
|---|---|---|---|
| Blended ROAS | 90-day query period | Linear | Linear attribution · 90d |
| Channel ROAS | 90-day query period | Linear | Linear attribution · 90d |
| Platform ROAS (Meta) | 7-day click, 1-day view | Last-click | Meta platform · 7-day window |
| Platform ROAS (Google) | 30-day (default) | Last-click | Google est. · AOV $100 |
| GA4 CVR | Query period | Session-level | Session CVR |
| Pipeline / Closed Won (CRM) | All-time by default | N/A | Always label CRM all-time |
Rule: Every ROAS and CVR KPI card must display its attribution window in the subtitle. Unlabelled ROAS is not acceptable.
| Label | Definition |
|---|---|
Attributed Revenue (90d) |
GA4-tracked orders within the query period, attributed via linear model across paid channels |
Salesforce Closed Won (90d) |
Revenue from SF opportunities closed in the 90-day period |
CRM Closed Won (all-time) |
Lifetime closed-won revenue across HubSpot + Salesforce combined |
When a dashboard shows both a pipeline KPI and a revenue attribution chart, explicitly annotate which scope each uses.
Attribution shares across channels must always sum to exactly 100%. Before rendering any pie/donut chart of channel attribution:
- Sum all values
- If sum ≠ 100, normalise:
value = (value / sum) × 100 - Round to 1 decimal place
A valid funnel requires each step ≤ the step above it:
- Sessions > Engaged Sessions > Conversions > Deals Won
If any data violates this (e.g., a CRM count > GA4 conversions), do not include it in the funnel. Instead:
- Display it as a separate KPI card
- Label its scope clearly (e.g., "all-time", "different source")
Every generated dashboard must include a data freshness indicator showing the query period.
Format: [Start date] – [End date] · Data as of [today]
Default query period: last 90 days.
| Platform | MCP field | Notes |
|---|---|---|
| Google Ads | cost |
In USD |
| Meta Ads | spend |
In USD |
Use cost for Google, spend for Meta. Never swap them.
Average Order Value (AOV) = $100 for all Google ROAS estimates.
This is derived from Meta platform data: average revenue per conversion across Meta campaigns ≈ $99–$101. This assumption must be used consistently. If the user provides a different AOV, update it everywhere in the same response.
The dataset grows daily via daily_synthetic_append.py. The anchor date
auto-detects MAX(date) from the database so all windows always cover
the latest appended data.
| Variable | Resolved at runtime | Notes |
|---|---|---|
anchor |
MAX(date) from DB |
Latest appended day |
window_start |
anchor - 89 days |
90-day window start |
window_end |
anchor |
90-day window end |
Rule for synthetic data: Read window bounds from golden_metrics.json
_meta.window_start / _meta.window_end — never hardcode dates. These are
regenerated daily and always reflect the latest data.
How to identify synthetic data: The dataset is in use when sourcing from
data/olist_analytics.duckdb or data/mock_marketing/*.csv.
golden_metrics.json (schema v2.2) contains these sections (dates roll forward with each daily refresh):
| JSON key | Window | Label |
|---|---|---|
windowed_7d |
anchor-6d → anchor | Last 7 days |
windowed_30d |
anchor-29d → anchor | Last 30 days |
windowed_60d |
anchor-59d → anchor | Last 60 days |
windowed_90d |
anchor-89d → anchor | Canonical 90-day window |
windowed_180d |
anchor-179d → anchor | Last 180 days |
month_YYYY_MM |
full calendar month | Last 3 months relative to anchor |
all_time |
dataset_start → anchor | Full dataset range |
The scheduled refresh (scheduled-refresh.yml) regenerates all windows daily at 06:00 UTC.
For windows NOT listed above (e.g., last 45 days, Q1 2026, a custom range),
use scripts/query_window.py. It queries the CSV source files directly via
DuckDB and returns the same schema as a golden_metrics.json section.
python scripts/query_window.py --last-days 45
python scripts/query_window.py --last-days 1 # yesterday (anchor-relative)
python scripts/query_window.py --month 2025-11 # November 2025
python scripts/query_window.py --year 2025 # full year
python scripts/query_window.py --start 2025-10-01 --end 2026-03-15
python scripts/query_window.py --last-days 60 --output metrics_60d.jsonPython API:
from datetime import date
from scripts.query_window import query_window
section = query_window(date(2026, 1, 1), date(2026, 3, 15))
print(section["blended_roas"]) # same schema as golden_metrics.jsonLabelling rule: Any dashboard built from query_window.py output must display:
⚡ Ad-hoc query · not from golden layer
This distinguishes it from zero-drift golden layer dashboards.
CRM data is always all-time regardless of the query window — it is a
lifetime count (see §1 and §5). query_window.py labels this with _note.
This project is open source. If the user connects a real marketing data source (live MCP, BigQuery, Supabase, Snowflake, or any daily-refreshed dataset), use the rolling window instead:
end_date: today (date.today())start_date: 90 days before today (today - 90 days)
How to identify live data: User says "connect my real data", "use BigQuery", "use production", or configures a non-DuckDB dbt target.
Always pass explicit dates to MCP tools. Never use open-ended queries.
When generating any HTML dashboard or React artifact from MCP data:
- Always read the data first, then compute metrics. Never hardcode values that should come from MCP.
- Label every metric with its formula basis (session vs click CVR, attribution window on ROAS).
- Do not use synthetic multipliers to estimate revenue for Google Ads. Always use
conversions × $100. - Validate funnel stage ordering before rendering. Fix any stage-ordering violations.
- Normalise attribution percentages to 100% before rendering pie/donut charts.
- Separate CRM lifetime data from 90-day attributed data — use different section headings or KPI cards.
- Include a data freshness badge in the dashboard header.
Each skill has two modes: golden (default, reads golden_metrics.json) and
-mcp (live, queries mock MCP servers directly). The golden mode guarantees
zero metric drift; the -mcp mode shows raw real-time platform data.
| Skill | Data source | Primary metric |
|---|---|---|
/marketing |
windowed_90d + all_time sections |
Blended ROAS (linear · 90d) |
/campaign |
windowed_90d.campaigns |
Platform ROAS per campaign |
/attribution |
windowed_90d.attribution_by_channel |
Channel revenue share (linear · 90d) |
/traffic |
windowed_90d.ga4_by_channel |
Session CVR by channel |
/pipeline |
all_time.crm |
Win rate, pipeline value (CRM all-time) |
Triggered by appending -mcp to any skill name or by user requesting "live", "real-time", or "raw platform" data.
| Skill | MCP servers queried | Notes |
|---|---|---|
/marketing-mcp |
google-ads, meta-ads, ga4, hubspot, salesforce | Add ⚡ Live MCP badge |
/campaign-mcp |
google-ads, meta-ads | Per-campaign raw platform data |
/attribution-mcp |
ga4, google-ads, meta-ads | Raw channel attribution |
/traffic-mcp |
ga4 | Live session data |
/pipeline-mcp |
hubspot, salesforce | Live CRM data |
Rule for MCP skills: Pass dates from golden_metrics.json _meta.window_start /
_meta.window_end for synthetic data (these update daily with the appended data).
Use rolling today() - 90 for live datasets (see §9).
Canonical metric definitions live in:
dbt_project/models/metrics/metrics.yml
When in doubt about a metric formula, read that file. It is the single source of truth for computed metrics.
Key metrics:
session_conversion_rate=total_orders / total_sessions← canonical cross-channel CVRblended_roas=total_revenue / total_spend(linear attribution, 90d)channel_roas=channel_revenue / channel_spend(linear attribution, 90d)
dashboards/js/metrics.js is the mandatory runtime enforcement of all metric rules.
All new HTML dashboards MUST load it before any inline script block:
<script src="js/metrics.js"></script>NEVER compute CVR, ROAS, or attribution percentages inline. Use canonical functions:
| Function | Formula | When to use |
|---|---|---|
Metrics.googleROAS(conv, cost) |
conv × $100 / cost |
All Google Ads ROAS |
Metrics.metaROAS(platformRoas) |
pass-through | Meta platform roas field |
Metrics.sessionCVR(conv, sessions) |
conv / sessions × 100 |
Cross-channel, GA4 |
Metrics.clickCVR(conv, clicks) |
conv / clicks × 100 |
Platform campaign tables |
Metrics.normaliseAttribution(channels) |
normalises val to sum 100% |
Any pie/donut/attribution bar |
Metrics.validateFunnel(steps) |
returns issues[] |
Before rendering any funnel |
Use Metrics.labels.* for canonical KPI subtitle strings (attribution windows, CVR basis).
Existing dashboards that already use metrics.js:
attribution_dashboard.htmlmarketing_full_funnel_dashboard.htmlfull_funnel_marketing_dashboard.html
This project's purpose is zero metric drift between dbt and dashboards. The following rules are mandatory.
- Before generating any dashboard, read
dashboards/golden_metrics.json. - This file is pre-computed from the dbt golden layer (DuckDB mart tables).
- Copy exact values from this file into HTML dashboard JS constants. Do NOT recalculate metrics independently.
- Check
_meta.available_windowsto see which windows are pre-computed. - Use
windowed_90dfor the canonical 90-day view. - Use
all_timeonly when explicitly showing lifetime data. - For any other window (30d, monthly, custom range) — see §9
query_window.py.
Why: If you recalculate metrics from MCP data, rounding differences and non-deterministic AI arithmetic will cause drift versus the golden layer. Reading pre-computed values guarantees bit-for-bit reproducibility.
Only bypass golden_metrics.json when the user explicitly says one of:
- "query live data" / "get real-time data" / "use MCP"
- "show raw platform data" / "bypass golden layer" / "live from MCP"
When in Live MCP mode:
- Pass dates from
golden_metrics.json_meta.window_start/_meta.window_end - Add to the dashboard header badge:
⚡ Live MCP — may differ from golden layer - Still use
metrics.jscanonical functions for all calculations - For HubSpot/Salesforce, pass the same dates to
get_deal_pipeline_summary()andget_opportunity_pipeline()— these servers now support date filtering.
- Never divide all-time revenue by 90-day spend (this was the source of the nonsensical 71.1× Blended ROAS bug)
- If using
windowed_90d: ALL numbers must come from that section only - If using
all_time: ALL numbers must come from that section only - Label every ROAS and CVR with its scope per §2 Attribution Windows rules
User asks for data or a dashboard
│
├─ Skill ends in "-mcp" OR user says "live", "real-time", "raw platform"?
│ ├─ YES → Use MCP servers
│ │ · Synthetic: dates from golden_metrics.json _meta.window_start/end
│ │ · Live/real: dates = today()-90d → today()
│ │ · Add ⚡ Live MCP badge
│ │ · Use metrics.js for all calculations
│ └─ NO → Use golden layer (default)
│
└─ What date window does the user want?
│
├─ 90d (default), 7d, 30d, 60d, 180d, or a recent calendar month?
│ └─ Read dashboards/golden_metrics.json
│ · Check _meta.available_windows for the exact key
│ · Copy exact values — no recalculation
│ · Use all_time only for lifetime/CRM views
│
├─ Any other window (45d, last quarter, Q1, custom range, etc.)?
│ └─ Run python scripts/query_window.py
│ · --last-days N | --month YYYY-MM | --year YYYY
│ · --start YYYY-MM-DD --end YYYY-MM-DD
│ · OR: from scripts.query_window import query_window
│ · Label dashboard: ⚡ Ad-hoc query · not from golden layer
│ · CRM data is still all-time in the result
│
└─ Does golden_metrics.json exist?
├─ YES → Use it
└─ NO → Tell user: python scripts/generate_golden_metrics.py