Commit e2a3f2a

Implement comprehensive safety controls for AI agents
- Added a human approval policy to ensure high-risk actions are authorized by a human before execution.
- Introduced a model routing policy to govern which AI models can process sensitive tasks, preventing unapproved models from handling critical workloads.
- Implemented PII leakage prevention to scan agent outputs for personally identifiable information (PII) before delivery.
- Established prompt injection prevention controls to detect and mitigate potential injection attacks in user inputs.
- Defined tool permission controls to manage which tools an agent may call, using allowed and denied lists plus a restricted list whose tools require human approval.
- Developed tests for each policy to validate functionality and compliance with safety standards.
1 parent 875bc9c commit e2a3f2a

32 files changed

Lines changed: 3324 additions & 1378 deletions

.regal/config.yaml

Lines changed: 19 additions & 2 deletions
@@ -2,10 +2,27 @@ rules:
22
idiomatic:
33
# Flat directory layout under policies/rego/ is intentional — all policy packs
44
# in one location for direct opa eval usage. Package names use dot-notation
5-
# (agt_policies_nigeria.*, agt_policies_africa.*) not directory hierarchy.
5+
# (agt_policies_nigeria.*, agt_policies_africa.*, agt_policies_agent.*) not directory hierarchy.
66
directory-package-mismatch:
77
level: ignore
88
# Policy packs are libraries, not executables — no single entrypoint by design.
9-
# Callers query data.agt_policies_nigeria.<pack>.decision directly.
9+
# Callers query data.agt_policies_*.*.decision directly.
1010
no-defined-entrypoint:
1111
level: ignore
12+
imports:
13+
# data.config.* paths are runtime deployer configuration injected at evaluation
14+
# time — not bundled policy data. Regal cannot resolve them statically by design.
15+
# Callers pass config via: opa eval -d data.json or OPA bundle data layer.
16+
unresolved-reference:
17+
level: ignore
18+
style:
19+
# Regulatory compliance error messages include citation codes, regulation names,
20+
# and section references that are necessarily verbose (e.g. "CBN FPR/DIR/GEN/CIR/07/003").
21+
# Truncating them would reduce the utility of policy decisions for compliance officers.
22+
line-length:
23+
level: ignore
24+
# Universal policy packs use `else :=` to fall back to named constant rules
25+
# (e.g. _default_patterns). The `default` keyword cannot reference another rule,
26+
# so `else :=` is the only way to keep defaults readable and DRY.
27+
default-over-else:
28+
level: ignore
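
The comments above say that deployer configuration reaches the policies as `data.config.*` at evaluation time and that callers query a `decision` rule directly. A minimal Python sketch of that hand-off follows. It only builds the command line and does not run OPA. The file names `data.json` and `input.json` are hypothetical; the `human_approval.amount_threshold` key and the package path come from elsewhere in this commit.

```python
import json

# Deployer configuration. Passed to OPA with `-d data.json`, the top-level
# "config" key is what the policies read as data.config.*.
config_doc = {"config": {"human_approval": {"amount_threshold": 5_000_000}}}


def build_opa_command(policy_dir, data_file, input_file, query):
    """Build the argv for one `opa eval` call. Nothing is executed here."""
    return [
        "opa", "eval",
        "--format", "json",
        "-d", policy_dir,   # policy pack directory
        "-d", data_file,    # deployer configuration (data.config.*)
        "-i", input_file,   # the agent action under evaluation
        query,
    ]


command = build_opa_command(
    "policies/rego",
    "data.json",   # hypothetical file holding json.dumps(config_doc)
    "input.json",  # hypothetical file holding the agent action
    "data.agt_policies_agent.human_approval.decision",
)
print(json.dumps(config_doc, indent=2))
print(" ".join(command))
```

Because the configuration lives in a separate data document, a static linter has no way to resolve `data.config.*`, which is the stated reason for suppressing `unresolved-reference`.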

CHANGELOG.md

Lines changed: 22 additions & 0 deletions
@@ -7,6 +7,28 @@ Versioning: [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

88
---
99

10+
## [1.2.0] — 2026-06-16
11+
12+
Universal agent safety controls — 5 new policy packs applicable to any AI agent.
13+
14+
### Added
15+
16+
- **Prompt Injection** (`agent-prompt-injection.yaml` + `.rego`) — blocks 19 known injection phrases; escalates structural markers (`[INST]`, `<|system|>`, `###System`). Configurable pattern sets via `data.config.prompt_injection.*`. 22 tests.
17+
- **PII Leakage** (`agent-pii-leakage.yaml` + `.rego`) — scans agent output for credit card numbers, BVN/NIN (11-digit), SA ID (13-digit), email, and phone before delivery. Deployer allow-list for verified disclosure flows. 21 tests.
18+
- **Tool Permissions** (`agent-tool-permissions.yaml` + `.rego`) — allow/deny/restricted-list tool governance. Default restricted set includes `delete_record`, `execute_code`, `shell_exec`, `deploy`, `grant_admin` and 10 others. 20 tests.
19+
- **Human Approval** (`agent-human-approval.yaml` + `.rego`) — four escalation triggers: explicit action names, context `risk_level`, amount threshold (default 1M, override to 5M for CBN), bulk record count (default 500). 21 tests.
20+
- **Model Routing Controls** (`agent-model-routing.yaml` + `.rego`) — prevents sensitive tasks (`pii_processing`, `financial_decision`, `fraud_detection`, `kyc_review`, `aml_screening`, 9 total) from using unapproved models. Audits approved model usage. 22 tests.
21+
- **Jurisdiction router updated**: the `universal_policies` set is always included in `applicable_policies`. NG now routes 9 packs (4 regulatory + 5 universal); KE and ZA route 6 each. Router tests updated to reflect the new counts.
22+
- **Regal config updated**: `line-length`, `default-over-else`, and `unresolved-reference` (for `data.config` paths) suppressed with explanatory comments.
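
The router counts in the entry above (9 packs for NG, 6 for KE and ZA) follow from a set union. The sketch below mirrors that arithmetic in Python; it is not the Rego router itself. The short pack names are assumptions derived from the package names listed in the README, and treating an unlisted country as "universal packs only" is also an assumption.

```python
# Universal packs apply to every agent action.
UNIVERSAL_POLICIES = frozenset({
    "prompt_injection", "pii_leakage", "tool_permissions",
    "human_approval", "model_routing",
})

# Regulatory packs per customer country (hypothetical short names).
REGULATORY_POLICIES = {
    "NG": frozenset({"cbn", "bvn_nin", "ndpa", "nfiu"}),
    "KE": frozenset({"kdpa"}),
    "ZA": frozenset({"popia"}),
}


def applicable_policies(customer_country):
    """Universal packs plus whatever regulatory packs the country routes to."""
    regulatory = REGULATORY_POLICIES.get(customer_country.upper(), frozenset())
    return UNIVERSAL_POLICIES | regulatory


for country in ("NG", "KE", "ZA"):
    print(country, len(applicable_policies(country)))
```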
23+
24+
### Changed
25+
26+
- README: description updated from "African regulatory compliance" to "two-layer governance" (universal safety + regulatory)
27+
- README: Coverage section split into Universal Agent Safety Controls and African Regulatory Compliance tables
28+
- `.regal/config.yaml`: 3 additional suppression entries with documented rationale
29+
30+
---
31+
1032
## [1.1.0] — 2026-06-15
1133

1234
Framework integrations: CrewAI and Microsoft AutoGen.

README.md

Lines changed: 31 additions & 8 deletions
@@ -4,7 +4,9 @@
44

55
**Nigerian & African AI Agent Governance Policies for Microsoft's [Agent Governance Toolkit (AGT)](https://github.com/microsoft/agent-governance-toolkit)**
66

7-
A community policy pack that extends AGT with compliance coverage for African regulatory frameworks — NDPA 2023, CBN regulations, NFIU/AML rules, POS geo-fencing, BVN/NIN data protection, and POPIA (South Africa).
7+
A community policy pack that extends AGT with two governance layers:
8+
- **Universal agent safety controls** — prompt injection, PII leakage, tool permissions, human approval, model routing (apply to any AI agent regardless of jurisdiction)
9+
- **African regulatory compliance** — NDPA 2023, CBN regulations, NFIU/AML rules, BVN/NIN data protection, Kenya DPA, POPIA (jurisdiction-routed)
810

911
Two policy formats:
1012
- **YAML** (`policies/*.yaml`) — drop-in rules files, validated by the AGT linter, no new infrastructure
@@ -22,6 +24,22 @@ This repo fills that gap.
2224

2325
## Coverage
2426

27+
### Universal Agent Safety Controls (`agt_policies_agent.*`)
28+
29+
Apply to **every agent action** regardless of customer country or industry. Deployer-configurable via `data.config.*`.
30+
31+
| Policy Pack | Alignment | Key Controls |
32+
|---|---|---|
33+
| `agent-prompt-injection.yaml` / `.rego` | OWASP LLM01, NIST AI RMF | Blocks known injection phrases; escalates structural markers (`[INST]`, `<\|system\|>`) |
34+
| `agent-pii-leakage.yaml` / `.rego` | OWASP LLM06, NDPA s.25, POPIA s.19 | Scans agent OUTPUT for credit cards, BVN/NIN, SA IDs, emails, phone numbers |
35+
| `agent-tool-permissions.yaml` / `.rego` | OWASP LLM08, NIST AI RMF | Allow/deny/restrict tool calls; blocks excessive agency |
36+
| `agent-human-approval.yaml` / `.rego` | EU AI Act Art. 14, CBN Maker-Checker | Escalates high-risk actions, high amounts, bulk operations, high risk_level |
37+
| `agent-model-routing.yaml` / `.rego` | OWASP LLM05/LLM10, NIST AI RMF | Prevents sensitive tasks (PII, AML, KYC) from using unapproved models |
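
As an illustration of the output scanning described in the PII leakage row, here is a minimal Python sketch. The regular expressions are illustrative assumptions, not the patterns shipped in `agent-pii-leakage.rego`, and the `allowed` parameter is a hypothetical stand-in for the deployer allow-list.

```python
import re

# Illustrative patterns only; real deployments need stricter validation.
PII_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),  # 16 digits, optionally grouped
    "bvn_or_nin": re.compile(r"\b\d{11}\b"),                   # 11-digit identifier
    "sa_id": re.compile(r"\b\d{13}\b"),                        # 13-digit identifier
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\+\d{1,3}[ -]?\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
}


def find_pii(output_text, allowed=frozenset()):
    """Return the sorted names of PII categories found in agent output."""
    return sorted(
        name
        for name, pattern in PII_PATTERNS.items()
        if name not in allowed and pattern.search(output_text)
    )


print(find_pii("Your BVN is 22123456789."))
print(find_pii("Contact ada@example.com", allowed={"email"}))
```

Length-based patterns like the 11-digit and 13-digit checks match any number of that length, so a production policy would likely pair them with context keywords or checksum validation.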
38+
39+
### African Regulatory Compliance
40+
41+
Jurisdiction-routed: policies activate based on `customer_country` in context.
42+
2543
| Policy Pack | Regulation | Key Controls |
2644
|---|---|---|
2745
| `ndpa-data-residency.yaml` | Nigeria Data Protection Act 2023 | Cross-border transfer restrictions, sensitive data handling, data minimisation |
@@ -34,14 +52,19 @@ This repo fills that gap.
3452

3553
### OPA Rego (structured-parameter enforcement)
3654

37-
| Rego Policy | Regulation | Key Advantage over YAML |
55+
| Rego Policy | Package | Key Advantage over YAML |
3856
|---|---|---|
39-
| `policies/rego/cbn-transaction-limits.rego` | CBN NIP/KYC | Checks `input.params.amount` directly — exact numeric enforcement, not text regex |
40-
| `policies/rego/bvn-nin-protection.rego` | CBN BVN / NIMC NIN | Checks `input.params.identifier_type` and `input.params.bvn_present` in structured params |
41-
| `policies/rego/ndpa-data-residency.rego` | NDPA 2023 s.25 | Checks `input.params.destination_region` and `input.params.record_count` — unambiguous |
42-
| `policies/rego/nfiu-aml.rego` | NFIU AML/CFT (MLPPA 2022) | Exact ₦5M CTR threshold on `input.params.amount`, structuring zone (₦4.5M–₦4.99M) |
43-
| `policies/rego/kdpa-data-protection.rego` | Kenya DPA 2019 s.49 | Cross-border transfers, sensitive data, biometric blocking, ODPC accountability |
44-
| `policies/rego/popia-south-africa.rego` | POPIA (Act 4 of 2013) | `destination_country` adequacy list (POPIA s.72), SA ID 13-digit format validation |
57+
| `agent-prompt-injection.rego` | `agt_policies_agent.prompt_injection` | RE2 pattern matching on user-controlled fields; configurable pattern sets |
58+
| `agent-pii-leakage.rego` | `agt_policies_agent.pii_leakage` | Output scanning: BVN/NIN regex, card number pattern, SA ID 13-digit, email, phone |
59+
| `agent-tool-permissions.rego` | `agt_policies_agent.tool_permissions` | Allowlist/denylist/restricted-list logic; structured set operations |
60+
| `agent-human-approval.rego` | `agt_policies_agent.human_approval` | Numeric amount threshold, record count threshold, risk_level from context |
61+
| `agent-model-routing.rego` | `agt_policies_agent.model_routing` | Approved model set per sensitive task_type; banned model enforcement |
62+
| `cbn-transaction-limits.rego` | `agt_policies_nigeria.cbn` | Checks `input.params.amount` directly — exact numeric enforcement, not text regex |
63+
| `bvn-nin-protection.rego` | `agt_policies_nigeria.bvn_nin` | Checks `input.params.identifier_type` and `input.params.bvn_present` in structured params |
64+
| `ndpa-data-residency.rego` | `agt_policies_nigeria.ndpa` | Checks `input.params.destination_region` and `input.params.record_count` — unambiguous |
65+
| `nfiu-aml.rego` | `agt_policies_nigeria.nfiu` | Exact ₦5M CTR threshold on `input.params.amount`, structuring zone (₦4.5M–₦4.99M) |
66+
| `kdpa-data-protection.rego` | `agt_policies_africa.kdpa` | Cross-border transfers, sensitive data, biometric blocking, ODPC accountability |
67+
| `popia-south-africa.rego` | `agt_policies_africa.popia` | `destination_country` adequacy list (POPIA s.72), SA ID 13-digit format validation |
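
The allowlist/denylist/restricted-list logic named in the `agent-tool-permissions.rego` row can be pictured with the Python sketch below. The precedence (deny, then restricted, then allowlist) and the "empty allowlist means no allowlist restriction" behaviour are assumptions; only the five restricted tool names that the changelog spells out are included.

```python
# The five restricted tools named in the changelog; the rest are not listed there.
DEFAULT_RESTRICTED = frozenset({
    "delete_record", "execute_code", "shell_exec", "deploy", "grant_admin",
})


def tool_decision(tool, allowed=frozenset(), denied=frozenset(),
                  restricted=DEFAULT_RESTRICTED):
    """Return 'deny', 'escalate', or 'allow' for a requested tool call."""
    if tool in denied:
        return "deny"        # explicit deny wins over everything else
    if tool in restricted:
        return "escalate"    # restricted tools need human approval
    if allowed and tool not in allowed:
        return "deny"        # an allowlist is configured and the tool is not on it
    return "allow"


print(tool_decision("shell_exec"))
print(tool_decision("send_email", allowed={"search_docs"}))
```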
4568

4669
---
4770

policies/agent-human-approval.yaml

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
1+
# agt-policies-nigeria
2+
# AI Agent Human Approval Controls — Universal Safety Control
3+
#
4+
# Regulatory alignment:
5+
# OWASP Top 10 for LLM Applications: LLM09 Overreliance / LLM08 Excessive Agency
6+
# EU AI Act Art. 14 — human oversight for high-risk AI systems
7+
# NIST AI RMF — GOVERN 1.3, MANAGE 2.4
8+
# CBN Maker-Checker Requirement — dual authorisation for financial transactions
9+
# NFIU MLPPA 2022 — human review for Suspicious Transaction Reports
10+
#
11+
# ⚠️ This is a universal safety control — jurisdiction-independent.
12+
# Ensures a human authorises high-risk, high-value, or sensitive actions
13+
# before an AI agent executes them autonomously.
14+
#
15+
# Four escalation triggers:
16+
# 1. Action name is on the required_actions list
17+
# 2. context.risk_level is in the approval-required risk levels set
18+
# 3. params.amount exceeds the configured amount threshold
19+
# 4. params.record_count exceeds the configured bulk operation threshold
20+
21+
version: "1.0"
22+
name: agent-human-approval
23+
description: >
24+
Requires human authorisation before an AI agent executes high-risk,
25+
high-value, or sensitive actions. Configurable triggers: explicit action
26+
names, context risk level, transaction amount threshold, and bulk operation
27+
record count threshold.
28+
29+
config:
30+
# Runtime configuration keys (passed via OPA data.config.human_approval.*)
31+
required_actions:
32+
description: >
33+
Set of action names that always require human approval, regardless of
34+
amount or risk level. Defaults to account deletion, bulk operations,
35+
permission changes, and financial self-approval.
36+
example: ["delete_account", "close_account", "grant_admin", "deploy_code"]
37+
risk_levels:
38+
description: >
39+
Set of context.risk_level values that require human approval.
40+
Defaults to ["critical", "high"]. Callers set risk_level in context.
41+
example: ["critical", "high"]
42+
amount_threshold:
43+
description: >
44+
Numeric amount above which transactions require human approval.
45+
Units are in the local currency of the deployer. Default: 1,000,000.
46+
Nigerian deployments should set to 5,000,000 (CBN CTR threshold).
47+
example: 5000000
48+
bulk_threshold:
49+
description: >
50+
Record count above which bulk operations require human approval.
51+
Default: 500 records.
52+
example: 500
53+
54+
rules:
55+
56+
# =========================================================================
57+
# Escalate: Explicit Approval-Required Action
58+
# =========================================================================
59+
60+
- name: human-approval-required-action
61+
description: >
62+
The action is on the deployer-configured required_actions list.
63+
Must be authorised by a human before the agent executes it.
64+
action: escalate
65+
priority: 100
66+
default_required_actions:
67+
- delete_account
68+
- close_account
69+
- bulk_delete
70+
- mass_update
71+
- send_bulk_email
72+
- send_bulk_sms
73+
- deploy_code
74+
- modify_permissions
75+
- grant_admin
76+
- revoke_access
77+
- initiate_bulk_refund
78+
- approve_transfer
79+
- self_approve
80+
81+
# =========================================================================
82+
# Escalate: High Risk Level
83+
# =========================================================================
84+
85+
- name: human-approval-risk-level
86+
description: >
87+
context.risk_level is set to a value that requires human approval
88+
(default: "critical" or "high"). Callers are responsible for
89+
assigning risk levels based on action classification.
90+
action: escalate
91+
priority: 90
92+
93+
# =========================================================================
94+
# Escalate: Amount Threshold
95+
# =========================================================================
96+
97+
- name: human-approval-amount-threshold
98+
description: >
99+
Transaction amount (params.amount) exceeds the configured threshold.
100+
Default threshold is 1,000,000 units. Nigerian fintech deployments
101+
should override to 5,000,000 (CBN CTR reporting threshold).
102+
action: escalate
103+
priority: 85
104+
105+
# =========================================================================
106+
# Escalate: Bulk Operation Threshold
107+
# =========================================================================
108+
109+
- name: human-approval-bulk-threshold
110+
description: >
111+
Bulk operation record count (params.record_count) exceeds the
112+
configured threshold. Default: 500 records.
113+
action: escalate
114+
priority: 80
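
The four escalation triggers in this file can be sketched in Python as follows. This mirrors the YAML description and is not the Rego implementation. "Exceeds" is read as strictly greater than, and returning `allow` when no trigger fires is an assumption; the function and parameter names are hypothetical.

```python
DEFAULT_REQUIRED_ACTIONS = frozenset({
    "delete_account", "close_account", "bulk_delete", "mass_update",
    "send_bulk_email", "send_bulk_sms", "deploy_code", "modify_permissions",
    "grant_admin", "revoke_access", "initiate_bulk_refund",
    "approve_transfer", "self_approve",
})
DEFAULT_RISK_LEVELS = frozenset({"critical", "high"})


def approval_triggers(action, params=None, context=None, config=None):
    """Return the names of the escalation rules that fire, in priority order."""
    params, context, config = params or {}, context or {}, config or {}
    required = set(config.get("required_actions", DEFAULT_REQUIRED_ACTIONS))
    risk_levels = set(config.get("risk_levels", DEFAULT_RISK_LEVELS))
    amount_threshold = config.get("amount_threshold", 1_000_000)
    bulk_threshold = config.get("bulk_threshold", 500)

    fired = []
    if action in required:
        fired.append("human-approval-required-action")
    if context.get("risk_level") in risk_levels:
        fired.append("human-approval-risk-level")
    amount = params.get("amount")
    if isinstance(amount, (int, float)) and amount > amount_threshold:
        fired.append("human-approval-amount-threshold")
    count = params.get("record_count")
    if isinstance(count, int) and count > bulk_threshold:
        fired.append("human-approval-bulk-threshold")
    return fired


def decision(action, params=None, context=None, config=None):
    """'escalate' if any trigger fires, otherwise 'allow' (assumed default)."""
    return "escalate" if approval_triggers(action, params, context, config) else "allow"


print(decision("get_balance", {"amount": 2_000_000}))
print(decision("get_balance", {"amount": 2_000_000},
               config={"amount_threshold": 5_000_000}))
```

The second call shows the override recommended for Nigerian deployments: with the threshold raised to 5,000,000, a 2,000,000 transaction no longer escalates on amount alone.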

policies/agent-model-routing.yaml

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
1+
# agt-policies-nigeria
2+
# AI Agent Model Routing Controls — Universal Safety Control
3+
#
4+
# Regulatory alignment:
5+
# OWASP Top 10 for LLM Applications: LLM10 Model Theft / LLM05 Supply Chain Vulnerabilities
6+
# NIST AI RMF — GOVERN 1.1, GOVERN 2.2
7+
# EU AI Act Art. 9 — risk management for high-risk AI systems
8+
# NDPA 2023 s.24 — appropriate technical measures for processing
9+
# CBN Risk-Based Supervision — model risk management requirements
10+
#
11+
# ⚠️ This is a universal safety control — jurisdiction-independent.
12+
# Governs which AI models may process which task types.
13+
# Prevents sensitive workloads (PII processing, financial decisions,
14+
# fraud detection) from being routed to unapproved or unverified models.
15+
#
16+
# Governance flow:
17+
# model on banned list → deny
18+
# sensitive task + unapproved model → deny
19+
# sensitive task + no model specified → escalate
20+
# sensitive task + approved model → audit
21+
# non-sensitive task → allow
22+
23+
version: "1.0"
24+
name: agent-model-routing
25+
description: >
26+
Controls which AI models may process which task types. Prevents sensitive
27+
workloads (PII processing, financial decisions, fraud detection, AML screening,
28+
KYC review) from being routed to unapproved, unverified, or banned models.
29+
Deployer-configurable model lists and sensitive task type classifications.
30+
31+
config:
32+
# Runtime configuration keys (passed via OPA data.config.model_routing.*)
33+
sensitive_task_types:
34+
description: >
35+
Set of task_type values (from context.task_type) that require an
36+
explicitly approved model. Defaults to: pii_processing, financial_decision,
37+
fraud_detection, medical_advice, legal_advice, authentication,
38+
credit_scoring, kyc_review, aml_screening.
39+
example: ["pii_processing", "financial_decision", "hr_decision"]
40+
approved_sensitive_models:
41+
description: >
42+
Set of model identifiers approved to process sensitive tasks.
43+
Defaults to a list of major frontier models. Override to restrict
44+
to on-premises or enterprise-contracted models only.
45+
example: ["gpt-4o", "claude-opus-4", "internal-compliant-model-v2"]
46+
banned_models:
47+
description: >
48+
Set of model identifiers unconditionally blocked from any task.
49+
Use to prevent use of deprecated, unverified, or non-compliant models.
50+
example: ["gpt-3.5-turbo", "unknown-model-xyz"]
51+
require_model_on_sensitive:
52+
description: >
53+
Boolean. If true (default), a sensitive task with no model specified
54+
in context.model escalates rather than allows.
55+
example: true
56+
57+
rules:
58+
59+
# =========================================================================
60+
# Deny: Banned Model
61+
# =========================================================================
62+
63+
- name: model-routing-banned
64+
description: >
65+
The requested model (context.model) is on the banned list.
66+
Blocked unconditionally regardless of task type.
67+
action: deny
68+
priority: 100
69+
70+
# =========================================================================
71+
# Deny: Unapproved Model on Sensitive Task
72+
# =========================================================================
73+
74+
- name: model-routing-unapproved-for-sensitive-task
75+
description: >
76+
A sensitive task type (pii_processing, financial_decision, etc.) was
77+
requested with a model that is not on the approved_sensitive_models list.
78+
Use an approved frontier or enterprise-contracted model instead.
79+
action: deny
80+
priority: 90
81+
default_sensitive_task_types:
82+
- pii_processing
83+
- financial_decision
84+
- fraud_detection
85+
- medical_advice
86+
- legal_advice
87+
- authentication
88+
- credit_scoring
89+
- kyc_review
90+
- aml_screening
91+
92+
# =========================================================================
93+
# Escalate: Sensitive Task with No Model Specified
94+
# =========================================================================
95+
96+
- name: model-routing-no-model-for-sensitive
97+
description: >
98+
A sensitive task type was requested but context.model was not specified
99+
or was empty. Escalate to ensure a human explicitly selects an approved
100+
model before the task proceeds.
101+
action: escalate
102+
priority: 85
103+
104+
# =========================================================================
105+
# Audit: Sensitive Task with Approved Model
106+
# =========================================================================
107+
108+
- name: model-routing-audit-approved-sensitive
109+
description: >
110+
A sensitive task type was processed with an approved model.
111+
Log for model governance and compliance audit trail.
112+
action: audit
113+
priority: 50
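
The governance flow in this file's header comment maps to a short decision function. The Python sketch below follows that flow; it is not the Rego implementation. The approved model set is a required argument because the file does not enumerate its default list, and the function and parameter names are hypothetical. The model identifiers in the usage lines are the example values from the config section.

```python
DEFAULT_SENSITIVE_TASK_TYPES = frozenset({
    "pii_processing", "financial_decision", "fraud_detection",
    "medical_advice", "legal_advice", "authentication",
    "credit_scoring", "kyc_review", "aml_screening",
})


def route_decision(task_type, model, approved, banned=frozenset(),
                   sensitive=DEFAULT_SENSITIVE_TASK_TYPES,
                   require_model_on_sensitive=True):
    """Return 'deny', 'escalate', 'audit', or 'allow' for a task/model pair."""
    if model and model in banned:
        return "deny"       # banned models are blocked for any task type
    if task_type not in sensitive:
        return "allow"      # non-sensitive tasks are not restricted
    if not model:
        # Sensitive task with no model named in context.model
        return "escalate" if require_model_on_sensitive else "allow"
    if model not in approved:
        return "deny"       # sensitive task on an unapproved model
    return "audit"          # sensitive task on an approved model


approved = {"gpt-4o", "claude-opus-4"}
banned = {"gpt-3.5-turbo"}
print(route_decision("pii_processing", "gpt-4o", approved, banned))
print(route_decision("kyc_review", "", approved, banned))
```

The missing-model check runs before the approved-list check so that an empty `context.model` escalates instead of being denied as "unapproved".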
