AI/ML DFIR Detection Pack

Open-source detection signatures for AI/ML attacks and breaches.

A vendor-neutral collection of Sigma, YARA, and Suricata rules for detecting compromise of LLM applications, MCP servers, ML supply chains, AI infrastructure, AI-powered insider threats, and RAG/vector database attacks.

For the companion investigation guide with attack background, forensic artifacts, Mermaid attack-chain diagrams, and hands-on investigation procedures, see docs/ai-dfir-investigation-guide.md.

License: Apache 2.0.

Disclaimer

This is an independent personal project. It is not affiliated with, endorsed by, or produced on behalf of any employer. All research is based on public sources — published CVEs, vendor advisories, academic papers, and vendor-neutral security research. Contributions are welcome from the community.

Why this exists

Most existing detection content is either locked behind vendor SIEMs or scattered across blog posts. AI/ML attacks need detection coverage that spans:

Endpoint (Claude Desktop / Cursor / Copilot config tampering)
Cloud SaaS logs (Bedrock, Azure OpenAI, M365 Copilot)
Network (vector DB exfil, model exfil, ShadowRay C2)
File artifacts (poisoned pickle models, malicious MCP configs)

This pack uses open standards only so the rules can be deployed in any modern detection stack via open Sigma/YARA/Suricata tooling.

Structure

ai-dfir-toolkit/
├── 01-llm-prompt-injection/      # Prompt injection, jailbreaks, indirect injection
├── 02-mcp-attacks/                # MCP tool poisoning, config tampering, rug pulls
├── 03-model-supply-chain/         # Pickle exploits, HuggingFace, dependency confusion
├── 04-ai-infrastructure/          # ShadowRay, Triton, MLflow, GPU abuse
├── 05-copilot-assistant-abuse/    # M365 Copilot, GitHub Copilot, Claude, Cursor
├── 06-rag-vector-db/              # Vector DB exposure, RAG poisoning
├── tests/                         # Sample events / test files
├── MAPPINGS.md                    # ATLAS + OWASP cross-reference
└── README.md

Each category directory contains a README.md describing the threats covered and rule files in their native format (.yml for Sigma, .yar for YARA, .rules for Suricata).

Rule formats

Format	Use Case	Where it Deploys
Sigma (`.yml`)	Generic log-based detection	Any SIEM via pySigma backends
YARA (`.yar`)	File / memory artifacts	EDR platforms, malware analysis, file scanning pipelines
Suricata (`.rules`)	Network traffic	Suricata, Snort (compatible subset), Zeek (via translation)

All rules use open formats. No vendor-specific query languages, no proprietary field schemas. Convert to your platform using pySigma backends.

Quick start

Elastic / Kibana

pip install sigma-cli pysigma-backend-elasticsearch
sigma convert -t lucene --without-pipeline ai-dfir-toolkit/**/*.yml > ai-dfir-toolkit.lucene

The rules are vendor-neutral Sigma — see the pySigma backends list to convert to any other SIEM query language.

Case sensitivity: the keyword-based rules (prompt injection, jailbreak, system-prompt extraction, etc.) follow the Sigma convention that contains matching is case-insensitive — attacker text varies in case, so the rules are written in lowercase and rely on the backend to fold case. On Elastic, ensure the matched content fields are mapped as analyzed text (the default for string fields), not keyword: a keyword mapping matches case-sensitively and will miss capitalized input. Do not try to force this with the re|i modifier — Lucene's regex engine does not support the (?i) flag and the resulting query is invalid.

YARA scanning

# Scan a model directory
yara -r ai-dfir-toolkit/03-model-supply-chain/*.yar /path/to/models/

# Scan an MCP config
yara ai-dfir-toolkit/02-mcp-attacks/mcp_tool_poisoning.yar \
  ~/Library/Application\ Support/Claude/claude_desktop_config.json

Suricata

cp ai-dfir-toolkit/**/*.rules /etc/suricata/rules/
echo 'rule-files: [ai-dfir.rules]' >> /etc/suricata/suricata.yaml
suricata -T -c /etc/suricata/suricata.yaml  # validate
systemctl reload suricata

Coverage overview

43 rule files containing 114 individual signatures across six categories:

Category	Files	Signatures	ATLAS Techniques	OWASP LLM
LLM Prompt Injection	8	10	T0051, T0054, T0029	LLM01, LLM07, LLM10
MCP Attacks	5	14	T0010, T0110, T0086	LLM03, LLM06
Model Supply Chain	8	23	T0010, T0018, T0020	LLM03, LLM04
AI Infrastructure	9	31	T0011, T0017, T0019	LLM10
Copilot/Assistant Abuse	8	19	T0086, T0024	LLM02, LLM06
RAG / Vector DB	5	17	T0020	LLM08
Total	43	114

Signature count includes multi-document Sigma YAML, multiple rule blocks inside a single YARA file, and multiple alert lines inside a single Suricata .rules file. One file often covers several related variants.

See MAPPINGS.md for per-rule mappings.

Tuning notes

These rules are written to err toward signal over noise, but every environment is different. Each rule includes:

falsepositives: section listing known FP scenarios
level: field (low / medium / high / critical) — start with high+ in production
Tunable selectors so you can scope to specific orgs/users/namespaces

Recommended rollout:

Deploy to a test index/workspace for 7 days
Triage hits, tune selection and filter blocks
Promote to production with appropriate severity

Testing

The tests/ directory contains sample artifacts (malicious and benign) for validating rule correctness. Run:

cd tests/
./validate.sh

Expected result: 7 passes, 0 failures.

Contributing

PRs welcome. Rule submission requirements:

Reference an attack technique — ATLAS ID, CVE, or published research
Include test data in tests/ — sample log line, file, or PCAP
Document false positives in the rule
Use lowercase, snake_case filenames matching the threat
Tag with attack.atlas.txxxx in Sigma rules

Sigma format: follow the Sigma specification.

References

License

Apache License 2.0 — see LICENSE.

You are free to use, modify, and redistribute these rules in commercial and non-commercial settings. Attribution appreciated but not required.

Maintainer: Raymond DePalma — independent security researcher. This is a personal project and is not affiliated with any employer.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI/ML DFIR Detection Pack

Disclaimer

Why this exists

Structure

Rule formats

Quick start

Elastic / Kibana

YARA scanning

Suricata

Coverage overview

Tuning notes

Testing

Contributing

References

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github		.github
01-llm-prompt-injection		01-llm-prompt-injection
02-mcp-attacks		02-mcp-attacks
03-model-supply-chain		03-model-supply-chain
04-ai-infrastructure		04-ai-infrastructure
05-copilot-assistant-abuse		05-copilot-assistant-abuse
06-rag-vector-db		06-rag-vector-db
docs		docs
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MAPPINGS.md		MAPPINGS.md
README.md		README.md
SECURITY.md		SECURITY.md

Folders and files

Latest commit

History

Repository files navigation

AI/ML DFIR Detection Pack

Disclaimer

Why this exists

Structure

Rule formats

Quick start

Elastic / Kibana

YARA scanning

Suricata

Coverage overview

Tuning notes

Testing

Contributing

References

License

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages