feat(plugins): add ATR threat detection plugin #4109
Add regex-based AI agent threat detection using ATR community rules:
- Scans prompts, tool invocations, tool results, and resources
- 20 bundled rules covering OWASP Agentic Top 10
- Configurable blocking, severity threshold
- Pure regex, no API keys, <5ms per scan

Source: https://agentthreatrule.org (MIT)
Complements existing security plugins (secrets_detection, encoded_exfil_detection)

Signed-off-by: Panguard AI <support@panguard.ai>
- Add test_atr_threat_detection.py with tests for all 4 hooks
- Fix tool_pre_invoke to use payload.args directly
- Add Authors field to file header
- Extract reverse severity dict to module constant _SEVERITY_NAMES

Signed-off-by: Panguard AI <support@panguard.ai>
@crivetimihai @madhav165: friendly bump on this one. This week's mcp-atlassian disclosures (CVE-2026-27825 / CVE-2026-27826, Pluto Security re-surfaced as "MCPwnfluence") made the case for regex-based MCP threat detection a bit more concrete, if that helps prioritize triage. The underlying ATR ruleset also now covers 97.1% of NVIDIA garak's 666 real-world jailbreaks, which gives the plugin's recall some external grounding. Happy to split, narrow scope, or drop any piece; just let me know what shape is preferred.
@crivetimihai @kevalmahajan @madhav165: gentle bump on this PR (no urgency, just keeping it visible). Since the last comment (4/21), the underlying ATR ruleset has shipped two more major integrations: Cisco AI Defense merged the full 314-rule pack (#99, 4/22) and Microsoft Agent Governance Toolkit followed (#1277, 4/26, with a weekly auto-sync workflow). The plugin in this PR remains unchanged from the 4/21 submission: 18 tests, follows the secrets_detection template exactly, disabled by default. If there's anything I can split, narrow, or adjust to match mcp-context-forge plugin standards, please let me know. Thanks for the time.
The plugin in this PR is unchanged from the 2026-04-21 submission: same secrets_detection template, disabled by default, priority 53, 18 tests. The new rules ship via the upstream
Two ecosystem signals worth flagging since the original PR:
If anything in the plugin needs reshaping to match mcp-context-forge plugin standards (file layout, priority number, detection-tier integration, test framework), I am happy to take that pass. Please drop the specifics in a review comment and I will turn it around in a day. If the right shape is "split this PR into two" (one for the plugin scaffold, one for the rules-bundle hookup), I am also happy to do that. Thanks for the time.
@crivetimihai @kevalmahajan @madhav165: quick FYI, not another bump. ATR shipped v2.1.0 today ( If this PR lands as-is, bumping the dep to
Mapping reference: https://agentthreatrule.org/en/compliance/nist-ai-rmf
Happy to open a one-line dep bump as a separate PR if that's cleaner.
@crivetimihai @kevalmahajan @madhav165 — two updates since the 5/9 FYI.
Still available for any open review items.
@crivetimihai @kevalmahajan @madhav165 — quick update on ATR's standardisation footprint since the 5/9 FYI. ATR landed in MISP at two layers on 2026-05-10, both merged by adulau (MISP project lead):
Relevant for mcp-context-forge enterprise consumers: threat findings emitted from the ATR plugin in #4109 will carry full MISP cluster context downstream when routed into any MISP-compatible CSIRT or SIEM pipeline, with no translation layer and no custom enrichment. The galaxy half is the substantive one; it's the cluster shape CSIRTs use for incident triage. Still available for any open review items on #4109.
@jonpspri: sorry for the noise. This PR has been waiting on
Closing this for now to keep your queue clean, since it hasn't picked up a review slot. The ATR integration is ready whenever it's useful; glad to reopen or split into a smaller diff if you'd like to revisit. Thanks!
Summary
Add an ATR (Agent Threat Rules) threat detection plugin that scans MCP payloads for known AI agent attack patterns using community-maintained regex rules.
Resolves #4108
Hooks implemented
- prompt_pre_fetch
- tool_pre_invoke
- tool_post_invoke
- resource_post_fetch
Design
- Follows the secrets_detection plugin pattern exactly
- Configurable block_on_detection and min_severity threshold
- mode: "disabled" by default
Files
- plugins/atr_threat_detection/atr_threat_detection.py
- plugins/atr_threat_detection/rules.json
- plugins/atr_threat_detection/plugin-manifest.yaml
- plugins/atr_threat_detection/__init__.py
- plugins/atr_threat_detection/README.md
- plugins/config.yaml
- tests/unit/plugins/test_atr_threat_detection.py
Test plan
- make lint / make test / make coverage

Signed-off-by: eeee2345