Security Architecture

Myrm implements a defense-in-depth security model with six layers, ensuring agents operate safely even when given broad autonomy.

Security Layers

Layer	Function	How It Works
L1	Budget Control	MultidimensionalBudgetGuard (per-session / daily / per-call) with 4-level progressive response (warn → eco → finalize → block) + Per-channel budget isolation (DailyBudgetGuard + SSE alerts + DB persistence) + GoalBudget (max_tokens / max_usd / max_time / max_turns)
L2	Permission	12-dimension tool and resource access policies
L3	Rate Limiting	HTTP signature detection + minimum recovery time + SSE event throttling
L4	Loop Detection	7 detectors (repetition, ping-pong, no-progress, divergence, output-diminishing, consecutive-failures, error-signature) + FrequencyGuard (100 calls/60s global, 30/tool/60s) + progressive WARN→BREAK hard-stop
L5	PII Protection	Automatic PII detection, redaction, and taint tracking
L5.5	Trajectory Classification	Behavioral analysis with blind trajectory classifier for anomaly detection
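
The FrequencyGuard limits in the L4 row (100 calls per 60 s globally, 30 per tool per 60 s) can be pictured as a sliding-window counter. The following is a minimal sketch under that reading, not Myrm's actual code; the class layout, the `allow` method, and the injectable `clock` parameter are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class FrequencyGuard:
    """Sliding-window call limiter (illustrative sketch, hypothetical API)."""

    def __init__(self, global_limit=100, per_tool_limit=30, window_s=60.0,
                 clock=time.monotonic):
        self.global_limit = global_limit
        self.per_tool_limit = per_tool_limit
        self.window_s = window_s
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._global = deque()
        self._per_tool = defaultdict(deque)

    def _evict(self, timestamps, now):
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_s:
            timestamps.popleft()

    def allow(self, tool):
        """Record the call and return True, or return False if a limit is hit."""
        now = self.clock()
        tool_calls = self._per_tool[tool]
        self._evict(self._global, now)
        self._evict(tool_calls, now)
        if (len(self._global) >= self.global_limit
                or len(tool_calls) >= self.per_tool_limit):
            return False
        self._global.append(now)
        tool_calls.append(now)
        return True
```

A rejected call is not recorded, so a blocked tool becomes available again as soon as its oldest calls leave the window.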

Approval Modes

Control how much autonomy agents have:

Mode	Description
Auto	Read-only operations auto-approved, writes require confirmation
YOLO	All operations auto-approved (for trusted environments)
HITL	Human-in-the-loop approval for every action
Always-Allow	Per-tool permanent approval with 4-level granularity (permission / tool / exact-args / command-pattern)
Domain-HITL	Approval based on domain/resource classification

Session Security Presets

Switch security posture per chat session from the input toolbar — no need to change global settings:

Preset	Behavior	Use Case
HITL (default)	Every tool call requires approval	Sensitive tasks, production environments
Auto-Approve Edits	File reads/writes auto-approved; shell, browser, MCP require approval with AI-powered smart review	Coding, document editing, routine development
Read-Only	All write operations denied; reads auto-approved	Code review, exploration, research

Presets are mutually exclusive with YOLO mode — selecting a non-default preset automatically disables YOLO, and vice versa.
The Auto-Approve Edits preset enables the Transcript Classifier (LLM-based smart review) for shell commands — safer than blanket ALLOW because suspicious commands still trigger human review.
The Read-Only preset precisely denies 12 categories of write operations (file write/edit/delete, shell, code interpreter, browser automation, skill/cron management) while keeping reads and agent delegation open.
Available in Agent mode only; the selector hides automatically in Fast Search mode.

10-Layer Progressive Approval Architecture

Every tool call passes through up to 10 layers of deterministic and intelligent checks before reaching the user:

Layer	Mechanism	What it does
L0	YOLO Full-Auto	Auto-approves everything except hard DENY rules
L1	CapabilitySet	Checks declared capabilities with negative-priority deny rules
L2	Command Risk Classifier	Categorizes shell commands by risk level (SAFE / UNKNOWN / DANGEROUS)
L3	URL Domain Allowlist	Auto-allows network access to trusted domains
L4	Path Policy	Enforces file access boundaries (workspace-only, deny-list, etc.)
L5	Fast-Path Read-Only MCP	Auto-allows MCP tools with `readOnlyHint=true` and no destructive/open-world flags
L6	Allowlist Persistence	4-level matching: permission → tool → exact args hash → command glob pattern
L7	Taint-Aware Escalation	Escalates already-ALLOW tools to ASK when session contains tainted data (PII, credentials)
L8	Cron Capability Pre-Approval	Auto-allows declared capabilities in scheduled tasks; fail-closed without declarations
L9	Domain HITL Runtime	Auto-allows previously-approved domains within the same session
L10	LLM Security Review	AI classifier for ASK/Taint/Shell-Escalation/Outbound-Check scenarios

Multi-Platform Approval UX

Every approval surfaces the same four actions across all channels:

Action	WebUI	Telegram/Slack/Feishu
Approve (once)	Button	`/approve`, `1`, `y`, 👍 emoji
Edit & Approve	Inline editor with re-validation	N/A (edit in WebUI)
Reject (with feedback)	Button + feedback textarea	`/deny`, `2`, `n`, 👎 emoji
Allow Always	Button → confirmation dialog (4 scopes for shell: permission / tool / exact / pattern)	`/approve-always`, `!y`, ♾️ emoji

Allow Always offers four granularity levels:

Permission: Allow all tools with this permission type (e.g., all file writes)
Tool: Allow this specific tool regardless of arguments
Exact: Allow this tool only with these exact arguments (default for shell tools — safest)
Pattern (shell only): Allow commands matching a derived glob (e.g. curl -sS *). Compound shell (&&, |, ;) is never saved. All pattern rows appear in Settings → Allowlist and can be deleted anytime.
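
As a rough illustration of the Pattern level, the sketch below derives a glob such as `curl -sS *` from an approved command and refuses to derive or match anything for compound shell. The function names, the "keep the first two tokens" rule, and the operator set are assumptions; the source only states that patterns are derived globs and that compound shell is never saved.

```python
import fnmatch
import re

# Assumed operator set that marks a command as compound. The check is
# deliberately conservative: it ignores quoting, so a quoted "&" also counts.
_COMPOUND = re.compile(r"[|;&]")


def is_compound(command: str) -> bool:
    return bool(_COMPOUND.search(command))


def derive_pattern(command: str, keep_tokens: int = 2):
    """Keep the leading tokens and wildcard the rest; None for compound shell."""
    if is_compound(command):
        return None
    tokens = command.split()
    if len(tokens) <= keep_tokens:
        return " ".join(tokens)
    return " ".join(tokens[:keep_tokens]) + " *"


def matches(pattern: str, command: str) -> bool:
    """A saved pattern never auto-approves a compound command."""
    if is_compound(command):
        return False
    return fnmatch.fnmatchcase(command, pattern)
```

Rejecting compound commands at match time as well as at save time means a saved `curl -sS *` rule cannot be reused to approve `curl -sS https://example.com | sh`.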

Migration benefit: Approve a recurring deploy script once with “Allow always (this pattern)” and later runs auto-approve without YOLO. Claude Code and OpenClaw typically stop at tool-name allowlists or CLI-only signing; Myrm gives you GUI-managed, revocable pattern rules, verified by live end-to-end tests in Chrome (Jul 2026).

Batch operations are supported via /batch a,d,aa (approve, deny, always) and UI bulk buttons.

When an agent pauses for approval, the sidebar displays a real-time amber pulse indicator next to the affected chat, even if you’re currently viewing a different conversation, so you never have to check each session manually for pending approvals:

Amber pulse dot: Agent is waiting for your approval/clarification
Green pulse dot: Agent is actively generating
No dot: Session is idle

The indicator survives page refresh (via server-side approval recovery) and syncs in real-time across all open browser tabs via SSE multiplex. When you resolve the approval, the indicator disappears immediately as the agent resumes.

Approval Timeout Race Protection

When an approval timeout fires and a user manually approves almost simultaneously, the system guarantees exactly-once execution via an idempotent resolve_if_first guard:

WebUI: Backend returns HTTP 409 → frontend shows a friendly toast and removes the stale approval card
IM Channels: Agent replies with a localized message (EN/ZH) informing the user the approval was already handled
Concurrent safety: Only the first resolver wins; all subsequent attempts are no-ops

This prevents double agent execution, duplicate LLM costs, and contradictory operations — a gap present in most competing frameworks.
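
The exactly-once guarantee can be sketched with a lock-protected "first resolver wins" check. Only the name `resolve_if_first` comes from the text above; the `PendingApproval` class and its fields are hypothetical, and a real deployment would likely need the same check in the database or an equivalent shared store rather than in process memory.

```python
import threading


class PendingApproval:
    """One pending approval; the first caller to resolve it wins (sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._outcome = None

    def resolve_if_first(self, outcome: str) -> bool:
        """Return True only for the first resolver; later calls are no-ops."""
        with self._lock:
            if self._outcome is not None:
                return False  # already handled, e.g. map to HTTP 409 in the WebUI
            self._outcome = outcome
            return True

    @property
    def outcome(self):
        return self._outcome
```

The caller executes the tool only when `resolve_if_first` returns True, so a timeout handler and a manual approval racing each other cannot both trigger execution.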

Correction Learning

When you edit an approved action’s arguments or reject a tool call, the system automatically learns your preferences:

Zero LLM cost: Deterministic dict-diff classification (no additional inference calls)
Path preferences: Rejects or edits on file operations → remembered as workspace conventions
Command rules: Rejects on shell commands → added to procedural memory as permanent rules
Repetition tracking: Repeated rejections of similar patterns → auto-deny (stops asking)

Over time, approval prompts naturally decrease as the agent aligns with your working style.
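
A deterministic dict-diff classification could look roughly like the sketch below, which compares the original and edited tool arguments without any model call. The argument keys (`path`, `file_path`, `cwd`, `command`) and the category names are assumptions for illustration, not Myrm's actual schema.

```python
# Hypothetical argument keys that indicate a file-location correction.
PATH_KEYS = {"path", "file_path", "cwd"}


def diff_args(original: dict, edited: dict) -> dict:
    """Return {key: (before, after)} for every argument the user changed."""
    changes = {}
    for key in sorted(original.keys() | edited.keys()):
        before, after = original.get(key), edited.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


def classify_correction(original: dict, edited: dict) -> str:
    """Classify an edit with plain dictionary comparison (no LLM involved)."""
    changed = diff_args(original, edited)
    if not changed:
        return "no_change"
    if changed.keys() & PATH_KEYS:
        return "path_preference"
    if "command" in changed:
        return "command_rule"
    return "other"
```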

Error Self-Healing

A 14-layer error recovery system automatically handles failures without user intervention. See Error Recovery for details. Key capabilities:

Stream interruption recovery (token-level precision)
Circuit breaker with 3-tier cooldown
Model fallback chains
Truncation auto-retry with progressive budget boost
Deterministic fallback (LLM-free safety net)

Authentication & Health Monitoring

Real-time monitoring of credential validity and system health, with automatic alerting:

Capability	What It Does
Auth Detector	Recognizes 15+ authentication failure patterns across all major providers (OpenAI, Anthropic, Google, etc.)
Circuit Breaker	Immediately stops retrying on permanent auth failures, saving tokens and preventing billing waste
Probe Policy	Intelligent recovery probing (60s for session expiry, 600s for permanent auth) to auto-detect when keys become valid again
Brute-Force Alert	Background monitor detects suspicious auth patterns (10+ failures/IP/hour), creates deduplicated system notifications
Health History	Records system health score every 3 minutes to database, retains 7-day trend for diagnosis
Real-time Push	Health status changes and memory metrics streamed to the frontend via dedicated ServerEventBus channels (separate from chat tool progress)
Deployment-Aware	Local mode skips network audit (saves resources), remote mode enables full monitoring

Prompt Injection Defense

Two complementary subsystems protect against prompt injection — one for untrusted external content flowing into the agent, and one guarding user/file inputs.

Content Boundary (Output-Side)

A 5-layer defense wraps all external content and tool output before it enters the LLM context, covering the entire tool chain (built-in tools + third-party MCP tools + PTC built-in tools):

Layer	Technique	What It Catches
1	Unicode folding	Invisible Unicode characters used to smuggle payloads
2	Structural framing strip	Removes XML/HTML-like structural markers that mimic system tags
3	Marker sanitization	Replaces known boundary/delimiter patterns to prevent breakout
4	Random boundary	Wraps content in a cryptographically random delimiter (`===BOUNDARY_xxx===`) unpredictable to attackers
5	Pattern detection	Detects remaining injection patterns (role override, instruction override, system simulation)

Data returned by third-party MCP tools is automatically passed through all 5 defense layers before entering the LLM context, preventing malicious MCP servers from injecting instructions via tool outputs.
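
Layer 4 (random boundary) can be illustrated with the standard `secrets` module. This is a minimal sketch assuming a 128-bit hex token inside the `===BOUNDARY_xxx===` marker; the function name and token length are illustrative choices, not confirmed details.

```python
import secrets


def wrap_untrusted(content: str) -> str:
    """Wrap external content in a per-call random delimiter."""
    marker = f"===BOUNDARY_{secrets.token_hex(16)}==="
    # An attacker cannot predict the token, but regenerate anyway if the
    # content happens to contain it, so the delimiter is never ambiguous.
    while marker in content:
        marker = f"===BOUNDARY_{secrets.token_hex(16)}==="
    return f"{marker}\n{content}\n{marker}"
```

Because the token is fresh for every call, text that tries to close the boundary early (for example a hard-coded `===BOUNDARY_abc===`) stays inside the wrapped region.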

Prompt Guard (Input-Side)

113 detection patterns across 26 threat categories scan user messages, project rules, and skill files:

Anti-obfuscation: Leet speak reversal, invisible Unicode stripping, whitespace folding, Base64 decoding
Bilingual detection: English and Chinese prompt injection patterns (e.g., “忽略之前的指令”)
Two-pass detection: First on normalized text, then on Base64-decoded content
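
The two-pass flow can be sketched as follows: normalize the text and scan it, then decode any Base64 runs and scan the decoded text with the same patterns. The two patterns, the leet table, and the invisible-character range here are small stand-ins chosen for illustration; they are not the 113 production patterns.

```python
import base64
import binascii
import re
import unicodedata

INVISIBLE = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060\ufeff]")
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
# Illustrative patterns only (one English, one Chinese).
PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions"),
    re.compile("忽略之前的指令"),
]


def normalize(text: str) -> str:
    """Unicode-fold, strip invisible characters, undo leet speak, fold whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = INVISIBLE.sub("", text)
    text = text.lower().translate(LEET)
    return re.sub(r"\s+", " ", text)


def decoded_segments(text: str):
    """Yield UTF-8 text recovered from Base64-looking runs in the raw input."""
    for run in B64_RUN.findall(text):
        try:
            yield base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue


def is_injection(text: str) -> bool:
    # Pass 1: normalized text. Pass 2: Base64-decoded content.
    if any(p.search(normalize(text)) for p in PATTERNS):
        return True
    return any(p.search(normalize(seg))
               for seg in decoded_segments(text) for p in PATTERNS)
```

Base64 runs are extracted from the raw input rather than the normalized text, since lowercasing or leet reversal would corrupt the encoding.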

Sub-Agent Security

Multi-agent workflows introduce identity drift and privilege escalation risks. Myrm addresses this at every level:

Control	Mechanism
Tool Whitelisting	`DelegationCapabilityManifest` — parent explicitly declares which tools each child agent can access
Memory Isolation	3 policies: `EPHEMERAL_SESSION` (clean slate), `READ_ONLY_GLOBAL` (read but not write), `COLLABORATIVE_SESSION` (shared with audit)
Taint Propagation	`TaintTracker` labels flow from child → parent. If a child touches external network data, the parent session is automatically tainted
Sink Policies	Tainted sessions escalate to HITL approval for dangerous tool combinations (e.g., `EXTERNAL_NETWORK` taint + `bash_tool` = blocked without approval)
Budget Boundary	4-dimension `DelegationBudget` (token + USD + time + max descendants) prevents runaway sub-agent chains
Recursion Guard	5-layer progressive defense: L1 global depth hard limit (max 3) + L2 per-config depth (LEAF agents forced to depth=0) + L3 descendant budget (max 20 including parallel branches, atomic reserve) + L4 concurrency limits (semaphore=5 + per-agent children=5) + L5 LoopGuard 7-class behavioral detection (repetition/ping-pong/no-progress/divergence/diminishing). Rejections emit frontend STATUS events with actionable suggestions for LLM self-correction. Cascade cancellation of all descendants on parent abort
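
Taint propagation and sink policies from the table can be pictured with a small model in which a taint added to a child session is copied to every ancestor, and certain taint and tool pairs force an approval. The `Session` class, the `decide` function, and the ALLOW default are hypothetical; only the `EXTERNAL_NETWORK` + `bash_tool` pair is taken from the table.

```python
from dataclasses import dataclass, field

# Only the pair named in the table; a real policy would list more.
SINK_POLICY = {("EXTERNAL_NETWORK", "bash_tool")}


@dataclass
class Session:
    taints: set = field(default_factory=set)
    parent: "Session | None" = None

    def add_taint(self, label: str) -> None:
        """Label this session and every ancestor (child -> parent flow)."""
        node = self
        while node is not None:
            node.taints.add(label)
            node = node.parent


def decide(session: Session, tool: str) -> str:
    """Escalate to ASK when a taint on the session meets a dangerous sink."""
    if any((taint, tool) in SINK_POLICY for taint in session.taints):
        return "ASK"
    return "ALLOW"  # assumed default; other approval layers still apply
```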

Skill Installation Security

When installing skills from any source (GitHub, SkillHub, file upload), every skill passes through a triple-layer security gate before activation:

Layer	Mechanism	What It Catches
Regex Scanner	113 patterns across 26 threat categories	Shell injection, credential theft, data exfiltration, obfuscated payloads
AST Analyzer	Python Abstract Syntax Tree structural analysis	Dangerous imports (`os.system`, `subprocess`), hidden function calls, suspicious code patterns
LLM Auditor	Semantic threat assessment by language model	Socially-engineered attacks, multi-step exfiltration chains, intent-masked malicious logic

Trust Levels

Skills are assigned one of four trust levels that gate their runtime capabilities:

Level	Source	Capabilities
TRUSTED	Built-in / official	Full tool access
INSTALLED	User-installed, scan passed	Scoped `allowed_tools` access
UNTRUSTED	Scan flagged warnings	Read-only tools, no network/filesystem
REJECTED	Scan critical findings	Quarantined, cannot execute

The quarantine_aware decorator automatically filters rejected skills at runtime — a quarantined skill simply disappears from the agent’s available tools without error.

GUI Security Review

Three frontend components work together to present security scan results. Each finding includes precise line-number targeting (e.g., L42 Command injection: recursive delete) to help developers jump directly to problematic code:

Component	Trigger	Content
ScanConfirmDialog	Installing from SkillHub search	Finding list + severity badges + line numbers + confirm/cancel
Blocked Dialog	Enabling a skill with CRITICAL findings	Block reason + finding details + line numbers + “Force Enable” option
SecurityScanSection	Skill detail page	Full finding list grouped by severity + line numbers + security score (0-100)

The security score uses a 100-point system, deducting per finding: CRITICAL −25, HIGH −15, MEDIUM −8, LOW −3. A trust_recommendation (trusted/installed/untrusted/reject) is also generated to guide trust decisions.
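
The deduction scheme is simple arithmetic. The sketch below applies the stated per-finding deductions; clamping at 0 is an assumption made to keep the result inside the 0-100 range mentioned for SecurityScanSection.

```python
DEDUCTIONS = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 8, "LOW": 3}


def security_score(severities) -> int:
    """Start from 100 and deduct per finding; the floor of 0 is assumed."""
    total = sum(DEDUCTIONS[s] for s in severities)
    return max(0, 100 - total)
```

For example, one HIGH, one MEDIUM, and one LOW finding give 100 − (15 + 8 + 3) = 74.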

MCP Tool Security

Tool Name Isolation

When multiple MCP servers are enabled, tool names can collide (e.g., both a GitHub and GitLab server expose search_repos). Myrm prefixes every MCP tool name with double-underscore delimiters:

mcp__{server}__{tool}

Unambiguous parsing — Unlike single-underscore schemes, __ delimiters allow exact reverse parsing even when server names contain underscores
Permission isolation — Prefixed MCP tool names never collide with built-in tools, preventing accidental permission bypass
Audit traceability — Every tool invocation log entry identifies the originating MCP server
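
A minimal sketch of building and reverse-parsing the `mcp__{server}__{tool}` form is shown below. The helper names are hypothetical, and the sketch assumes server names contain no double underscore and do not end in an underscore, since either would make the first `__` after the prefix ambiguous.

```python
PREFIX = "mcp"
SEP = "__"


def qualify(server: str, tool: str) -> str:
    """Build the prefixed name; reject server names that would break parsing."""
    if not server or not tool or SEP in server or server.endswith("_"):
        raise ValueError(f"unsupported server/tool name: {server!r}/{tool!r}")
    return f"{PREFIX}{SEP}{server}{SEP}{tool}"


def parse(name: str):
    """Return (server, tool) for an MCP tool name, or None for anything else."""
    prefix, sep, rest = name.partition(SEP)
    if prefix != PREFIX or not sep:
        return None
    server, sep, tool = rest.partition(SEP)
    if not sep or not server or not tool:
        return None
    return server, tool
```

Single underscores inside the server name survive the round trip because only the double-underscore delimiter is treated as a boundary.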

SSRF Prevention

DNS Pinning prevents agents from being tricked into accessing internal networks via HTTP redirects:

Unified outbound HTTP layer: All agent-initiated HTTP exits (web_fetch, HTTP tools, OpenAPI executor, skill ZIP install, media resolver, robots/sitemap fetch, channel media download, Feishu attachments) converge on secure_fetch / async_pin_url — one implementation, no bare httpx blind spots
Manual redirect loop: follow_redirects=False with per-hop re-validation — every redirect target is fully checked before following
DNS Pinning: Resolved IPs replace hostnames in HTTP connections, eliminating DNS rebinding TOCTOU attacks
Comprehensive IP blocklist: RFC1918 private, CGNAT, link-local, multicast, reserved, cloud metadata endpoints (AWS/GCP/Alibaba/Tencent), plus IPv4-mapped IPv6 detection
Data exfiltration detection: 6 pattern categories (API keys, file paths, base64, JWT, secret keys, DB connection strings) prevent sensitive data leaking through URL parameters
Domain HITL approval: Non-allowlisted domains trigger human-in-the-loop approval when domainHitlEnabled is active — per-agent network allowlist configurable via UI
Parser-confusing character defense: Tab, newline, and backslash in URLs are blocked to prevent hostname extraction divergence between parsers (CVE-class SSRF bypass prevention)
Internal hostname suffix blocking: .local, .svc, .cluster.local, .home.arpa suffixes are blocked to prevent mDNS and Kubernetes internal network access
Audit trail: Blocked requests emit SSRF_BLOCKED security decisions for SIEM and frontend audit views
Agent API scope: /v1/chat/completions runs Myrm agents only (no raw LLM passthrough). User-configured Provider apiUrl SSRF checks remain on in-agent LLM calls — deploy-mode-aware: local mode allows loopback hosts (Ollama/vLLM), cloud/sandbox mode blocks private networks and cloud metadata endpoints
461+ dedicated SSRF tests: Coverage across core guards, agent security, browser navigation, DNS pinning, media validation, A2A resolver, web fetch, SessionVault, permission engine, and provider URL validation
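
Several of the checks above (IP blocklist with IPv4-mapped IPv6 handling, parser-confusing characters, internal hostname suffixes) can be approximated with the standard `ipaddress` module. This is a sketch under the assumption that they run on an already-resolved IP and on the raw URL string; the function names are illustrative and this is not the `secure_fetch` implementation.

```python
import ipaddress

CGNAT = ipaddress.ip_network("100.64.0.0/10")
BLOCKED_SUFFIXES = (".local", ".svc", ".cluster.local", ".home.arpa")
CONFUSING_CHARS = ("\t", "\n", "\\")


def is_blocked_ip(raw: str) -> bool:
    """True for private, CGNAT, loopback, link-local, multicast, reserved IPs."""
    ip = ipaddress.ip_address(raw)
    # Unwrap ::ffff:a.b.c.d so the IPv4 rules apply to mapped addresses.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
        or (ip.version == 4 and ip in CGNAT)
    )


def has_confusing_chars(url: str) -> bool:
    """Tab, newline, and backslash can make URL parsers disagree on the host."""
    return any(ch in url for ch in CONFUSING_CHARS)


def is_internal_hostname(host: str) -> bool:
    """Block mDNS and Kubernetes-style internal suffixes."""
    return host.rstrip(".").lower().endswith(BLOCKED_SUFFIXES)
```

The link-local check also covers the common cloud metadata address 169.254.169.254.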

Malicious URL Architectural Immunity

Instead of maintaining static phishing domain blocklists (e.g. 2.5M scam domains), Myrm eliminates threats at the architecture level:

SessionVault domain binding: Credentials (cookies/passwords) are strictly isolated per domain — bank.com login state is never sent to phishing domains like bank-secure-login.xyz, eliminating credential theft by design
Agent-level session isolation: Each configured agent gets its own physical SessionVault subdirectory — a “Work Assistant” and “Personal Assistant” accessing the same website maintain completely independent login states, preventing identity pollution across agents
Browser sandbox isolation: Agent browsers run in isolated sandboxes — even if a malicious site is visited, the user’s host system is unaffected
Four-layer deep domain filtering: CSP policy (kernel-level network restriction) + protocol interception (context.route blocks non-allowlisted domains) + main thread hardening (WebRTC/WebTransport/ServiceWorker blocking) + CDP audit monitoring

This architecture makes maintaining large phishing domain databases unnecessary: thousands of new domains appear daily, so static lists become outdated rapidly, and they consume 50-100MB of memory.

Configuration Security Scan

Every MCP server configuration is scanned for 13 threat types before activation:

Threat Type	What It Catches
`prompt_injection`	Malicious system prompts embedded in config
`name_injection`	Tool names designed to mislead the LLM
`concealment`	Hidden instructions in descriptions
`exfiltration`	Data theft via outbound channels
`credential_harvesting`	Attempts to collect user secrets
`context_leak`	Leaking conversation context to external services
`arbitrary_execution`	Unrestricted code execution capabilities
`risky_profile`	Known high-risk server profiles
`suspicious_url`	Non-HTTPS or suspicious domain patterns
`sensitive_path`	Access to sensitive filesystem paths
`hardcoded_secret`	Credentials embedded in configuration
`supply_chain`	Dependency chain compromise indicators
`supply_chain_malware`	Known malicious package signatures

Findings are presented in the ScanConfirmDialog with severity badges. Users can trust, reject, or force-enable (at their own risk) each server.

Malicious Package Detection

When MCP tools install dependencies, the OSV (Open Source Vulnerability) API is consulted in real-time to detect known malicious packages.

Dynamic Tool Change Safety

When an MCP server adds or removes tools at runtime (via tools/list_changed notifications), Myrm ensures security without interrupting your workflow:

Automatic security vetting — Newly added tools are evaluated against the same 13-threat security scan applied at configuration time. If a tool fails the check, it is rejected and a warning is logged — no unsafe tool is ever silently activated
Prompt cache preservation — The agent’s prompt-facing tool list is frozen; dynamic changes only update the internal execution layer. This guarantees prompt prefix cache hits are never invalidated by external MCP server behavior
Zero user interruption — Unlike CLI-based competitors that require manual /reload confirmation, Myrm handles tool changes transparently. Users are never interrupted with confirmation dialogs for events they cannot meaningfully evaluate

Per-Agent Tool Filtering

Different agents can enable different MCP tools from the same server — a “Code Reviewer” agent sees only read-only tools, while a “DevOps” agent sees all. Tools with destructiveHint annotations are disabled by default in the safe set.

Operation-Level Semantic Risk Detection

Instead of labeling entire websites as “high risk” (a brittle, high-maintenance approach), Myrm detects risk at the individual operation level — every click, form submit, and command is analyzed in real-time.

7-Category Semantic DOM Risk Detection

When the agent clicks a button or link on any webpage, the element’s text is analyzed against 7 risk categories in both English and Chinese:

Category	Trigger Keywords (EN/ZH)	What Happens
Destructive	delete, remove / 删除, 移除	HITL approval required
Financial	pay, purchase, checkout / 付款, 购买	HITL approval required
Account	deactivate, close account / 注销, 关闭账号	HITL approval required
Admin	admin settings, permissions / 管理, 权限	HITL approval required
Publish	publish, post, submit / 发布, 发表	HITL approval required
Share	share, send, transfer / 分享, 发送	HITL approval required
Sensitive	password, credit card / 密码, 信用卡	HITL approval required
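
A keyword-based version of this check might look like the sketch below. The keyword lists contain only the examples from the table, so they are far from complete, and the matching rules (word boundaries for ASCII keywords, substring match for Chinese) are assumptions.

```python
import re

# Only the example keywords from the table; the real lists are longer.
RISK_KEYWORDS = {
    "destructive": ("delete", "remove", "删除", "移除"),
    "financial": ("pay", "purchase", "checkout", "付款", "购买"),
    "account": ("deactivate", "close account", "注销", "关闭账号"),
    "admin": ("admin settings", "permissions", "管理", "权限"),
    "publish": ("publish", "post", "submit", "发布", "发表"),
    "share": ("share", "send", "transfer", "分享", "发送"),
    "sensitive": ("password", "credit card", "密码", "信用卡"),
}


def _hit(keyword: str, text: str) -> bool:
    if keyword.isascii():
        # Word boundaries avoid matching "post" inside "postcode".
        return re.search(rf"\b{re.escape(keyword)}\b", text) is not None
    return keyword in text  # Chinese has no word boundaries to anchor on


def classify_element(text: str) -> list:
    """Return the sorted risk categories triggered by an element's text."""
    lowered = text.casefold()
    return sorted(cat for cat, kws in RISK_KEYWORDS.items()
                  if any(_hit(k, lowered) for k in kws))


def requires_approval(text: str) -> bool:
    return bool(classify_element(text))
```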

This approach is superior to domain-level risk tags because:

Zero maintenance: No need to maintain a list of “high-risk websites”
Universal coverage: Works on any website, including new ones
Granular: “View cart” on Amazon passes automatically, but “Place order” requires approval
No legal risk: No discrimination against specific platforms

Smart Intent Guard

An AI classifier (TranscriptClassifier) reviews tool calls to verify they align with the user’s original intent:

Reasoning-Blind: Only sees user messages + tool call sequences (not agent reasoning), preventing self-justification attacks
Deterministic: temperature=0 ensures identical inputs always produce identical verdicts
Structured output: Pydantic-enforced JSON with a reason field for audit traceability
Fail-safe: Errors or ambiguity fall back to HITL rather than auto-approving

Risk Governance System

A complete bidirectional risk detection and governance framework with built-in rules, custom rule management, and full-stack event handling. The same rule engine protects both WebUI input and IM channel inbound messages, leaving no blind spots.

31 Built-in Rules across 7 categories detect sensitive data before it reaches the LLM:

Category	Examples	Count
Personal	Email, phone, Chinese ID number, passport, address	8
Security	API keys (OpenAI, AWS, GCP, Azure), SSH private keys, JWT tokens	7
Company	Internal IPs, employee IDs, project codenames, internal URLs	5
Customer	Customer IDs, order numbers, support ticket numbers	4
Finance & Legal	Bank accounts, credit cards, tax IDs, contract numbers	4
Political	Politically sensitive content patterns	3

Symmetric Inbound/Outbound Gate: IM channel messages pass through RiskDetectionService.detect() at the router level before reaching the Agent. Blocked messages receive a localized notification (6 languages) and are audit-logged. Outbound agent responses pass through the same engine via _apply_outbound_risk_gate, forming a closed-loop defense.
GUI Rule Management: Full CRUD via the WebUI settings panel: create custom rules with regex patterns, toggle rules on/off, run batch operations, and test rules before deployment. No code required.
Audit Trail: Every risk hit records trace_id, session_id, matched rule, and severity level for compliance auditing.
Full-Stack Event Loop: When input risk is detected, the server emits a risk_blocked SSE event. The frontend riskEvents handler intercepts this event and displays a user-friendly Toast notification explaining which rules were triggered, so there are no silent failures.

Test Coverage

30,000+ tests verify the full security pipeline, including PII/DLP/privacy routing (1136), shell command approval (harness 1209 + server 261), semantic DOM risk (75), shell classification (379), SQL statement guard (68, 99.1% coverage), security engine integration (163+74), credential scanning (35), tool guards (9), permission engine (119), tool registry & inheritance (67), guardrail middleware (15), architecture registry (4), server permissions (18), agent builtin tools API (12), profile resolver (36), frontend approval & message (15), risk governance (117), webhook routes (2), dynamic authorization guardrails (845), MCP Elicit benchmark (166), and many more.

Shell Command Security

Commands are analyzed through a 5-layer quote-aware pipeline before execution:

Layer	Detection	Response
L1	Binary characters, 12 invisible Unicode categories (zero-width, direction overrides)	BLOCK
L1.5	ANSI-C quoting `$'...'` and locale quoting `$"..."`	BLOCK
L2	6 injection vectors (`$()`, backtick, `${}`, `;`, process substitution) + 70+ dangerous command patterns	BLOCK → DENY
L2.5	SQL syntax-level guard: detects destructive SQL in DB client commands (`psql`, `mysql`, `sqlite3`, etc.) — handles multi-statement injection and WITH CTE bypass vectors	ESCALATE → ASK
L3	Suspicious patterns (`curl | sh`, `eval`, `base64 -d`, kill/pkill)	ESCALATE → ASK
L4	Recursive analysis of nested commands in `bash -c '...'`, `sh -lc '...'`, `zsh -c '...'`, `trap '...'` wrappers — depth-limited to prevent DoS	Recursive BLOCK/ESCALATE

Quote-aware preprocessing: A character-level state machine (_strip_quoted_content) replaces single-quoted content with placeholders before L2/L3 scanning, preventing false positives on echo 'rm -rf /' while still catching real threats in double-quoted or unquoted contexts.
Privilege Escalation Floor: All sudo commands are unconditionally BLOCKED at L2, including sudo apt install, sudo -S (stdin password piping), env sudo cmd, and bash -c 'sudo ...' (caught recursively by L4). This cannot be bypassed by YOLO mode, Smart Guard, or user approval. Competitors like Hermes only block sudo -S (the password-guessing vector), still allowing regular sudo with SUDO_PASSWORD injection, which leaves a broader attack surface. Six other competitors have no sudo guard at all.
Auto-snapshot on destructive commands: Commands matching destructive patterns (rm, mv, git reset/clean/checkout/restore/apply with arbitrary flag prefixes, sed -i, redirect overwrite) trigger an automatic workspace snapshot before execution, ensuring full recovery regardless of outcome.
SQL Guard (L2.5): Analyzes the original command (before quote stripping) to detect destructive SQL operations (DROP, DELETE, TRUNCATE, ALTER, INSERT, UPDATE, CREATE) embedded in database client commands. Defends against two advanced bypass vectors: multi-statement injection (SELECT 1; DROP TABLE) and WITH CTE wrapping (WITH x AS (...) DELETE FROM). Works with flag-based (psql -c '...') and pipe-based (echo '...' | psql) patterns. No competitor implements SQL-level command analysis for shell execution.
Safe Exec path: Simple commands without shell metacharacters are executed via create_subprocess_exec (no shell interpreter), structurally eliminating $IFS, glob expansion, and command substitution injection classes.
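
The quote-aware preprocessing step can be illustrated with a small character-level state machine. This sketch is not `_strip_quoted_content` itself; the function name, the `'_'` placeholder, and the choice to raise on an unterminated quote are assumptions.

```python
def strip_single_quoted(command: str, placeholder: str = "'_'") -> str:
    """Replace single-quoted spans with a placeholder; keep everything else."""
    out = []
    in_single = in_double = False
    i = 0
    while i < len(command):
        ch = command[i]
        if in_single:
            # Inside '...': content is literal data, so drop it from the scan.
            if ch == "'":
                in_single = False
                out.append(placeholder)
        elif in_double:
            # Inside "...": expansions still happen, so keep the content.
            out.append(ch)
            if ch == "\\" and i + 1 < len(command):
                out.append(command[i + 1])
                i += 1
            elif ch == '"':
                in_double = False
        elif ch == "\\" and i + 1 < len(command):
            out.append(ch)
            out.append(command[i + 1])  # escaped character, never opens a quote
            i += 1
        elif ch == "'":
            in_single = True
        else:
            if ch == '"':
                in_double = True
            out.append(ch)
        i += 1
    if in_single:
        raise ValueError("unterminated single quote")
    return "".join(out)
```

Later pattern scans then see `echo '_'` instead of `echo 'rm -rf /'`, while an unquoted or double-quoted `rm -rf /` remains visible.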

AI Trajectory Classifier (Layer 5.5)

When static analysis returns ASK (uncertain), an AI classifier provides semantic-level judgment:

Reasoning-Blind design: Only sees user messages and tool call sequences — never assistant reasoning or tool outputs, preventing the agent from rationalizing dangerous actions past the classifier
Structured output: Pydantic schema forces JSON format (not free-text), reducing injection surface
Deterministic: temperature=0, max_tokens=200 for reproducible decisions
Fail-safe: Any error, timeout, or parse failure returns UNCERTAIN → falls back to HITL approval
Context-aware: Receives taint labels (EXTERNAL_NETWORK, SECRET) and trusted domains for nuanced judgment
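
The fail-safe behavior can be sketched as a parser that accepts only a well-formed verdict and maps everything else to UNCERTAIN. The source describes a Pydantic schema; this sketch uses only the standard library, and the `verdict` field name and the ALLOW/DENY values are assumptions (only `reason` and UNCERTAIN appear in the text).

```python
import json

VALID_VERDICTS = {"ALLOW", "DENY", "UNCERTAIN"}
FALLBACK = {"verdict": "UNCERTAIN", "reason": "unparseable classifier output"}


def parse_verdict(raw) -> dict:
    """Parse classifier output; any malformed input falls back to UNCERTAIN."""
    try:
        data = json.loads(raw)
        verdict, reason = data["verdict"], data["reason"]
    except (ValueError, KeyError, TypeError):
        return dict(FALLBACK)
    if not isinstance(verdict, str) or not isinstance(reason, str):
        return dict(FALLBACK)
    if verdict not in VALID_VERDICTS:
        return dict(FALLBACK)
    return {"verdict": verdict, "reason": reason}


def needs_human(raw) -> bool:
    """Everything except an explicit ALLOW is routed to HITL approval."""
    return parse_verdict(raw)["verdict"] != "ALLOW"
```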

Bilingual Command Explanation

Every command that triggers approval displays a human-readable bilingual explanation (humanize_command) generated deterministically — zero LLM calls, microsecond latency:

73 common commands covered with parameter-aware descriptions (extracts URLs, filenames, package names)
sudo prefix auto-tagged, dangerous pipe patterns highlighted
Users understand what the command does before approving, even without shell expertise
Entirely rule-based: cannot be manipulated via prompt injection unlike LLM-based alternatives

Install Slopcheck (Anti-Slopsquatting)

Before every pip install / npm install / yarn add / bun add command, a preflight check verifies that each package name actually exists on the public registry:

Aspect	Detail
Detection	HEAD probe against PyPI JSON API (`/pypi/{name}/json`) and npm registry (`/{name}`)
Normalization	PEP 503 rules (underscores, dots, mixed case → canonical lowercase-hyphen)
Concurrency	All packages probed in parallel via `asyncio.gather()`
Cache	In-memory set of verified packages — zero repeated network calls within a session
Private registries	Commands with `--index-url`, `--extra-index-url`, or `--registry` are auto-skipped
Graceful fallback	Network timeout or DNS failure → allow install (never block legitimate work)
Response	Unknown package → `ToolError` blocks the install and explains which packages were not found

This prevents slopsquatting attacks — where an attacker registers a package name that LLMs commonly hallucinate, embedding malware in the published package.
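
The offline parts of the preflight (PEP 503 normalization and private-registry skipping) can be sketched without any network access. The function names are hypothetical, and the package extraction is deliberately simple: it ignores options that take a separate value, such as `-r requirements.txt`.

```python
import re
import shlex

PRIVATE_FLAGS = ("--index-url", "--extra-index-url", "--registry")


def normalize_pypi_name(name: str) -> str:
    """PEP 503: runs of '-', '_' and '.' collapse to '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def uses_private_registry(command: str) -> bool:
    """True when the command points at a custom index or registry."""
    return any(tok == flag or tok.startswith(flag + "=")
               for tok in shlex.split(command) for flag in PRIVATE_FLAGS)


def pip_packages(command: str) -> list:
    """Extract normalized package names from a simple `pip install` command."""
    tokens = shlex.split(command)
    if tokens[:2] != ["pip", "install"]:
        return []
    names = []
    for tok in tokens[2:]:
        if tok.startswith("-"):
            continue  # flags only; option values are not handled in this sketch
        base = re.split(r"[<>=!~\[;]", tok, maxsplit=1)[0]
        if base:
            names.append(normalize_pypi_name(base))
    return names
```

Each extracted name would then be probed against the registry, with names already verified in the session served from the in-memory cache.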

Encryption & Enterprise Network Compatibility

Scope	Standard
At Rest	AES-256-GCM for stored data (secrets, API keys, user content)
In Transit	TLS 1.3 for all network connections
API Keys	Encrypted secrets vault with per-user isolation
Memory	Optional encryption for sensitive memory entries
Incognito Mode	Physical isolation and burn-after-reading for sensitive sessions

Enterprise TLS Compatibility

Corporate networks often deploy TLS inspection proxies (Zscaler, Netskope, Palo Alto Prisma) that can cause all HTTPS connections to fail. Myrm provides one-click enterprise network compatibility:

Settings → Advanced → Enterprise Network Compatibility, or set MYRM_TLS_STRICT=0
Precision relaxation of Python 3.13+ VERIFY_X509_STRICT flag — does NOT disable certificate verification
Custom CA bundle support: SSL_CERT_FILE (replace system trust store) or NODE_EXTRA_CA_CERTS (append to system trust store)
Per-MCP-server TLS: each MCP server can specify its own ssl_verify (true/false/custom CA path) and client_cert/client_key/client_key_password for full mTLS
4-layer auto-injection: infra (tls_compat.py) → server (tls_config.py) → MCP (client.py) → LLM (llm.py), covering 28+ HTTP client call sites
Automatic TLS error diagnosis: 8 error patterns detected, 5-language remediation hints
144 TLS-specific tests verified (38 TLS core + 31 MCP TLS + 75 error diagnostics)
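
The "precision relaxation" item can be illustrated with the standard `ssl` module: clearing only the `VERIFY_X509_STRICT` flag keeps certificate and hostname verification on. How Myrm reads `MYRM_TLS_STRICT` is not specified above, so the environment handling and the `build_ssl_context` name in this sketch are assumptions.

```python
import os
import ssl


def build_ssl_context(strict=None) -> ssl.SSLContext:
    """Default context, optionally without the strict X.509 checks."""
    if strict is None:
        # Assumed convention: MYRM_TLS_STRICT=0 turns strict mode off.
        strict = os.environ.get("MYRM_TLS_STRICT", "1") != "0"
    ctx = ssl.create_default_context()
    if not strict:
        # Clear only the strict flag; verify_mode stays CERT_REQUIRED and
        # check_hostname stays True, so certificates are still validated.
        ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
    return ctx
```

On Python versions before 3.13 the flag is not set by default, so clearing it there is a no-op.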

Incognito Mode Deep Dive

One-click toggle in the message input area activates per-session privacy isolation:

Harness layer: IncognitoPolicy physically skips all writes to MEMORY and ARCHIVE context scenes
Server layer: Skips memory manager binding and all memory tools (memory_search_tool, memory_save, memory_manage), disables memory context injection, archive checkpoints, and session cleanup callbacks
Database layer: is_incognito flag ensures sessions are hidden from sidebar listing and excluded from full-text search
Self-hosted advantage: Unlike SaaS competitors that require a separate “local mode” toggle to keep data off vendor servers, Myrm’s self-hosted architecture means user data never leaves the machine by design — Incognito Mode adds session-level non-persistence on top

Credential Protection

Form Credential Vault

Passwords and TOTP seeds never enter the LLM context. You configure labeled credentials in Settings → Credentials; the agent only sees label names (e.g. github-personal) and calls fill_credential (browser and desktop use the same action). The Harness resolves the label in memory and injects at the DOM or OS input layer, so plaintext never flows back into chat, tool args, or logs.

Provider API keys (OpenAI, Anthropic, Gemini, etc.) are passed via LiteLLM’s api_key parameter, with zero os.environ writes. Combined with process-level sandbox isolation, this architecturally eliminates cross-user credential leakage. 279 credential security tests passed.

What you get	Technical basis	Plain-language benefit
Label-only agent view	Tool schema exposes labels, not values	Safe to let the agent log in — it cannot “see” or repeat your password
Browser + desktop coverage	`fill_credential` (unified action) + password-field block on macOS/Windows/Linux	Same vault works for web apps and native desktop login
Built-in TOTP	RFC 6238 generation inside the vault	2FA flows without you reading codes aloud or pasting into chat
Encrypted storage	AES-256-GCM in Server DB, synced to Harness memory vault on startup; partial metadata edits preserve in-memory secrets	Credentials at rest are encrypted; editing description in Settings does not break live automation
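
For reference, RFC 6238 code generation needs only the standard library. The sketch below uses HMAC-SHA1 with a 30-second step, the RFC defaults; the function name and parameters are illustrative, and in the vault described above the secret would stay inside the Harness rather than being passed around like this.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP (HMAC-SHA1) for the time `at`, defaulting to now."""
    now = time.time() if at is None else at
    counter = int(now // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 6238 test secret `12345678901234567890`, `totp(secret, at=59, digits=8)` yields `94287082`, matching the published test vector.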

Payment-card CVV has no dedicated use_payment_method API yet (unlike FSB’s browser extension). Password-type fields are covered; card checkout may need manual approval or a future API.

Leak Detection

40+ regex patterns detect credentials in agent output:

API keys (OpenAI, Anthropic, AWS, GCP, Azure, etc.)
Database connection strings
JWT tokens and session IDs
SSH private keys
Entropy-based detection for unknown credential formats
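To make the combination of pattern matching and entropy-based detection concrete, the following is a minimal sketch. The two regexes, the 20-character minimum, and the 4.0 bits-per-character threshold are illustrative assumptions; the actual pattern set and thresholds are not given in this document.

```python
import math
import re
from collections import Counter

# Two illustrative patterns only (assumed, not Myrm's real pattern list).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def find_leaks(output: str, min_len: int = 20, threshold: float = 4.0):
    """Return (kind, match) pairs: known patterns first, then high-entropy tokens."""
    hits = [(name, m.group(0)) for name, rx in PATTERNS.items() for m in rx.finditer(output)]
    known = {value for _, value in hits}
    for token in re.findall(r"[A-Za-z0-9+/_-]{%d,}" % min_len, output):
        # Fallback for unknown formats: long tokens that look random.
        if token not in known and shannon_entropy(token) >= threshold:
            hits.append(("high_entropy", token))
    return hits
```

The entropy pass trades precision for recall: it can flag hashes or long identifiers, which is why a sketch like this keeps it as a fallback behind the named patterns.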

PII Redaction & Privacy-Aware Routing

Myrm provides an 8-layer PII defense with 57+ detection capabilities, the deepest privacy protection of any AI agent platform.

Detection (3 engines, 57+ types):

Regex PII Scanner: 12+ structured types (phone, ID card, passport, bank card with Luhn validation, SSN, email, address, courier number, private IP, etc.)
LLM Semantic Scanner: 20+ non-structured types (medical health, political views, financial records, precise locations, biometrics) with PL2/PL3/PL4 classification
Credential Leak Scanner: 25+ secret patterns (AWS, OpenAI, Anthropic, GitHub, Slack, JWT, PEM keys, etc.) with Shannon entropy analysis
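The bank-card check mentioned above relies on Luhn validation, which filters out digit runs that merely look like card numbers. A minimal sketch of the checksum follows; the function name is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]   # ignore spaces and dashes
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:          # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Running a regex candidate through a checksum like this before flagging it reduces false positives on order numbers and other long numeric strings.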

Protection (4 user-selectable modes):

Mode	Action	Use Case
WARN	Log detection, pass content through	Monitoring-only environments
REDACT	Type-aware irreversible masking (e.g., `138****5678`)	Production default for S2 data
PSEUDONYMIZE	Reversible placeholder replacement via SQLite-backed `PseudonymStore` with streaming-safe chunk-boundary restoration	When AI needs context but user sees originals
BLOCK	Completely block the message	S3-level confidential data
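The difference between REDACT and PSEUDONYMIZE in the table above can be sketched as follows. This is an assumption-laden illustration: the phone regex (11-digit numbers starting with 1), the `<PHONE_n>` placeholder format, and the in-memory class are all hypothetical stand-ins, and the sketch does not attempt the SQLite persistence or streaming chunk-boundary handling of `PseudonymStore`.

```python
import re

# Assumed format: 11-digit mobile number starting with 1, not embedded in a longer digit run.
PHONE = re.compile(r"(?<!\d)1\d{10}(?!\d)")


def redact_phones(text: str) -> str:
    """REDACT: irreversible masking that keeps the first 3 and last 4 digits."""
    return PHONE.sub(lambda m: m.group(0)[:3] + "****" + m.group(0)[-4:], text)


class InMemoryPseudonyms:
    """PSEUDONYMIZE: reversible placeholders (hypothetical in-memory stand-in)."""

    def __init__(self) -> None:
        self._by_value = {}   # original value -> placeholder
        self._by_token = {}   # placeholder -> original value

    def replace(self, text: str) -> str:
        def swap(match):
            value = match.group(0)
            if value not in self._by_value:
                token = f"<PHONE_{len(self._by_value) + 1}>"
                self._by_value[value] = token
                self._by_token[token] = value
            return self._by_value[value]
        return PHONE.sub(swap, text)

    def restore(self, text: str) -> str:
        for token, value in self._by_token.items():
            text = text.replace(token, value)
        return text
```

With REDACT the original digits are gone for good; with PSEUDONYMIZE the model sees a stable placeholder and the user-facing response can be mapped back to the original value.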

Privacy-Aware Model Routing: Automatically routes requests based on sensitivity level: S1 to cloud, S2 to cloud-after-redaction or local, S3 to local-only (data never leaves the machine). Configurable fallback: block or force-redact-then-cloud.

GUI Configuration: Full privacy controls in Settings, including enable/disable toggle, per-level action selection, deep scan toggle, local model connection test, custom keywords/regex/sensitive tools, and real-time test matching.
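The sensitivity-based routing rule (S1 to cloud, S2 to redacted cloud or local, S3 local-only with a configurable fallback) can be sketched as a small decision function. The return labels, the `fallback` values, and the choice to prefer a local model for S2 when one is available are assumptions made for illustration.

```python
from enum import Enum


class Sensitivity(Enum):
    S1 = 1   # non-sensitive
    S2 = 2   # sensitive
    S3 = 3   # confidential


def route(level: Sensitivity, local_available: bool, fallback: str = "block") -> str:
    """Return one of: 'cloud', 'cloud_redacted', 'local', 'block'."""
    if level is Sensitivity.S1:
        return "cloud"
    if level is Sensitivity.S2:
        # Assumption: prefer local when reachable, otherwise redact before cloud.
        return "local" if local_available else "cloud_redacted"
    # S3: local-only; the fallback applies only when no local model is reachable.
    if local_available:
        return "local"
    return "cloud_redacted" if fallback == "redact_then_cloud" else "block"
```

Defaulting the fallback to `block` keeps the function fail-closed: S3 content leaves the machine only when the user has explicitly opted into force-redact-then-cloud.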

Summary Path Protection

When long conversations are compressed into structured summaries, PII and credentials can survive the summarization process. Myrm applies dual redaction (redact_leaks + redact_pii) to all summary fields before persistence — ensuring phone numbers, emails, API keys, and other sensitive data never persist in compressed conversation history.

Taint Tracking

TaintTracker follows the information flow of sensitive data through the agent’s execution, ensuring PII doesn’t leak through indirect channels (e.g., a tool reading a file containing credentials, then using that data in a web request).
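As a toy model of the file-read-then-web-request example, the sketch below marks tool outputs as tainted and checks outbound payloads against them. The class and method names are hypothetical, and the substring comparison is a deliberate simplification; it is not how TaintTracker is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class SessionTaint:
    """Hypothetical session-scoped taint state."""
    labels: set = field(default_factory=set)           # e.g. {"credential", "pii"}
    tainted_values: set = field(default_factory=set)   # raw outputs that carried sensitive data

    def record_tool_output(self, output: str, labels: set) -> None:
        # A tool result with any sensitivity label taints the whole session.
        if labels:
            self.labels |= labels
            self.tainted_values.add(output)

    def outbound_allowed(self, payload: str) -> bool:
        """False if the payload embeds a value previously marked as tainted."""
        return not any(v and v in payload for v in self.tainted_values)
```

Even this naive version shows the key property: the decision at the sink (the web request) depends on what earlier tools returned, not just on the sink's own arguments.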

Agent Export Security

When you export an agent configuration (for sharing or backup), credentials are automatically stripped:

What’s stripped	Where	Why
`api_key`, `bearer_token`, `client_secret`, `password`, `username`	`openapi_services[].auth`	Prevent API credential leaks in shared configs
`auth_token`	`tool_gateway_config`	Prevent gateway token leaks

Team agents export recursively — all member configs are included with credentials stripped. On import, team members are created atomically (all-or-nothing rollback).

The auth.type field is preserved so the importer knows which authentication method to configure (e.g. “api_key”, “bearer”, “oauth2”).
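A minimal sketch of the stripping rules above, operating on a plain dictionary. The field names come from the table; the overall config shape and the `members` key used for team agents are assumptions.

```python
import copy

AUTH_SECRET_FIELDS = ("api_key", "bearer_token", "client_secret", "password", "username")


def strip_credentials(agent_config: dict) -> dict:
    """Return a deep copy of the config with credential fields removed."""
    cleaned = copy.deepcopy(agent_config)            # never mutate the live config
    for service in cleaned.get("openapi_services", []):
        auth = service.get("auth")
        if isinstance(auth, dict):
            for field_name in AUTH_SECRET_FIELDS:
                auth.pop(field_name, None)           # auth["type"] is left untouched
    gateway = cleaned.get("tool_gateway_config")
    if isinstance(gateway, dict):
        gateway.pop("auth_token", None)
    if "members" in cleaned:                         # hypothetical key for team members
        cleaned["members"] = [strip_credentials(m) for m in cleaned["members"]]
    return cleaned
```

Working on a deep copy matters here: the export must not clear secrets that the running agent still needs.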

Privacy-safe Rule Sharing

When sharing procedural memory rules (e.g., with teammates or the community), additional privacy layers are applied automatically:

Path anonymization — user home directory paths are replaced with <USER> placeholders
Credential redaction — API keys and secrets are truncated to safe prefixes (e.g., sk-pro...f456)
Metadata stripping — timestamps, update counts, and internal IDs are removed from exported rules

This ensures no personally identifiable file paths or credentials can leak through shared rules. See Memory System → Privacy-safe Rule Sharing for full details.
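The three transformations can be sketched in a few lines. The helper names, the 6-character prefix and 4-character suffix (chosen to reproduce the sk-pro...f456 example), and the metadata key names are assumptions for illustration.

```python
def anonymize_paths(text: str, home: str) -> str:
    """Replace the user's home directory with a <USER> placeholder."""
    return text.replace(home.rstrip("/"), "<USER>")


def truncate_secret(secret: str, head: int = 6, tail: int = 4) -> str:
    """Keep a short prefix and suffix; mask short secrets entirely."""
    if len(secret) <= head + tail:
        return "***"
    return f"{secret[:head]}...{secret[-tail:]}"


INTERNAL_KEYS = {"id", "created_at", "updated_at", "update_count"}  # hypothetical key names


def strip_metadata(rule: dict) -> dict:
    """Drop timestamps, counters, and internal IDs from an exported rule."""
    return {k: v for k, v in rule.items() if k not in INTERNAL_KEYS}
```

For example, `truncate_secret("sk-proj-abc123def456")` yields `sk-pro...f456`, which is enough to recognize which key a rule referred to without being usable as a credential.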

Agent Secret Management — Zero Plaintext Exposure

Per-agent secrets (custom API keys, tokens, environment variables) use a zero-plaintext-exposure architecture:

Security Property	How It Works
Frontend never receives values	`listAgentSecrets` API returns only key names (`string[]`), never values
No reveal endpoint	Unlike competitors that return plaintext on “reveal,” Myrm has no reveal pathway
Edit requires new value	Password input field, placeholder “Enter new value to overwrite”
Sentinel protection unnecessary	Since old values never reach the frontend, no “****” round-trip risk exists
Encrypted at rest	AES-256-GCM via `DatabaseSecretBackend` with server-side master key
Atomic file writes	`LocalSecretBackend` uses tempfile + `os.replace` + `fsync` for crash safety
Log auto-redaction	`SensitiveDataFilter` replaces token/key/secret patterns with `*REDACTED*`
Agent isolation	Each agent can only access its own secrets via `agent_id` scoping
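The "Atomic file writes" row describes a standard crash-safe pattern: write to a temporary file, flush it to disk, then rename it over the target. A minimal sketch of that pattern follows; `atomic_write` is a hypothetical helper name, not the `LocalSecretBackend` API.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write `data` to `path` so readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())       # force the bytes to disk before the rename
        os.replace(tmp_path, path)       # atomic rename over the target
    except BaseException:
        try:
            os.unlink(tmp_path)          # do not leave a stray temp file behind
        except FileNotFoundError:
            pass
        raise
```

If the process dies before `os.replace`, the previous secrets file is still intact; if it dies after, the new file is complete.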

Competitors (e.g., Multica) return plaintext environment variables to the frontend and must rely on sentinel values (“****”) to prevent accidental overwrites. Myrm eliminates this entire attack surface by never exposing values.

Audit Trail

Structured Audit Trail

Every security decision is recorded in a structured audit log with Prometheus real-time metrics:

37 typed security decisions (ALLOW, DENY, ASK, SSRF_BLOCKED, PII_REDACTED, TAINT_ESCALATE, etc.)
Prometheus policy_denial_total counter for real-time anomaly detection
Session-scoped accumulator with TaintTracker cross-tool information-flow tracking
Cron job metadata automatically embeds the full security audit for post-run analysis

Event Types

37+ structured decision types cover the complete security lifecycle, including:

TOOL_CALL_START / TOOL_CALL_END
APPROVAL_REQUESTED / APPROVAL_GRANTED / APPROVAL_DENIED
MODEL_SWITCHED / FALLBACK_ACTIVATED
ITERATION_LIMIT_REACHED / BUDGET_EXHAUSTED
LOOP_DETECTED / CANCELLED
COMPRESSION_TRIGGERED / CHECKPOINT_CREATED

Profile Audit Engine

The Profile Audit Engine provides real-time configuration risk assessment for every Agent profile. When you open the Security tab in Agent settings, a Health Score Card instantly shows your agent’s security posture:

Score: 0–100 (higher = safer), deducted per finding severity
Risk Level: Safe / Low / Medium / High / Critical (5-level color-coded)
6-Dimension Grouped View: Findings grouped by checker dimension with per-dimension color coding — expand any dimension to see individual findings with severity borders, titles, and actionable recommendations
One-Click Fix: Policy gap findings include Fix (toggle switch) or Configure (navigate to settings section) buttons for immediate remediation

Six Detection Dimensions

Checker	Color	Detects
Tool Exposure	Orange	Dangerous built-in tool combinations (e.g., shell + file_write + MCP) and large tool surfaces
MCP Auth	Violet	MCP servers without authentication, insecure transport, or high scan findings
Skill Aggregate	Sky	Skills flagged as untrusted or rejected during import scanning
Subagent Risk	Rose	Multi-level delegation chains creating unauditable privilege paths
Cron Risk	Amber	Unattended scheduled tasks with high-privilege tools
Policy Gap	Slate	Missing security controls relative to enabled capabilities (with one-click fix)

Each dimension shows issue count or “Pass” status. Dimensions with issues can be expanded to reveal individual findings.

Design

Zero LLM: Pure deterministic rule engine — no token cost, instant response
Plugin architecture: Each checker implements BaseChecker and can be extended independently
Auto-refresh: Results update automatically when you save Agent configuration changes
Framework-level: Available to any project using the harness engine, not just the product UI
Test coverage: 37 backend unit tests + 15 frontend component tests (52 total)
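To illustrate how a deterministic, plugin-style checker could feed the 0–100 health score, here is a small sketch. `BaseChecker` is named in the text above, but its interface is not shown, so the `check` method, the `Finding` type, the profile shape, and the per-severity penalties are all assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Assumed penalties; the real deduction per severity is not documented here.
SEVERITY_PENALTY = {"low": 5, "medium": 10, "high": 20, "critical": 40}


@dataclass(frozen=True)
class Finding:
    checker: str
    severity: str
    title: str


class BaseChecker(ABC):
    name = "base"

    @abstractmethod
    def check(self, profile: dict) -> list:
        """Return a list of Finding objects for this dimension."""


class ToolExposureChecker(BaseChecker):
    name = "tool_exposure"

    def check(self, profile: dict) -> list:
        tools = set(profile.get("tools", []))
        if {"shell", "file_write", "mcp"} <= tools:
            return [Finding(self.name, "high", "shell + file_write + MCP enabled together")]
        return []


def health_score(profile: dict, checkers: list):
    """Start at 100 and deduct per finding severity, never going below 0."""
    findings = [f for checker in checkers for f in checker.check(profile)]
    score = max(0, 100 - sum(SEVERITY_PENALTY[f.severity] for f in findings))
    return score, findings
```

Because every checker is a pure function of the profile, the score can be recomputed on each save without any model call.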

Workspace Rules Security

Myrm automatically discovers and loads project-level rule files from 17 discovery points (13 root-level filenames + 4 subdirectory patterns), covering all major AI tool ecosystems:

Root files: .myrm.md, AGENTS.md, CLAUDE.md, SOUL.md, .cursorrules, .clinerules, .windsurfrules, and their case variants
Subdirectories: .myrm/rules/*.md, .cursor/rules/*.mdc, .claude/CLAUDE.md, .github/copilot-instructions.md
First-Match-Wins: Only the highest-priority file loads per directory, preventing conflicts
Zero-config migration: Users from Hermes (SOUL.md), Cline (.clinerules), Cursor (.cursorrules), Claude Code (CLAUDE.md), Windsurf (.windsurfrules) can bring their rule files unchanged

All discovered files are security-scanned before injection:

113 detection patterns across 26 threat categories
Anti-obfuscation: Leet speak, invisible Unicode, whitespace folding, Base64 decoding
Chinese injection detection: Supports CJK character-based prompt injection
Blocked content is replaced with a [BLOCKED] placeholder with structured metadata
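The anti-obfuscation step works by normalizing a copy of the text before patterns are applied. The sketch below covers three of the listed techniques (invisible Unicode, leet speak, whitespace folding); Base64 decoding is omitted. The character tables and the single sample pattern are assumptions, not Myrm's actual rule set.

```python
import re
import unicodedata

# Zero-width and BOM characters mapped to None so that translate() deletes them.
INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
# Assumed leet mapping: 0->o, 1->i, 3->e, 4->a, 5->s, 7->t, @->a, $->s
LEET = str.maketrans("013457@$", "oieastas")

SAMPLE_PATTERNS = [re.compile(r"ignore (all )?previous instructions")]  # illustrative only


def normalize_for_scan(text: str) -> str:
    """Produce a scan-only copy; the original text is never modified by this step."""
    text = unicodedata.normalize("NFKC", text)       # fold full-width and compatibility forms
    text = text.translate(INVISIBLE)
    text = text.translate(LEET).lower()
    return re.sub(r"\s+", " ", text).strip()         # whitespace folding


def scan_rule_file(content: str) -> str:
    """Return the content unchanged, or a [BLOCKED] placeholder if a pattern matches."""
    normalized = normalize_for_scan(content)
    if any(p.search(normalized) for p in SAMPLE_PATTERNS):
        return "[BLOCKED]"
    return content
```

The leet mapping would mangle legitimate digits, which is why it is applied only to the copy used for matching and never to the text that gets injected.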

Emergency Controls

Control	Description
E-Stop	One-click emergency stop that halts all running agents immediately
Session Kill	Terminate a specific agent session
Tool Blacklist	Dynamically block specific tools across all agents
Budget Override	Hard spending cap that overrides all agent budgets

Security Dashboard

A dedicated /security page provides full GUI visibility into the system’s security posture — no CLI commands needed:

Tab	What It Shows
Dependencies	Vulnerability alerts (Critical/High/Medium/Low counts) + Dependabot PRs + SBOM availability
Rate Limit	Real-time per-user/per-resource throttle status (current/max/remaining/window)
Audit Logs	Security event stream with multi-dimension filters (user ID, event type, result) + CSV/JSON export
Audit Stats	24-hour analytics: time series, top IPs, event distribution, success-vs-failure breakdown

Additional capabilities:

Setup Panel: Guided configuration of webhooks, monitored repos, and GitHub tokens
Multi-source: Pulls data from GitHub, Control Plane, or merged sources based on deployment mode
One-click refresh: Real-time data reload per tab
Export: Audit logs exportable in CSV or JSON format for SIEM integration

This GUI-first approach is superior to Hermes’ CLI-only security commands (hermes config view security, hermes pairing list) — users get the same information with visual analytics and no terminal expertise required.

Smart Approval (Auto Mode)

Smart Intent Guard is enabled by default for all new users. An auxiliary LLM (Transcript Classifier) reviews tool calls in real-time, enabling long unattended agent sessions without sacrificing safety. If no dedicated reviewer model is configured, Myrm automatically uses the user’s default model as a fallback — no additional setup required to benefit from intelligent approval.

Layer	What It Does	When It Fires
Deterministic Rules	Static permission engine evaluates against rulesets	Always (first)
CommandRiskLevel	Shell-pipe-aware classifier: SAFE (auto-allow) or UNKNOWN (needs review)	shell_exec with Auto Mode
Allowlist Memory	“Always Allow” user choice permanently auto-approves identical actions	ASK actions with prior user approval
Taint Escalation	Session contains PII/credential → even ALLOW tools get LLM review	ALLOW + tainted session
Transcript Classifier	Reasoning-blind LLM (sees only user messages + tool calls) → ALLOW/DENY/UNCERTAIN	ASK actions in Auto Mode
Outbound Check	Delegation actions force LLM review regardless of rule result	delegate_agent actions
Shell Escalation	Non-SAFE shell commands get LLM review even if rules say ALLOW	shell_exec + UNKNOWN risk
Threshold Breach	Consecutive denials → circuit-break auto-approval, revert to HITL	Too many denials

Fail-safe guarantees:

Classifier errors → fall back to HITL (never auto-approve on failure)
temperature=0 for deterministic, reproducible decisions
Pydantic-enforced structured output (no free-text manipulation)
Taint labels injected for context-aware judgment
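The first guarantee (never auto-approve on failure) is worth spelling out, because it is easy to get backwards. In the sketch below, `classify` stands in for the reviewer model call and the returned action labels are hypothetical; a plain `Enum` is used in place of the Pydantic schema so the example needs only the standard library.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    UNCERTAIN = "uncertain"


def review_tool_call(classify: Callable[[str], str], transcript: str) -> str:
    """Return 'auto_approve', 'show_denial_card', or 'ask_human'."""
    try:
        # Verdict(...) raises ValueError for anything outside the three allowed values,
        # so free-text replies cannot be interpreted as approval.
        verdict = Verdict(classify(transcript))
    except Exception:
        return "ask_human"            # timeout, API error, malformed output: fall back to HITL
    if verdict is Verdict.ALLOW:
        return "auto_approve"
    if verdict is Verdict.DENY:
        return "show_denial_card"
    return "ask_human"                # UNCERTAIN is treated like a failure
```

Only one path leads to auto-approval: a successful call that returns exactly the ALLOW value. Every other outcome ends with a human decision.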

Smart DENY → User Override (Once)

When the Transcript Classifier recommends DENY, Myrm does not silently reject the action. Instead, it shows a clear approval card:

Aspect	Behavior
Visual indicator	Amber warning box displaying the reviewer’s denial reason
Available actions	“Override Once” and “Reject” only (session/always/edit hidden)
Override behavior	Action executes this one time but is NOT added to the allowlist — next identical action still triggers review
Audit trail	`LLM_REVIEW_DENY_USER_OVERRIDE` event recorded with reason
Non-interactive sessions	Cron jobs and shadow agents auto-deny without override option (fail-closed)
High-risk paths	Taint escalation, outbound delegation, and shell escalation always hard-deny — no override possible (security red line)

This prevents “false positive lockout” where the AI reviewer misclassifies a legitimate action, while maintaining full audit visibility.

vs competitors: Hermes offers smart_denied_for_owner with CLI-only once/deny buttons but no visual reason display, no audit logging, and no layered hard-deny for taint/outbound paths.

High-Risk Scenario: Always Allow Hidden

In 6 high-risk scenarios, the “Always Allow” button is automatically hidden from the approval card — users can only approve once or reject. This prevents accidental permanent bypass of security checks for dangerous operations:

Trigger	Why Always Allow Is Hidden
Taint Escalation	Session contains external network data — permanent allow would bypass all future taint checks
Outbound UNCERTAIN	AI reviewer uncertain about outbound delegation safety
Shell Escalation UNCERTAIN	Non-safe shell command under AI review uncertainty
LLM Review UNCERTAIN	General AI reviewer uncertainty about any tool call
Auto Mode Suspended	Consecutive denial threshold breached — system reverted to HITL
Shell Threat Detected	ShellCommandAnalyzer flagged a threat pattern

Normal, low-risk approvals still show “Always Allow” as usual. Users experience zero friction change for safe operations.

vs competitors: Hermes hides permanent allow only for tirith content-security warnings (1 trigger). OpenClaw controls via backend allowedDecisions array with an “Always Allow Unavailable” warning message. Myrm covers 6 trigger scenarios, the most comprehensive in the industry.

Command Denylist (User-Defined Hard Floor)

Users can define glob patterns that permanently block specific commands — regardless of YOLO mode, Smart Intent Guard decisions, or permission rules. This provides a user-controlled safety net that cannot be bypassed by any approval mechanism.

Feature	How It Works
Glob Pattern Matching	Case-insensitive `fnmatch` matching (e.g., `git push --force`, `rm -rf /`, `DROP TABLE*`)
Global + Per-Agent	Set global deny patterns in Security Settings, and per-agent overrides in Agent Configuration
Union Merge	Per-agent denylist is merged (union) with the global denylist — agents can only add restrictions, never remove global ones
YOLO-Proof	Denied commands are blocked even in YOLO mode — this is a hard floor below all approval layers
GUI Editor	Visual pattern editor with add/remove, example placeholders, and validation — no CLI or YAML editing needed

The command denylist operates at Layer 2a.5 in the approval pipeline — after permission rules but before YOLO bypass, ensuring denied commands are always blocked. The Natural Language Policy Generator can also produce command denylist and network blocklist rules from plain language descriptions (e.g., “block all force push and database drop commands”) — the generated patterns are validated, previewed with human-readable explanations, and applied only after user confirmation.
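The matching and merge rules in the table can be sketched with the standard-library `fnmatch` module. The function names are hypothetical, and collapsing repeated whitespace before matching is an added assumption.

```python
from fnmatch import fnmatchcase


def effective_denylist(global_patterns, agent_patterns):
    """Union merge: an agent can add patterns but never remove global ones."""
    return sorted(set(global_patterns) | set(agent_patterns))


def is_denied(command: str, patterns) -> bool:
    """Case-insensitive glob match of the whole command against each pattern."""
    normalized = " ".join(command.split()).lower()    # assumption: collapse repeated whitespace
    return any(fnmatchcase(normalized, p.lower()) for p in patterns)


deny = effective_denylist(["rm -rf /", "git push --force"], ["DROP TABLE*"])
print(is_denied("drop table users", deny))             # True
print(is_denied("git push origin main", deny))         # False
```

Glob patterns match the entire string, so a pattern without a wildcard such as `git push --force` would not match `git push --force origin main`; a trailing `*` is needed to cover arguments.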

Memory Write Trust Isolation

All memory writes pass through a unified security pipeline — even those triggered internally by the AI:

Guard	What It Prevents
Approval Queue	Implicit preferences extracted by Cognitive Deriver go through the standard approval queue (no bypass). External content cannot silently pollute your Profile
Security Scan	Every memory write runs through `scan_and_clean_memory` with injection detection (CLEAN/WARN/REDACTED/BLOCKED)
Preference Stability	Multi-observation lifecycle (Candidate → Provisional → Active) ensures only validated preferences reach your Profile — single-conversation flukes are filtered out
AGENT_SELF Priority Ceiling	Agent self-generated procedural rules are capped at HIGH priority — they cannot claim CRITICAL (compression-immune) slots, preventing prompt inflation
Profile Promotion	Core preferences (communication style, cognitive depth, proactivity) promote to Profile only after stability validation, via `set_system_profile_attribute` with security scanning

No competitor implements memory write trust isolation — most agents write extracted preferences directly to the user profile without approval, validation, or priority guardrails.

Security Profiles: One-Click Safety Modes

Myrm provides three built-in security profiles that users can switch between from the Settings UI:

Profile	Permissions	Use Case
Read Only	All writes denied (files, shell, browser, skills, cron). Reads auto-allowed.	Research, planning, code review
Workspace	File ops within allowed roots, shell requires approval	Normal development
Full Access	All operations allowed (YOLO mode)	Trusted local environments

Profiles are persisted to the database and survive server restarts. Each Agent can also have its own independent security configuration — enabling scenarios like a “research-only analyst” agent alongside a “full-access developer” agent.

Cron Job Execution Policy (Per-Job Least Privilege)

Scheduled Agent jobs support a dual-layer policy independent of the bound Agent profile:

Layer	What it controls	User benefit
Capability fence (`required_capabilities`)	PermissionEngine gates shell, file write, MCP, code, etc.	Block dangerous operations even if the Agent profile is broad
Tool scope (`tools_allowed`)	Narrows which builtin tools mount at Turn1 (intersected with Agent tools at runtime)	Smaller schema, prompt-cache friendly, least-privilege unattended runs

Where to edit: Settings → Scheduled Tasks → open a job → Execution policy editor in run history (same placement pattern as allowed filesystem roots). New jobs stay unrestricted by default.

Built-in safeguards:

Fail-closed by default — jobs without explicit capability declarations deny dangerous operations (shell, code execution, MCP) automatically. A warning banner guides users to configure the capability fence or enable YOLO mode per-agent.
Preset packs (web-only / research / devops) dual-write capability fence + tool scope, aligned with blueprint SSOT.
Read-it-Later blueprint uses router mode (__wiki_source_sync__) with empty tools — deterministic server-side pull, zero LLM agent turns, no browser or code execution.
Lifecycle guard rejects cron prompts/commands/pre-flight probe scripts containing Myrm restart/stop or pkill myrm-agent patterns — prevents self-inflicted outage loops.
Restricted jobs do not silently gain baseline tools — a file-only cron will not auto-enable code_execute; unrestricted jobs still inherit the Agent baseline.
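The two policy layers can be modeled as two small set operations. This is a sketch under assumptions: the capability names in `DANGEROUS`, the reading of `required_capabilities` as an explicit allow-list, and treating `None` as "not configured" are illustrative choices, and the per-agent YOLO escape hatch is not modeled.

```python
DANGEROUS = {"shell", "code_execute", "mcp"}   # assumed names for the dangerous capabilities


def mounted_tools(agent_tools, tools_allowed=None) -> set:
    """Tool scope: intersect the job's allow-list with the agent's tools at runtime."""
    if tools_allowed is None:                   # unrestricted job inherits the agent baseline
        return set(agent_tools)
    return set(agent_tools) & set(tools_allowed)


def capability_allowed(capability: str, required_capabilities=None) -> bool:
    """Capability fence: fail closed when the job declares nothing."""
    if required_capabilities is None:
        return capability not in DANGEROUS
    return capability in required_capabilities
```

Because the tool scope is an intersection, listing a tool the agent does not have grants nothing, and a restricted job never gains tools beyond its own list.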

vs competitors: Hermes offers only a global cron_mode: deny/approve boolean toggle — all jobs share the same policy. OpenClaw/LobsterAI/CoPaw/deer-flow/jiuwenclaw have no cron capability at all. Myrm provides per-job granular capability declarations with GUI editing, blueprint auto-fill, and fail-closed defaults.

Subagent Recursive Isolation: 5-Layer Defense

When agents delegate tasks to subagents, a common failure mode is uncontrolled recursive spawning: subagents spawn more subagents until token budgets are exhausted. Myrm enforces 5 layers of isolation:

Layer	Mechanism
L0	Type admission whitelist (only catalog-registered agent types)
L1	Global blocklist (7 orchestration tools + 2 privileged tools stripped from LEAF agents)
L2	Config-level blocklist + readonly mode additional blocks
L3	Child ⊆ Parent tool intersection (child can never have more tools than parent)
L4	Dual depth limits (global max 3 + per-config max_spawn_depth)

Additional safety nets: payload hash deduplication prevents delegation loops, result caching avoids redundant work, and 3 memory isolation strategies (EPHEMERAL / READ_ONLY_GLOBAL / COLLABORATIVE_SESSION) prevent cross-agent data contamination.
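Layers L3 and L4 and the payload-hash deduplication reduce to a few set and hash operations. The sketch below is illustrative only: the function names, the convention that the root agent has depth 0, and the JSON payload shape are assumptions.

```python
import hashlib
import json

GLOBAL_MAX_DEPTH = 3   # global limit from the table above


def child_tools(parent_tools, requested, blocklist=()) -> set:
    """L1-L3: a child gets at most the parent's tools, minus anything blocklisted."""
    return (set(requested) & set(parent_tools)) - set(blocklist)


def may_spawn(current_depth: int, max_spawn_depth: int) -> bool:
    """L4: both the global and the per-config limit must allow another level."""
    return current_depth < min(GLOBAL_MAX_DEPTH, max_spawn_depth)


class DelegationDeduper:
    """Reject a delegation whose (agent type, task) payload was already seen."""

    def __init__(self) -> None:
        self._seen = set()

    def is_duplicate(self, agent_type: str, task: str) -> bool:
        blob = json.dumps({"type": agent_type, "task": task}, sort_keys=True).encode()
        digest = hashlib.sha256(blob).hexdigest()
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False
```

The intersection in `child_tools` is what makes privilege escalation through delegation structurally impossible: requesting a tool the parent lacks simply yields nothing.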

Multi-Agent File Protection: 8-Layer Defense

When multiple agents work in the same workspace simultaneously, file conflicts are the #1 source of data corruption. Myrm enforces 8 layers of defense, all in application code, with no reliance on prompts:

Layer	Mechanism	What It Prevents
L1	Read-Before-Write (staleness_guard)	Blind edits on unread files — agent must read a file before writing
L2	Version Match (file_integrity_guard)	Stale edits — rejects writes when file content changed since last read
L3	Full-Read-Before-Edit (file_integrity_guard)	Partial reads — rejects edits if only part of a file was read
L4	Line-Level Conflict Detection (file_conflict_guard)	Overlapping edits — detects when two agents modify the same line range
L5	Cross-Agent Activity Tracking (file_activity_tracker)	Untracked concurrent access — records every agent’s write activity per file
L6	Automatic Workspace Isolation (workspace_policy)	Shared-state corruption — auto-upgrades to ISOLATED_COPY when multiple agents write in parallel
L7	Deferred Serial Merge (batch_merge)	Merge conflicts — isolated workspaces merge back one-at-a-time
L8	Auto-Cleanup on Completion (subagent_manager)	Stale tracking data — clears all tracking records when a subagent finishes

How it works in practice:

A parent agent delegates 3 coding tasks to parallel subagents
workspace_policy detects ≥2 parallel writers → auto-upgrades to ISOLATED_COPY
Each subagent gets its own COW (Copy-on-Write) workspace clone
Subagents work independently — no lock contention, no blocking
On completion, batch_merge applies changes back to the parent workspace one-at-a-time
If two subagents edited overlapping lines, file_conflict_guard raises a conflict at merge time
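The first, second, and fourth layers (read-before-write, version match, line-range overlap) can be sketched compactly. The class, the content-hash versioning, and the returned messages are hypothetical; the real guards (staleness_guard, file_integrity_guard, file_conflict_guard) may track versions differently.

```python
import hashlib


class FileGuard:
    """Tracks which file version each agent last read (illustrative only)."""

    def __init__(self) -> None:
        self._read_versions = {}   # (agent, path) -> content hash at last read

    @staticmethod
    def _digest(content: str) -> str:
        return hashlib.sha256(content.encode()).hexdigest()

    def record_read(self, agent: str, path: str, content: str) -> None:
        self._read_versions[(agent, path)] = self._digest(content)

    def check_write(self, agent: str, path: str, current_content: str) -> str:
        seen = self._read_versions.get((agent, path))
        if seen is None:
            return "rejected: read the file first"            # L1: read-before-write
        if seen != self._digest(current_content):
            return "rejected: file changed since last read"   # L2: version match
        return "ok"


def ranges_overlap(a, b) -> bool:
    """L4: inclusive (start, end) line ranges conflict if they share any line."""
    return a[0] <= b[1] and b[0] <= a[1]
```

The checks run in the tool layer before any write happens, so they hold regardless of what the model was told or chose to do.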

vs competitors: AWS Codex Agent Team relies on Prompt instructions (“no two active tasks may write the same file”), which LLMs can ignore. Myrm enforces every layer in application code, so the checks cannot be bypassed regardless of model behavior.

vs CaMeL Guard (hermes-agent-camel)

The CaMeL Guard project implements a research-based trust boundary model (trusted controller vs untrusted data). Myrm’s security architecture provides significantly deeper coverage:

Dimension	CaMeL Guard	Myrm
Defense layers	1 (trust boundary)	6 (onion defense-in-depth)
Loop detection	Simple threshold (warn after N, block after M)	5 domain-specific detectors with targeted suggestions (bash/browser/file/web/memory)
Error classification	Single-level FailoverReason enum	3-layer system (Recoverability → FailoverReason → ProbePolicy)
Credential pool	2 strategies (fill_first/round_robin)	4 strategies + exponential backoff + jitter anti-stampede
Path security	Allowlist-based path checking	PTC process-level sandbox isolation
Context management	Single-file compressor	20+ module pipeline (cache healer + anti-thrashing + session notes)
