Skill Evolution

Agents autonomously learn, test, and deploy new skills through a 42-module native evolution system — no external dependencies, no CLI wrappers, no AGPL risks.

How It Works

The evolution engine follows a Retrieve-Before-Generate strategy: before generating a new solution, it searches existing high-confidence fixes. When a stored fix matches, no generation call is needed, which avoids redundant LLM calls.

Discovery — Agent identifies a recurring task pattern or detects a failure signal
Evidence Aggregation — Collects success and failure cases across multiple executions
Variant Generation — Creates 3 candidate variants in parallel (cost-efficient vs competitors’ 50-500 LLM calls)
Scoring — LLM-as-Judge evaluation with four dimensions (Function / Quality / Safety / Compatibility)
Improvement Gate — The original skill competes as a baseline candidate; only variants that genuinely outperform it survive
Approval — GUI-based human approval workflow with diff preview
Deployment — A/B tested with automatic rollback on regression
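The Improvement Gate step can be pictured with a minimal sketch. The `Candidate` type, the `[0, 1]` score scale, and the unweighted mean are assumptions made for illustration; only the four dimension names and the "baseline competes as a candidate" rule come from the steps above.

```python
from dataclasses import dataclass
from typing import Optional

# Dimension names follow the Scoring step; the [0, 1] scale is an assumption.
DIMENSIONS = ("function", "quality", "safety", "compatibility")


@dataclass
class Candidate:
    name: str
    scores: dict  # dimension name -> judge score in [0, 1]


def overall(candidate: Candidate) -> float:
    """Unweighted mean across the four dimensions (the weighting is an assumption)."""
    return sum(candidate.scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def improvement_gate(baseline: Candidate, variants: list) -> Optional[Candidate]:
    """Return the best variant only if it strictly beats the baseline."""
    best = max(variants, key=overall, default=None)
    if best is None or overall(best) <= overall(baseline):
        return None  # the original skill wins, nothing is proposed
    return best


def uniform(name: str, score: float) -> Candidate:
    """Helper for examples: the same score on every dimension."""
    return Candidate(name, {d: score for d in DIMENSIONS})
```

A tie goes to the baseline in this sketch, so a variant that merely matches the original skill is never proposed.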

Directed Skill Learning: `/learn`

In addition to automatic skill extraction, you can teach the agent new skills on demand using the /learn command in any chat:

Input	What Happens
`/learn https://docs.stripe.com/webhooks`	Fetches the page, extracts procedures and config, creates a reusable skill
`/learn ./scripts/deploy.sh`	Reads the file, analyzes the workflow, distills it into a SKILL.md
`/learn the deployment workflow we just did`	Reviews the conversation history and captures the steps as a skill
`/learn` (no arguments)	Automatically reviews the current conversation and distills it into a skill

The /learn command works across all channels: WebUI, Telegram, and any connected IM platform. The generated skill follows strict authoring standards (frontmatter, 8-section structure, quality bar) and is saved through the same security pipeline as auto-captured skills.

vs competitors: Hermes /learn requires arguments (it errors on empty input) and only works in the CLI. Myrm’s /learn handles empty arguments gracefully, works across all channels, and feeds into the full quality pipeline (10-dim scoring, dedup, lifecycle management).

Learning by Demonstration

Show the agent a “before → after” example and it automatically infers the transformation rule:

Upload your original file and the expected result (Excel, CSV, text, code — any format)
Agent analyzes the differences, infers the transformation logic, and writes a script
Validates by executing the script and comparing output against your expected result
Captures the validated pattern as a reusable skill via StructuredExtractor

The entire flow happens within a natural conversation: no wizard, no template editor, no programming required. The StructuredExtractor automatically classifies whether the pattern should become a skill, become a cron_job, or be dropped as too trivial to capture (skip), with built-in safety_analysis to detect destructive commands or credential exposure.

vs competitors: All competitors require manual template authoring or macro recording. Myrm’s approach is fully conversational and leverages LLM understanding: it generalizes from examples rather than replaying exact steps.

Learning from Human Takeover

When the Agent gets stuck during a browser task, it can ask you to take over. Your actions then become learning material — automatically and transparently:

Agent requests takeover — You see a prompt asking for control
You operate the browser — Navigate, click, fill forms as needed
System captures the change — Pre/post page snapshots are recorded (semantic accessibility tree, not pixels)
Toast confirmation — You see “Behavior recorded — Agent will learn from your demonstration”
Evolution consumes the evidence — Next time the FIX or DERIVED pipeline triggers, your demonstration appears alongside execution traces

This happens at zero extra cost (no LLM calls during recording) and without interfering with your workflow. The Agent does not attempt to replay your exact clicks; instead, the evolution LLM receives a high-level “before → after” page state to understand what the correct outcome should look like.

vs competitors: No competitor (OpenClaw, Hermes, CoPaw, DeerFlow, or LobsterAI) captures human takeover actions for skill learning. When their agents get stuck, user interventions are simply discarded.

Automatic Multi-Step Workflow Learning

Every time you complete a task, the agent automatically analyzes the full conversation — including all tool calls, their parameters, ordering, and context — to identify reusable multi-step patterns. If a workflow is generalizable, it’s captured as a new skill on the first occurrence.

Aspect	Myrm	OpenSquilla (competitor)
Detection trigger	First completed session	Requires 3+ identical sessions
Input signal	Full conversation trajectory (context-rich)	Only skill names + frequency count
Output format	Flexible SKILL.md (Agent adapts at runtime)	Fixed DAG (rigid step ordering)
Validation	SandboxValidator dry-run + safety pipeline	Static lint + smoke test
New architecture required	None (reuses CAPTURED pipeline)	Entire DAG engine (6+ modules)

Why this matters: You don’t have to repeat yourself three times before the agent “notices” a pattern. One well-executed workflow is enough — the agent learns immediately and adapts the knowledge to future variations.

Four Evolution Types

Type	Trigger	What It Does
FIX	3 consecutive failures or success rate below 50%	Auto-repair failed skills using trace analysis
DERIVED	User feedback or frustration signals	Optimize skill based on explicit or implicit feedback
CAPTURED	Session anti-patterns detected during idle	Background extraction → 10-dim scoring → dedup → safety scan → Insights Inbox approval
OPTIMIZE_DESCRIPTION	Low match rate	Refine skill description for better semantic matching
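As an illustration of the FIX trigger in the table above, here is a minimal sketch. The function name and the choice to compute the success rate over the whole recorded history are assumptions; the thresholds (3 consecutive failures, 50%) come from the table.

```python
def should_trigger_fix(outcomes: list, streak: int = 3, min_rate: float = 0.5) -> bool:
    """outcomes holds one bool per execution, oldest first (True = success)."""
    if not outcomes:
        return False
    tail = outcomes[-streak:]
    # Condition 1: the most recent `streak` executions all failed.
    consecutive_failures = len(tail) == streak and not any(tail)
    # Condition 2: the overall success rate dropped below the threshold.
    success_rate = sum(outcomes) / len(outcomes)
    return consecutive_failures or success_rate < min_rate
```

Either condition alone is enough, which matches the "or" in the trigger description.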

5-Layer Safety

Every evolved skill passes through five independent safety gates before reaching production:

Layer	Mechanism	What It Prevents
Sandbox	Isolated execution environment	Runtime failures and side effects
AST Signature	Function signature integrity check	Core API breakage
Size Guard	Maximum 120% growth ratio	Code bloat and complexity creep
Anti-Loop	TTL + attempt limits per skill	Infinite evolution cycles
A/B Testing	Side-by-side performance comparison	Silent regressions
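The Size Guard row can be read as a simple ratio check. This sketch assumes that "maximum 120% growth ratio" means the evolved skill may be at most 1.2 times the size of the original, measured in UTF-8 bytes; both the interpretation and the function name are assumptions.

```python
def within_size_guard(original: str, evolved: str, max_ratio: float = 1.2) -> bool:
    """Accept the evolved skill only if it is at most max_ratio times the original size."""
    original_size = len(original.encode("utf-8"))
    evolved_size = len(evolved.encode("utf-8"))
    if original_size == 0:
        # Nothing to compare against, so only an empty result passes.
        return evolved_size == 0
    return evolved_size <= original_size * max_ratio
```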

Frustration Detection

The system detects user dissatisfaction through 5 categories and 38 bilingual patterns (Chinese + English), triggering DERIVED evolution without explicit user feedback:

Verbosity — “just give me the answer”
Style — “be more concise”
Format — “use a table instead”
Workflow — “stop doing X first”
General — frustration expressions

Evidence-Driven Evolution

Unlike competitors that evolve from single failure signals, Myrm aggregates evidence across multiple executions:

Success cases — preserved to prevent regressions
Failure cases — analyzed for root cause patterns
Minimum evidence threshold — requires at least 3 executions and 1 failure before triggering evolution
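A minimal sketch of the minimum evidence threshold follows. The `EvidenceLog` class is hypothetical; the two limits (at least 3 executions and at least 1 failure) are the ones stated above.

```python
from dataclasses import dataclass


@dataclass
class EvidenceLog:
    successes: int = 0
    failures: int = 0

    def record(self, ok: bool) -> None:
        """Count one execution as a success or a failure."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1

    def ready(self, min_executions: int = 3, min_failures: int = 1) -> bool:
        """True once there is enough evidence to justify an evolution run."""
        total = self.successes + self.failures
        return total >= min_executions and self.failures >= min_failures
```

Requiring a failure as well as a minimum count means a skill that has only ever succeeded is never sent for repair.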

Smallest Appropriate Form

When the agent captures a behavior pattern, it doesn’t blindly create a skill. Instead, it classifies the smallest appropriate execution form in the same LLM call — zero extra cost:

Classification	What Happens	Example
skill	Normal skill approval flow	“Deploy with zero-downtime rolling update”
cron_job	Purple card with schedule suggestion	“Check server load every Monday at 9am”
skip	Silently discarded (too trivial to capture)	“Rename a file”

This means fewer noisy proposals in your inbox, and clear guidance when something is better suited as a scheduled task than as an on-demand skill.

vs competitors: No competitor (Hermes, OpenClaw, Codex) offers automatic form classification. All produce only skills, leaving users to manually create cron jobs.

Proactive Skill Recommendation

When the agent encounters a request it cannot fulfill with current tools, it automatically searches the Skill Hub before declining. This “search before saying no” behavior is enforced at two levels:

System Rules — A global behavior rule instructs all models to check for discovery tools before declining
Tool Description — The discovery tool itself uses MUST-level instructions

What this means for you: Your AI assistant never simply says “I can’t do that.” Instead, it searches available skills and suggests ones you can install with one click.

vs competitors: Verified across 7 competitor codebases (Hermes, OpenClaw, CoPaw, LobsterAI, CodePilot, jiuwenclaw, deer-flow); none implement proactive skill recommendation. Hermes’s tool_search (735 lines) is Progressive Tool Disclosure for already-registered tools, not capability discovery.

Quality Monitoring

3-dimensional degradation detection with sliding window statistics:

Success rate monitoring with configurable threshold (default: 70%)
P95 latency tracking with automatic alerting
Server error rate (5xx) monitoring

Skills that degrade are automatically isolated via 1-Strike (critical failure) or 3-Strikes (gradual degradation) policies.
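To make the sliding-window idea concrete, here is a sketch of the success-rate dimension plus the strike rule. The class name, the window size of 20, and the rule that a window must be full before it can report degradation are assumptions; the 70% default threshold and the 1-Strike / 3-Strikes policies come from the text above.

```python
from collections import deque


class SuccessRateWindow:
    def __init__(self, size: int = 20, threshold: float = 0.7) -> None:
        # deque(maxlen=...) drops the oldest outcome automatically.
        self._outcomes = deque(maxlen=size)
        self.threshold = threshold

    def record(self, ok: bool) -> None:
        self._outcomes.append(ok)

    def success_rate(self) -> float:
        if not self._outcomes:
            return 1.0
        return sum(self._outcomes) / len(self._outcomes)

    def degraded(self) -> bool:
        # Only judge a full window so a single early failure does not trip the alarm.
        full = len(self._outcomes) == self._outcomes.maxlen
        return full and self.success_rate() < self.threshold


def should_isolate(critical_failure: bool, strikes: int) -> bool:
    """1-Strike for critical failures, 3-Strikes for gradual degradation."""
    return critical_failure or strikes >= 3
```

P95 latency and 5xx rate could be tracked with the same windowing approach, each with its own threshold.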

Stop-Loss: Versions, Shadow A/B, and Batch Snapshots

Plain language: If an evolved skill makes your agent worse, you can roll back in Settings, with no CLI needed. You can also approve changes in Shadow mode first (production behavior stays on the old version while we compare in the background), then promote when metrics look good. Batch-optimizing many skills saves a snapshot before it starts; you can cancel and roll back mid-run if results go sideways.

Verified (2026-06-06): 34 automated tests cover batch cancel/rollback (including multi-skill, partial-failure HTTP disk restore, and await-timeout skip-rollback), scheduler cancel_batch_optimization + await_batch_optimization wiring, and real disk restore for restore_skill_snapshot (both DB branches). Cancel stops in-flight batch optimization via harness cancellation tokens before rollback restores snapshots; if await times out, rollback is skipped and the user is directed to the detail page. Cancel-with-rollback returns the same rolled_back/failed/total_skills/error_message stats as the batch rollback endpoint; the Web UI shows success, partial (cancelRollbackPartial / rollbackPartial), or failure toasts. Full GUI click-through E2E requires a running backend.

Capability	What you get
Versions panel	One-click rollback to any saved snapshot (GUI)
Evolution history	Full audit trail of all approved/rejected changes with inline Diff viewer
Inline revision	Monaco DiffEditor for editing proposals in place (responsive: side-by-side on desktop, inline on mobile)
Approve + Shadow	Growth inbox → shadow test → Guardian promote/stop
Batch optimization	Auto snapshot on submit; cancel with optional full rollback

vs competitors: Hermes/OpenClaw offer CLI or file-based rollback; none ship a full GUI loop for shadow validation plus batch mid-flight rollback.

Known Pitfalls: Auto-Learning from Failures

Myrm automatically extracts structured “pitfalls” from skill execution failures and feeds them back into future prompts — your agent avoids the same mistake twice.

Capability	What you get
Auto-extraction	Deterministic errors are automatically captured as structured traps (severity + trigger condition + mitigation)
Deduplication	Same error increments `occurrence_count` instead of creating duplicates
Prompt injection	High-severity traps are injected into evolution prompts as “Known Traps (avoid these pitfalls)” constraints
GUI display	`KnownPitfallsSection` renders traps with 4-level color-coded severity, trigger conditions, and mitigation steps
Template library	8 common trap templates (npm timeout, API rate limit, file permissions, memory overflow, encoding, Python version, Docker, Git conflicts)

Plain language: When a skill fails due to a known issue (e.g., npm timeout), Myrm remembers it and explicitly warns the AI on the next evolution attempt. Over time, your skills become more robust automatically, with no manual documentation needed.

vs competitors: Hermes includes a plain-text “Pitfalls” paragraph in skills. Myrm provides structured storage with severity levels, automatic extraction from runtime errors, deduplication with occurrence counting, and prompt engineering injection: a complete closed-loop learning system.
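The deduplication and prompt-injection behavior can be sketched as follows. `PitfallStore`, the signature key, and the four severity names are hypothetical; `occurrence_count` and the "Known Traps (avoid these pitfalls)" header are taken from the table above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = ("low", "medium", "high", "critical")  # assumed level names


@dataclass
class Pitfall:
    signature: str  # stable key for "the same error" (assumed)
    severity: str
    trigger: str
    mitigation: str
    occurrence_count: int = 1


class PitfallStore:
    def __init__(self) -> None:
        self._by_signature: dict = {}

    def record(self, signature: str, severity: str, trigger: str, mitigation: str) -> Pitfall:
        existing = self._by_signature.get(signature)
        if existing is not None:
            # Same error again: bump the counter instead of storing a duplicate.
            existing.occurrence_count += 1
            return existing
        pitfall = Pitfall(signature, severity, trigger, mitigation)
        self._by_signature[signature] = pitfall
        return pitfall

    def prompt_block(self, min_severity: str = "high") -> str:
        """Render traps at or above min_severity as a prompt constraint block."""
        floor = SEVERITY_ORDER.index(min_severity)
        selected = [p for p in self._by_signature.values()
                    if SEVERITY_ORDER.index(p.severity) >= floor]
        if not selected:
            return ""
        lines = ["Known Traps (avoid these pitfalls)"]
        lines += [f"- [{p.severity}] when {p.trigger}: {p.mitigation} (seen {p.occurrence_count}x)"
                  for p in selected]
        return "\n".join(lines)
```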

Decision History: Dual-Loop Learning

Every evolution outcome — success or failure — is permanently recorded and automatically fed into future evolution attempts. The agent accumulates institutional knowledge over time, remembering both what worked and what didn’t.

Source	What is recorded	When
FIX success	Error description + fix reasoning	After successful auto-repair (score ≥ 0.7)
User-driven optimization	User feedback + optimization reasoning	After successful DERIVED evolution (score ≥ 0.7)
Evidence-driven repair	Aggregated failure patterns + fix reasoning	After successful evidence-based evolution (score ≥ 0.7)
Screener rejection	Rejection reasoning with confidence score	When an evolution proposal is rejected (confidence ≥ 0.7)
Manual rejection	Human reviewer’s rejection reasoning	When a user rejects a proposed change in the GUI

How it works: The 5 most recent constraints are automatically injected into the LLM prompt as “Historical Constraints (MUST obey or rejection is guaranteed)”, so the next evolution attempt builds on accumulated wisdom instead of starting from scratch.

Plain language: Your agent gets smarter with each evolution cycle. If a fix worked well, it remembers why. If a proposal was rejected, it won’t try the same approach again. Over months of use, your skills accumulate a rich history of lessons learned, all automatic and with zero maintenance.

vs competitors: SkillHone (Tencent) uses a Git server (Forgejo) with Issues/PRs/Wiki to record decision history, requiring docker compose setup and manual browsing. Myrm stores everything in SQLite with zero external dependencies and automatically injects lessons into future prompts. No other competitor offers dual-loop (success + failure) persistent learning.
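A rough sketch of the constraint injection: take the most recent entries from the decision history and render them under the header quoted above. The function name and the plain-string history format are assumptions.

```python
def build_constraint_block(history: list, limit: int = 5) -> str:
    """history is ordered oldest first; only the newest `limit` entries are injected."""
    recent = history[-limit:]
    if not recent:
        return ""
    lines = ["Historical Constraints (MUST obey or rejection is guaranteed)"]
    lines += [f"- {constraint}" for constraint in recent]
    return "\n".join(lines)
```

Capping the block at a fixed number of entries keeps the prompt overhead bounded no matter how long the history grows.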

Curator: Lifecycle Governance

Automated skill lifecycle management through the Curator system:

Real usage-driven decisions — Skill select / [use skill] writes .stats.json; Curator sweeps use actual call data, not empty sidecars
Zero background token cost — Stale/archive/LRU are deterministic; consolidation (umbrella merge) is off by default until you opt in
Prebuilt immunity — 45+ built-in skills under /prebuilt/ are never auto-archived
Dual one-click sweep — Settings “Run now” + Agent config Radar “Smart Prune” with toast feedback
Never-used protection — Grace period + stale threshold; newly installed skills are not instantly marked idle
Cluster detection — Prefix + embedding finds semantically similar skills
Umbrella merge — GUI preview + confirm; no accidental consolidation
Auto recovery — Stale skills return to active on next use
Hermes migration usage preserved — Imported call counts and last-used timestamps survive; Curator does not false-stale active skills
History visualization — GUI timeline; every sweep is traceable
Single hygiene SSOT — Stale/archive cleanup runs only via Settings Curator and Agent Smart Prune; chat discover does not mount a read-only analyze tool (7/7 peer repos have no equivalent Agent tool)

Conversational Skill Management

Install, search, and manage skills directly from the chat — no settings page needed.

Chat Commands

Simply tell the agent what you need in natural language:

What You Say	What Happens
“Help me find a LinkedIn scraping skill”	Agent searches 7 sources, shows results with security scan summaries
“Install this skill from GitHub”	Agent clones, scans, quarantine-isolates, and installs with one confirmation
“Uninstall the web-scraper skill”	Agent removes the skill and cleans up dependencies
“What tools do I have for data analysis?”	Unified capability discovery (BM25 + semantic hybrid search) across native tools and installed skills

7-Source Skill Marketplace

Skills are discoverable from multiple ecosystem sources — not locked to a single repository:

GitHub — public skill repositories
ClawHub — community skill hub
LobeHub — LobeChat ecosystem skills
skills.sh — curated skill registry
ModelScope — 80K+ Chinese AI ecosystem
Aliyun — AgentExplorer marketplace
Prebuilt — 45+ bundled skills

Security-First Installation

Every skill passes through a security pipeline before activation:

Quarantine isolation — installed in a sandboxed directory
AST analysis — static code analysis for dangerous patterns (113 rules)
LLM semantic audit — AI reviews the skill’s intent and behavior
Pattern scanning — regex-based detection of known malicious patterns
Trust scoring — initial trust level with time-based attenuation

GUI Management (33 Components)

Beyond chat commands, a full visual management system is available in Settings:

Discover & search — filterable skill marketplace with categories
Import — URL import, batch import, file upload
Export & share — ZIP packaging with SHA-256 integrity and PII auto-redaction
Permissions — granular permission approval and usage monitoring
Versions — full version history with one-click rollback
Evolution tracking — pending proposals, rejection audit, quality monitoring
Sync — cross-device synchronization via iCloud/Dropbox/NAS

vs competitors: BrowserAct/OpenClaw require manual git clone and YAML configuration. No competitor offers in-chat skill search, security-scanned installation, or a 33-component visual management system.

Skill Library

45+ prebuilt skills available out of the box, with multi-source marketplace integration (GitHub, ClawHub, LobeHub, skills.sh, ModelScope). Includes specialized skills for Obsidian Canvas, Obsidian Bases, data analysis, deep research, Unreal Engine MCP (scene building, Blueprint authoring, lighting), Blender MCP (mesh creation, materials, animation), and more — each with structured contracts (steps, traps, verification).

Toolset-Aware Skills

Skills automatically adapt to your deployment environment. Each skill can declare which tools or tool groups it requires:

requires_tools / requires_tool_groups — skill only appears when specific tools are available
fallback_for_tools / fallback_for_tool_groups — skill auto-activates as a tutorial/workaround when a native tool is missing

8 standardized tool groups (web, browser, file_ops, shell, computer_use, memory, kanban, wiki) ensure consistent behavior across Local WebUI, Tauri Desktop, and SaaS Cloud deployments.
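The two declarations can be modeled as a small visibility check. This is a sketch under assumptions: skills are plain dicts, a fallback skill is treated as active when at least one of the tools it stands in for is missing, and the tool names in the usage example are illustrative only.

```python
def skill_is_active(skill: dict, available_tools: set) -> bool:
    requires = set(skill.get("requires_tools", ()))
    fallback_for = set(skill.get("fallback_for_tools", ()))
    # A skill with unmet requirements never appears.
    if not requires.issubset(available_tools):
        return False
    # A fallback skill is hidden when every native tool it covers is present.
    if fallback_for and fallback_for.issubset(available_tools):
        return False
    return True
```

For example, with `available_tools = {"web", "shell"}`, a skill declaring `fallback_for_tools: ["browser"]` is active, while one declaring `requires_tools: ["browser"]` is hidden. The `*_tool_groups` variants could be handled the same way after expanding each group into its member tools.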

Zero-Roundtrip Skill Injection

When you explicitly invoke a skill (e.g., “use code-review”), Myrm injects the full SOP directly into context — no extra LLM tool call needed. This saves 2-5 seconds and 500-2000 tokens per invocation compared to the traditional “request → LLM calls select tool → load SOP” flow. The injected payload includes:

Full SOP content with ${SKILL_DIR} template variables resolved
[Skill directory: /path/to/skill] for file access
Auxiliary file listing (scripts, references, templates)
[IMPORTANT: The user has invoked...] strong-signal header for model compliance
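A minimal sketch of how such a payload might be assembled, using Python's standard `string.Template` to resolve `${SKILL_DIR}`. The function name, the ordering of the parts, and the "Auxiliary files:" label are assumptions; the header string is reproduced exactly as it appears above, elision included.

```python
from string import Template


def build_injection(sop: str, skill_dir: str, aux_files: list) -> str:
    # safe_substitute resolves ${SKILL_DIR} and leaves any other "$" text untouched.
    resolved = Template(sop).safe_substitute(SKILL_DIR=skill_dir)
    parts = [
        "[IMPORTANT: The user has invoked...]",  # full header text is not given in the docs
        resolved,
        f"[Skill directory: {skill_dir}]",
    ]
    if aux_files:
        listing = "\n".join(f"- {name}" for name in aux_files)
        parts.append("Auxiliary files:\n" + listing)
    return "\n\n".join(parts)
```

Using `safe_substitute` rather than `substitute` means a stray dollar sign in a SOP (a price, a shell variable) does not raise an error.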

Three-Way Hash Protection

When Myrm upgrades bundled prebuilt skills, user customizations are never silently overwritten:

Scenario	What Happens
User hasn’t modified the skill	Upstream update applied silently
User modified the skill & upstream changed	User version preserved; “Update Available” badge shown in GUI
User accepts the upstream update	One-click apply via Accept Upstream button
User rejects	Skill stays as-is; badge dismissed

Under the hood, origin_hash (SHA-256 of the bundled source at last sync) is compared against the current stored content hash. If they differ, the user has customized the skill and the upgrade is deferred — not forced. This solves a common problem with prebuilt/template systems: users who tweak defaults lose their changes on every update.
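The comparison described above can be sketched as a three-hash decision. The function names and the returned action labels are hypothetical; `origin_hash` and the use of SHA-256 come from the text.

```python
import hashlib


def sha256_hex(content: str) -> str:
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_upgrade(origin_hash: str, current_content: str, upstream_content: str) -> str:
    """Decide what to do with one bundled skill during an upgrade."""
    current_hash = sha256_hex(current_content)
    upstream_hash = sha256_hex(upstream_content)
    if upstream_hash == origin_hash:
        return "nothing_to_do"  # upstream has not changed since the last sync
    if current_hash == origin_hash:
        return "apply_silently"  # user never touched the skill
    return "defer_and_show_badge"  # user customized it, so the update waits for consent
```

The three hashes are the bundled source at last sync, the stored content, and the new upstream version, which is presumably where the "three-way" name comes from.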

Skill Config Hot Reload

When you edit skills, agent bindings, or related settings in the WebUI, changes take effect on the next chat message — no server restart and no manual reload command. What happens under the hood:

The settings API bumps a config version stored as a small file under your data directory (MYRM_DATA_DIR/.skill_config_version).
Before handling your next message, the agent compares its cached version against that file.
If stale, it clears the skill loader cache and re-initializes only what changed.

Why Volume persistence matters: In cloud sandbox or multi-worker deployments (Granian), every process reads the same version file on shared storage. OpenClaw keeps a similar counter in process memory only, so workers can drift. Hermes requires a manual /reload-skills CLI step.

Prompt cache: Re-init happens only when skills actually change, so unrelated conversations keep their cache benefits.
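The version-file mechanism can be sketched in a few lines. The class and method names are hypothetical, and storing the version as a plain integer in the file is an assumption; the file name `.skill_config_version` is the one named above.

```python
import tempfile
from pathlib import Path


class SkillConfigVersion:
    def __init__(self, data_dir: str) -> None:
        self.path = Path(data_dir) / ".skill_config_version"
        self.cached = self.read()

    def read(self) -> int:
        try:
            return int(self.path.read_text().strip())
        except (FileNotFoundError, ValueError):
            return 0  # missing or unreadable file counts as version 0

    def bump(self) -> int:
        # Called by the settings API after a change. This read-then-write is not
        # atomic; a real multi-writer setup would need a lock or an atomic rename.
        new_version = self.read() + 1
        self.path.write_text(str(new_version))
        return new_version

    def refresh_if_stale(self, reload_skills) -> bool:
        """Run reload_skills() only when the file is newer than the cached version."""
        current = self.read()
        if current == self.cached:
            return False
        reload_skills()
        self.cached = current
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        settings_api = SkillConfigVersion(data_dir)
        worker = SkillConfigVersion(data_dir)
        settings_api.bump()
        print(worker.refresh_if_stale(lambda: None))
```

Because every worker compares against the same file, two processes that share the data directory converge on the same version without talking to each other.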

6 Concurrent Self-Evolution Mechanisms

Myrm doesn’t wait for you to notice problems — six independent background mechanisms keep your agent learning 24/7:

Mechanism	What It Does	Trigger
MetricMonitor	Scans for high-error-rate skills	Every 5 seconds
Sliding Window	Extracts experience from tool call traces	Every 15 calls
SkillImmune	Auto-quarantines failing skills, triggers repair	Runtime failure
Session-End	Analyzes completed conversations for learnings	After each session
FrustrationDetector	Detects user dissatisfaction in 38 bilingual patterns	Real-time
EvidenceEvolution	Cross-session evidence aggregation	Every hour

All proposals flow through the approval pipeline before taking effect — your agent learns constantly but never changes without your consent.

Growth Dashboard

Track your agent’s learning progress with a comprehensive 5-tab visualization system:

Overview: full picture at a glance

4 KPI cards — total memories (with weekly delta), skills learned (with evolution count), active days (with streak), memory health score
Cost savings card — cache savings + routing savings + total cost
84-day activity heatmap — GitHub-style visualization of usage patterns
Multi-dimensional health radar — instant view of memory system health
Weekly summary — conversations, messages, cron executions, tool calls with week-over-week delta arrows
Skill evolution timeline — recent proposals with 7 statuses (pending review/auto-applied/failed scan/blocked/approved/rejected/apply failed)

Evolution: AI behavior pattern digest and evolution approval inbox
Graph: 2D force-directed Claim/Evidence semantic relation graph (4 relation types + search/filter/focus/fullscreen/legend)
Trends: skill usage efficiency trends (success rate, avg duration, call frequency)
Daily: per-day agent activity consolidated view

Daily Work Journal

The Daily Journal tab provides a consolidated view of everything your agent did on any given day:

Overview metrics — sessions, tokens, cost, tool calls, approvals, cron runs, kanban events
Source breakdown — sessions grouped by origin channel (Web UI, Telegram, API, etc.)
Unified timeline — all events (sessions, approvals, cron runs, kanban events) sorted chronologically
Date navigation — browse any past day with a date picker
Agent filtering — filter by specific agent when running multiple agents

Zero new storage is required: the journal aggregates data from 6 existing sources (Chat, Message, ApprovalRecord, CronRunModel, KanbanTaskEventModel, EventLog) in real time.

Multi-Agent Skill Scoping & Sharing

In a multi-agent sandbox, skills are perfectly isolated yet securely shareable.

Scope Isolation: Skills natively belong to the agent that learned them. They don’t pollute a global pool like in OpenClaw or Hermes.
Cross-Agent Mounting: Users can mount a skill from Agent A to Agent B with a single click in the GUI, complete with a visual origin badge.
Copy-on-Write (CoW) Forking: If Agent B evolves a skill mounted from Agent A, Myrm automatically forks a localized variant for Agent B. Agent A’s original skill remains untouched and pristine.
1-Click Rollback: Undo an evolution on a CoW fork, and the system intelligently restores the cross-agent mount mapping, retaining complete semantic history.
Robust Garbage Collection: Deleting an agent cascade-deletes its exclusive skills (both database records and physical rmtree wipe with path boundary protection), leaving zero orphan data or “ghost” skills.

Zero-Overhead Global Registry

Unlike competitors that require dedicated database tables (shared_capability + shared_capability_version + agent_capability_binding) and a materialization layer to achieve “maintain shared capabilities once,” Myrm’s architecture makes this free:

Property	How It Works
Single-instance by default	Skills exist once on the filesystem or database; agents reference them by ID
Per-agent configuration	`SkillInstanceConfig` allows different env/config overrides per agent without duplicating the skill
CompositeSkillBackend routing	Multiple skill sources (prebuilt, local, MCP) compose transparently with longest-prefix-match routing
Automatic deduplication	“Last wins” strategy ensures user skills override bundled ones without manual conflict resolution
Dependency auto-check	`SkillRequires` (bins, env, config) + `requires_tools` + `required_credential_files` auto-validate at load time

Competitors solve “shared capability deduplication” with 3 extra database tables and a runtime materialization query. Myrm’s GUI-centric architecture eliminates the problem entirely — skills are naturally singletons referenced by ID.
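Longest-prefix-match routing, as mentioned in the table, can be illustrated with a short sketch. The `PrefixRouter` class and the path prefixes are hypothetical and are not the actual `CompositeSkillBackend` API.

```python
class PrefixRouter:
    def __init__(self, routes: dict) -> None:
        # Sort prefixes longest first so the most specific route is tried first.
        self._routes = sorted(routes.items(), key=lambda item: len(item[0]), reverse=True)

    def resolve(self, path: str) -> str:
        for prefix, backend in self._routes:
            if path.startswith(prefix):
                return backend
        raise KeyError(f"no backend registered for {path!r}")


router = PrefixRouter({
    "/": "local",
    "/prebuilt/": "prebuilt",
    "/prebuilt/mcp/": "mcp",
})
```

With these routes, `router.resolve("/prebuilt/mcp/blender")` picks the `mcp` backend even though the shorter `/prebuilt/` and `/` prefixes also match.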

Per-Session Skill Scope

Beyond agent-level scoping, Myrm supports per-session skill selection — choose exactly which skills are active for each conversation:

Visual Toggle: A sparkle icon (✦) in the message input opens a popover listing all enabled skills. Toggle off anything irrelevant to the current task.
Token Savings: Disabling unused skills removes their prompt fragments, saving ~1800 tokens per turn in typical 9-skill setups.
Sharper AI Focus: Fewer candidate skills means less LLM decision noise — the agent picks the right skill faster.
Persistent Override: Your selection is stored server-side (in session_loaded_skill_names), surviving page reloads, context compaction, and conversation forks.
Auto-Clear on Agent Switch: Changing the bound Agent automatically resets the skill scope to the new Agent’s defaults.

vs competitors: LobsterAI offers active_skill_ids with a similar UI. OpenClaw has config-level skillFilter (no GUI). CoPaw supports runtime filtering (no GUI). Hermes, DeerFlow, and jiuwenclaw have no session-level skill scoping at all.

Cross-Device Skill Sync & Privacy

Myrm is the only AI agent platform offering bidirectional skill synchronization with export-time PII redaction; no competitor implements either capability.

Sync architecture (Protocol-driven, backend-agnostic):

Pull-first strategy: Remote changes apply before local pushes, preventing accidental overwrites
SHA-256 incremental tracking: Only changed skills transfer — no full-scan overhead
Quality gate: Skills must meet minimum thresholds (e.g., ≥3 executions, ≥70% success rate) before being shared
Background auto-sync: Runs during idle time via the IdleTask system — zero disruption to active work
Plug-and-play backends: Use iCloud Drive, Dropbox, or any NAS as the sync medium
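Two of the bullets above, incremental tracking and the quality gate, lend themselves to a short sketch. The function names and the in-memory dict of previously synced hashes are assumptions; the thresholds mirror the example values given in the list.

```python
import hashlib


def sha256_hex(content: str) -> str:
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def skills_to_push(local: dict, synced_hashes: dict) -> list:
    """Return names of skills whose content hash differs from the last synced hash."""
    return sorted(
        name for name, content in local.items()
        if sha256_hex(content) != synced_hashes.get(name)
    )


def passes_quality_gate(executions: int, successes: int,
                        min_executions: int = 3, min_success_rate: float = 0.7) -> bool:
    """A skill is shared only after enough runs with a high enough success rate."""
    if executions < min_executions:
        return False
    return successes / executions >= min_success_rate
```

A skill that was never synced has no stored hash, so it always counts as changed. In a pull-first flow, this comparison would run only after remote changes have been applied locally.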

PII auto-redaction on export — 10 categories detected and masked:

API keys (GitHub, Stripe, AWS, SendGrid, HuggingFace, Slack, Replicate, and more)
Environment variables, JSON secret fields, database connection strings
CLI flags, URL parameters, Telegram bot tokens, HTTP Authorization headers
PEM private keys, absolute file paths (with smart system-path exclusion)
Structured diff preview lets users review and selectively override each redaction

Import security: Every imported skill is scanned by SandboxValidator before activation — malicious code never reaches your workspace.

Compared to Alternatives

Capability	Myrm	OpenClaw	Hermes	Generic LLM agents
Integration	Native (42 modules)	Config-based (4 fields)	External CLI wrapper (AGPL)	None
Evolution strategy	4-level (balanced/innovate/harden/repair-only)	On/off toggle	None	None
Approval mechanism	2-phase screener + GUI panel	pending/auto binary	Auto-execute	None
Per-skill lock	Yes (frontmatter + API)	No (global only)	No	None
Safety layers	5	0	0	0
Quarantine	1-Strike/3-Strikes auto-quarantine	None	None	None
Rejection learning	Permanent constraints	None	None	None
Shadow approval	Test before promote	None	None	None
GUI approval	Full dashboard + diff	None	No	No
Evolution cost	3 variants/run	N/A	50-500 LLM calls/run	N/A
Persistence	SQLite + Qdrant	File system	Pickle files	N/A
Frustration detection	38 bilingual patterns	None	None	None
Quality monitoring	3D degradation	None	None	None
Audit dashboard	Pending + Rejection dashboards	None	None	None

vs Hermes Ecosystem Plugins

Hermes requires 5 separate third-party plugins to approximate what Myrm provides natively:

Plugin	What It Does	Myrm Native Equivalent
curator-evolver	Auto-evolution via HTML comment managed blocks	8-stage evolution pipeline with SkillLineage versioning
SkillClaw	Cross-agent skill sync with 3-stage pipeline	Single-product native evolution + cloud sync
CaMeL Guard	Trust boundary security (trusted/untrusted separation)	6-layer onion defense-in-depth
lineworks	LINE WORKS enterprise communication	35+ channel adapters
agent-docker	Minimal Docker packaging	PTC sandbox + Docker + Tauri multi-layer isolation

Key advantages of Myrm’s native approach:

No dependency fragmentation — a single product vs 5 separate repos with different maintainers, licenses, and update cycles
Deeper integration — evolution system talks directly to security, context management, and GUI layers
GUI-first experience — every feature has visual management vs CLI-only tooling
Production-grade safety — 5-layer evolution safety + 6-layer platform security vs ad-hoc checks

Dual-View Review Experience

Myrm’s Skill Growth dashboard supports two view modes, switchable with a single click:

Mode	What You See	Best For
Simple (default)	Natural language summary + confidence indicator (green/amber/red) + test status + large Approve/Reject buttons	Non-technical users, quick triage of multiple proposals
Detailed	Full ReactDiffViewer code diff + Monaco editor for revisions + trajectory analysis	Developers reviewing code-level changes

In Simple mode, each card includes a “View changes” link that expands the diff on demand — no performance penalty until you actually need it. Your preference is persisted in localStorage across sessions.

Growth Center data plane (2026-07)

Settings → Skills → Pending Evolutions uses a three-layer read path so large diffs never slow the list:

Capability	API	What you see
Summary list	`GET /skill-growth/cases`	Fast list without diff/trajectory bodies
Lazy detail	`GET /skill-growth/cases/{id}`	Full diff only when you expand a card
Stats bar	`GET /skill-growth/stats`	SQL-accurate Total/Pending/Auto/Blocked
Filter badges	Same stats source	Badge counts match the full corpus, not just loaded rows
Scope hint	Cases API `total` field	Shows “latest N of M” when M > loaded page

Verified by 10+ pytest tests plus a Chrome MCP E2E run on the real :3000 UI: seed → API asserts → stat cards → filter → refresh → lazy diff.

vs competitors: No competitor offers a GUI-based skill review dashboard. OpenClaw uses CLI commands (openclaw skills workshop apply <id>). Other competitors either have no skill evolution system or rely entirely on CLI/text-based approval.

Data Flywheel Dashboard

Myrm provides a 6-panel data flywheel visualization system — the most comprehensive skill analytics in the industry:

Panel	What It Shows	Where
Growth Dashboard	5-tab suite (Overview/Evolution/Graph/Trends/Daily) with 9 professional components: activity heatmap, health radar, KPI cards, cost savings, weekly summary, 2D Claim/Evidence knowledge graph, skill usage efficiency trends, pattern digest, daily journal	`/journey` page (former `/growth` auto-redirects)
Global Skill Quality	Average quality score, execution count, optimization rate, quality distribution chart, trend chart, Top/Worst skill rankings, CSV/JSON export	Settings → AI Tools
Pending Evolutions	Proposals awaiting review in Simple/Detailed dual views with approve/reject/revise actions	Skills page
Evolution Rejections	Blocked proposals with rejection reasons for auditing	Skills page
Permission Usage	Per-skill permission call frequency and usage patterns	Skill detail sheet
Skill Quality Section	Real-time quality monitoring configuration and metrics	Settings → AI Tools

The complete flywheel loop runs automatically: Capture traces → Extract skills → Quality gate → Reject bad proposals → Feedback to improve. No manual configuration required.
