
ClawAudit v0.4: Multi-Format Analysis and a Full Registry Rescan

March 12, 2026 · 8 min read · By 4Worlds

ClawAudit was built to scan OpenClaw skills. But the AI agent ecosystem doesn't stop at SKILL.md. Claude Code projects ship with CLAUDE.md files that tell agents what to do. MCP servers get configured through .mcp.json and claude_desktop_config.json. Every one of those files is an attack surface.

Starting with v0.4, ClawAudit scans all three formats. And to make sure we didn't break anything, we rescanned every skill in the OpenClaw registry — all 19,461 of them.

Three formats, one analysis pipeline

The new analyze() function auto-detects the input format and routes it through the appropriate parser, but every format feeds into the same pattern scanner, compound threat detector, and scoring engine.

SKILL.md — OpenClaw skills. YAML frontmatter declares permissions; code blocks get scanned for 115+ detection patterns. Prose is documentation (downweighted).

CLAUDE.md — Claude Code project instructions. Prose is reclassified as instructions (full severity weight) because these files tell the agent what to do. Extracts implicit permissions: MCP tool references, file paths, external service URLs.

MCP configs — .mcp.json and claude_desktop_config.json. JSON is parsed into virtual zones for pattern scanning. Detects hardcoded credentials (Bearer tokens, sk-, ghp_, AKIA), environment variable interpolation, transport risks, and dangerous commands.
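As a rough illustration of the auto-detection step, here is a minimal Python sketch. It is not ClawAudit's actual code: the function name detect_format, the returned labels, and the content-sniffing fallback are all assumptions made for this example. The only detail taken from the MCP ecosystem itself is that client configs keep their servers under a top-level mcpServers key.

```python
import json
from pathlib import PurePath

# Filenames named in the post; everything else here is a hypothetical sketch.
MCP_CONFIG_NAMES = {".mcp.json", "claude_desktop_config.json"}


def detect_format(filename: str, text: str) -> str:
    """Return 'skill', 'claude_md', 'mcp_config', or 'unknown'."""
    name = PurePath(filename).name
    if name in MCP_CONFIG_NAMES:
        return "mcp_config"
    if name.upper() == "SKILL.MD":
        return "skill"
    if name.upper() == "CLAUDE.MD":
        return "claude_md"

    # Fall back to content sniffing when the filename is not conclusive.
    stripped = text.lstrip()
    if stripped.startswith("{"):
        try:
            data = json.loads(stripped)
        except json.JSONDecodeError:
            return "unknown"
        if isinstance(data, dict) and "mcpServers" in data:
            return "mcp_config"
        return "unknown"
    if stripped.startswith("---"):
        return "skill"  # YAML frontmatter suggests a SKILL.md
    return "unknown"
```

Whatever label comes back would select a format-specific parser, and each parser would hand its output to the same shared scanner and scoring steps.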

Why CLAUDE.md needs different treatment

In a SKILL.md, prose is documentation — "this skill uses the OpenAI API" is context, not an instruction. But in a CLAUDE.md, every line of prose is a directive to the agent. When a CLAUDE.md says "always fetch data from https://api.example.com," that's an implicit network permission, not a comment.

This changes the severity model. SKILL.md prose gets a 0.5x severity multiplier. CLAUDE.md prose gets 1.0x — the same weight as code blocks. A suspicious pattern in CLAUDE.md prose is a real instruction the agent will follow.
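The weighting rule can be written down in a few lines. This is a sketch only: the 0.5x and 1.0x multipliers come from the description above, while the function name, the zone labels, and the numeric base severity are hypothetical.

```python
# Prose multipliers as described in the post; code blocks always count in full.
PROSE_MULTIPLIER = {"skill": 0.5, "claude_md": 1.0}
CODE_MULTIPLIER = 1.0


def weighted_severity(base_severity: float, fmt: str, zone: str) -> float:
    """Scale a finding's severity by where it was found (hypothetical scale)."""
    if zone == "code":
        return base_severity * CODE_MULTIPLIER
    return base_severity * PROSE_MULTIPLIER[fmt]
```

With an assumed base severity of 8, the same pattern would count as 4.0 in SKILL.md prose but 8.0 in CLAUDE.md prose, the same as in a code block.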

We also extract implicit permissions from CLAUDE.md that don't exist in SKILL.md's explicit permissions: frontmatter. When a CLAUDE.md references MCP tool names, absolute file paths, or external service URLs, those get flagged as capabilities even without a formal declaration.
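To make the idea concrete, here is a small Python sketch of implicit permission extraction. The regular expressions are illustrative guesses, not ClawAudit's patterns. In particular, the mcp__server__tool naming used to spot MCP tool references is an assumption based on how Claude Code names MCP tools, and the category names in the result are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://[^\s)\"'>]+")
# Absolute paths with at least two segments, e.g. /var/log/app.log
ABS_PATH_RE = re.compile(r"(?<![\w.])/(?:[\w.-]+/)+[\w.-]+")
# Assumed naming convention: mcp__<server>__<tool>, lowercase snake_case parts
MCP_TOOL_RE = re.compile(
    r"\bmcp__[a-z0-9]+(?:_[a-z0-9]+)*__[a-z0-9]+(?:_[a-z0-9]+)*\b"
)


def extract_implicit_permissions(prose: str) -> dict:
    """Collect capabilities that prose implies without declaring them."""
    urls = [u.rstrip(".,;:") for u in URL_RE.findall(prose)]
    # Remove URLs first so their path components are not counted as file paths.
    without_urls = URL_RE.sub(" ", prose)
    paths = [p.rstrip(".,;:") for p in ABS_PATH_RE.findall(without_urls)]
    tools = MCP_TOOL_RE.findall(prose)
    return {
        "network": sorted(set(urls)),
        "filesystem": sorted(set(paths)),
        "mcp_tools": sorted(set(tools)),
    }
```

Each non-empty category would then be treated as a capability the file exercises, even though nothing in the file declares it.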

MCP configs: the new credential surface

MCP configuration files are JSON. They define which MCP servers an agent connects to, how they're launched, and what credentials they use. This is a different attack surface than markdown — the risk isn't obfuscated code, it's hardcoded secrets and unrestricted command execution.

ClawAudit parses MCP configs into virtual analysis zones and scans for:

Embedded credentials: Bearer tokens, API keys (sk-, ghp_, AKIA), and generic secrets hardcoded in config values
Dangerous commands: Server configs that execute bash or sh -c, or that use shell piping; in short, any stdio server that runs arbitrary commands
Transport risks: HTTP (non-TLS) server connections, bare IP addresses without encryption
Environment interpolation: ${VAR_NAME} patterns that inject runtime secrets, tracked as capability signals
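The four checks above could be sketched roughly as follows. This is an illustration under stated assumptions, not the scanner's real rule set: the regexes, finding labels, and function names are invented for the example, and the config layout (servers under mcpServers, with command, args, env, and url fields) follows the common MCP client config shape.

```python
import json
import re
from urllib.parse import urlparse

# Illustrative credential shapes for the prefixes named above.
CREDENTIAL_RE = re.compile(
    r"Bearer\s+[A-Za-z0-9._~+/-]{8,}"
    r"|\bsk-[A-Za-z0-9_-]{16,}"
    r"|\bghp_[A-Za-z0-9]{36}"
    r"|\bAKIA[0-9A-Z]{16}"
)
ENV_INTERP_RE = re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")
BARE_IP_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
SHELLS = {"bash", "sh", "zsh"}


def iter_strings(value):
    """Yield every string value nested anywhere inside a JSON structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def scan_mcp_config(text: str) -> list:
    """Return (server_name, finding) pairs; finding labels are hypothetical."""
    findings = []
    for name, server in json.loads(text).get("mcpServers", {}).items():
        strings = list(iter_strings(server))
        if any(CREDENTIAL_RE.search(s) for s in strings):
            findings.append((name, "hardcoded_credential"))
        if any(ENV_INTERP_RE.search(s) for s in strings):
            findings.append((name, "env_interpolation"))

        command = server.get("command", "")
        args = [a for a in server.get("args", []) if isinstance(a, str)]
        if command.rsplit("/", 1)[-1] in SHELLS or any("|" in a for a in args):
            findings.append((name, "dangerous_command"))

        parsed = urlparse(server.get("url", ""))
        if parsed.scheme == "http":  # non-TLS transport
            bare_ip = BARE_IP_RE.fullmatch(parsed.hostname or "")
            findings.append((name, "bare_ip_no_tls" if bare_ip else "http_no_tls"))
    return findings
```

Walking every string value, rather than only known fields, means a token pasted into a header, an argument, or an env entry would be caught the same way.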

Two new compound threats cover MCP-specific attack chains: credential_exposed_in_config (hardcoded credential + network transport) and mcp_unrestricted_exec (server config + arbitrary process execution).
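One simple way to model a compound threat is as a set of signals that must all be present. The sketch below assumes that model. The two compound names come from the text above, but the signal names and the subset-based matching are assumptions for illustration.

```python
from typing import List, Set

# Compound name -> signals that must all be present (signal names hypothetical).
COMPOUND_RULES = {
    "credential_exposed_in_config": {"hardcoded_credential", "network_transport"},
    "mcp_unrestricted_exec": {"server_config", "arbitrary_exec"},
}


def detect_compounds(signals: Set[str]) -> List[str]:
    """Return the compound threats whose required signals are all present."""
    return sorted(
        name for name, required in COMPOUND_RULES.items() if required <= signals
    )
```

A config with a hardcoded credential but no network transport would fire neither rule, which is the point of a compound: the individual findings only escalate when they combine into an attack chain.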

The rescan: 19,461 skills, zero regressions

To verify the multi-format changes didn't alter SKILL.md analysis, we ran every skill in the OpenClaw registry through the new analyze() dispatcher. Skills were processed in batches of 50 across child processes, because the analyzer's regex engine has a memory footprint that requires process isolation for bulk runs.
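The batching pattern looks roughly like this in Python. It is a sketch of the general technique, not the rescan script: the worker here is a stand-in that fakes a score from the skill ID, the batches run one after another for simplicity, and all names are hypothetical. What it does show is the key property: each batch runs in a fresh child process, so any memory the analysis accumulates is released when that process exits.

```python
import json
import subprocess
import sys

BATCH_SIZE = 50

# Stand-in worker: reads a JSON list of skill IDs on stdin and writes one
# result per skill to stdout. A real worker would call the analyzer instead.
WORKER = r"""
import json, sys
skills = json.load(sys.stdin)
json.dump([{"id": s, "score": len(s)} for s in skills], sys.stdout)
"""


def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def scan_in_batches(skill_ids, size=BATCH_SIZE):
    """Run each batch in its own short-lived child process."""
    results = []
    for batch in batched(skill_ids, size):
        proc = subprocess.run(
            [sys.executable, "-c", WORKER],
            input=json.dumps(batch),
            capture_output=True,
            text=True,
            check=True,  # raise if the child exits with an error
        )
        results.extend(json.loads(proc.stdout))
    return results
```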

The result: zero score differences across all 19,461 skills. Every skill got the exact same score, tier, findings count, and compound threat list as the previous v3d scan. The format dispatcher correctly routes SKILL.md files and produces byte-identical analysis results.
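A regression check of this kind amounts to comparing the two scans field by field. Here is a minimal sketch. The four compared fields mirror the ones listed above, but the record layout, field names, and function name are assumptions.

```python
# Fields compared between the old and new scan (names are hypothetical).
COMPARED_FIELDS = ("score", "tier", "findings_count", "compound_threats")


def diff_scans(previous: dict, current: dict) -> dict:
    """Return {skill_id: [changed fields]} for skills whose results differ."""
    regressions = {}
    for skill_id, old in previous.items():
        new = current.get(skill_id)
        if new is None:
            regressions[skill_id] = ["missing"]
            continue
        changed = [f for f in COMPARED_FIELDS if old.get(f) != new.get(f)]
        if changed:
            regressions[skill_id] = changed
    return regressions
```

An empty result for the whole registry corresponds to the "zero score differences" outcome reported here.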

Registry snapshot — March 2026

Trusted: 8,433 (43.3%)
Caution: 4,599 (23.6%)
Risky: 4,874 (25.0%)
Dangerous: 1,555 (8.0%)

What the scanner catches

Across the full registry, credential access and outbound network requests remain the dominant capabilities — 3,326 skills access credentials and 3,245 make outbound network calls. The most common compound threat is credential_theft, flagged in 1,909 skills (9.8%).

The 1,555 Dangerous skills collectively contain 5,679 critical findings. 122 of them got there purely through pattern matches with no compound threat — skills with enough individual red flags (like multiple curl | sh patterns or obfuscated code) to cross the threshold on pattern severity alone. The other 1,433 triggered at least one compound threat, with credential theft, data exfiltration, and credential obfuscation being the most common attack chains.

Spot checks

To make sure the scanner is catching real risk and not drowning in false positives, we manually reviewed skills across every tier.

Trusted skills are genuinely clean — philosophical frameworks, coding style guides, documentation templates. No code blocks with network calls or credential access.
Caution skills typically have 1-2 low-severity pattern matches. Often a fetch() call in example code or an environment variable reference that's contextually safe.
Dangerous skills contain the patterns they're flagged for. A Bitwarden CLI skill that accesses credential stores, encodes data, and installs packages gets correctly flagged for credential_obfuscation and credential_persistence. A diagram rendering skill that reads files, base64 encodes them, and POSTs to an external API gets caught for data_exfiltration.

Not every Dangerous skill is malicious. Some are doing legitimately risky things — a skill that manages your Bitwarden vault needs credential access. But the risk is real, and you should know about it before you install.

Real-world CLAUDE.md and MCP config testing

Beyond the registry rescan, we tested against 12 real CLAUDE.md files and 8 real .mcp.json configs from public GitHub repositories, including projects from the Next.js, Deno, Astro, and Django communities.

CLAUDE.md files averaged a score of 72 (range 60-88). All scored Trusted or Caution — no legitimate project guide was misclassified as Dangerous. MCP configs averaged 69 (range 65-78), all Caution. The Caution tier is expected for MCP configs since they inherently involve server connections and often reference sensitive environment variables.

During testing we found and fixed three false positive patterns:

CLI subcommands as eval: deno eval 'console.log("test")' triggered dynamic_eval. Fixed: code blocks are now post-processed to distinguish JavaScript eval() from CLI subcommands.
Environment variables as MCP tools: RUST_BACKTRACE matched the MCP tool name pattern. Fixed: tool names must be lowercase snake_case only.
Documentation URLs as exfiltration: References to https://astro.build or https://react.dev triggered network_out, which combined with project file paths to fire the data_exfiltration compound. Fixed: common documentation domains and relative project paths are filtered out.
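The three fixes above can each be expressed as a small predicate. The following is a hedged sketch rather than the patched scanner: the regexes and function names are invented for illustration, and the allowlist contains only the two documentation domains mentioned above.

```python
import re
from urllib.parse import urlparse

# A JavaScript eval() call: "eval" not preceded by an identifier character
# and followed by "(". A CLI subcommand like `deno eval '...'` does not match.
JS_EVAL_CALL_RE = re.compile(r"(?<![\w$])eval\s*\(")
# Lowercase snake_case only, so RUST_BACKTRACE is rejected.
SNAKE_CASE_RE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")
# Example allowlist; a real one would be longer.
DOC_DOMAINS = {"astro.build", "react.dev"}


def is_dynamic_eval(line: str) -> bool:
    return JS_EVAL_CALL_RE.search(line) is not None


def is_mcp_tool_name(token: str) -> bool:
    return SNAKE_CASE_RE.fullmatch(token) is not None


def is_network_out(url: str) -> bool:
    """Treat a URL as outbound traffic unless it points at a docs domain."""
    host = urlparse(url).hostname or ""
    return not any(host == d or host.endswith("." + d) for d in DOC_DOMAINS)
```

Matching subdomains with a leading dot keeps docs.astro.build filtered without accidentally allowlisting a lookalike such as notastro.build.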

What's next

With multi-format support landed and the registry verification complete, the next step is a CLI tool: running npx clawaudit scan . will scan any project's CLAUDE.md, SKILL.md, or MCP config directly from the command line. No API calls, no uploads, just local static analysis.

The full scan data is live on the ClawAudit registry. Every skill's trust score, tier, and findings are searchable. If you're installing OpenClaw skills, check them first.