ClawAudit verdict
evaluate-skill
The skill can execute arbitrary code, posing a risk if not properly validated.
Flagged for review: coarse, uncorroborated signal, not a confirmed exploit. Review the config yourself before installing.
Automated static analysis, not a human review. ClawAudit flags capabilities, not confirmed intent, and can produce false positives. Disagree with this verdict? Use Dispute below.
Findings (3)
Uses eval(): can execute arbitrary code
REFERENCE.md · prose · downgraded · eval (
Recursive delete from root or home: destructive command
references/evals/commit-simple/commit-simple.eval.yaml · prose · downgraded · rm -rf /
subprocess execution: runs system commands from Python
evaluate-skill.eval.yaml · prose · downgraded · subprocess.run(
Why the tier is capped
Execution sink present in raw bytes (Hard Floor: class B/D/F). Final tier capped at Caution; this cap cannot be lifted by any downgrade, example-payload opt-in, or allowlist.
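To make the capping rule concrete, here is a minimal Python sketch of how a floor of this kind could behave. It is an illustration under stated assumptions, not ClawAudit's actual implementation: the pattern list, the tier names other than Caution, and the functions `has_execution_sink` and `final_tier` are all hypothetical, and the patterns are simply modeled on the three findings listed above.

```python
import re

# Hypothetical sink patterns, modeled on the three findings in this report.
# They are matched against raw bytes, so prose and example context are ignored.
SINK_PATTERNS = [
    re.compile(rb"\beval\s*\("),
    re.compile(rb"\brm\s+-rf\s+/"),
    re.compile(rb"\bsubprocess\.run\s*\("),
]

# Hypothetical tier scale, ordered from least to most severe.
# Only "Caution" is named in the report; the others are placeholders.
TIERS = ["Clear", "Caution", "Danger"]


def has_execution_sink(raw: bytes) -> bool:
    """Return True if any sink pattern occurs anywhere in the raw bytes."""
    return any(pattern.search(raw) for pattern in SINK_PATTERNS)


def final_tier(proposed: str, raw: bytes) -> str:
    """Apply the floor: if a sink is present, never report less than Caution."""
    floor = TIERS.index("Caution")
    if has_execution_sink(raw) and TIERS.index(proposed) < floor:
        return "Caution"
    return proposed
```

In this sketch a finding downgraded because it sits in prose would still leave the file at Caution, since the floor looks only at the bytes and not at the downgrade: `final_tier("Clear", b"run eval (expr) here")` returns `"Caution"`, while a file with no matching bytes keeps its proposed tier.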
Permissions & capabilities
No declared permissions: minimal attack surface.