ClawAudit verdict

agentbench

Score: 88 · 🟢 Trusted
Low risk — reviewed by ClawAudit, behavior matches stated purpose

The skill provides a legitimate way to benchmark OpenClaw agents. It includes clear usage instructions and highlights security considerations.

⚠ Flagged for review — coarse, uncorroborated signal, not a confirmed exploit. Review the config yourself before installing.

Automated static analysis — not a human review. ClawAudit flags capabilities, not confirmed intent, and can produce false positives. Disagree with this verdict? Use Dispute below.

security 65 · transparency 90 · maintenance 70

Findings (4)

Pattern match high

Possible hardcoded credential

tasks/multi-step/release-preparation/setup.sh · prose · downgraded · token: 'mock-token

Pattern match medium

Instructs covert action — may act without user awareness

tasks/error-handling/cascading-failures/setup.sh · prose · downgraded · Silently

Pattern match low

Popular HTTP library — network access

tasks/multi-step/meeting-to-tasks/inputs/meeting-notes.txt · prose · downgraded · got

Pattern match low

Python os.getenv — reads environment variable

tasks/tool-efficiency/large-codebase-navigation/setup.sh · prose · downgraded · os.getenv(
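
All four findings above are pattern matches, which is why the report warns that it can produce false positives. As a rough illustration only, here is a minimal sketch of how a regex-based scanner of this general kind could surface such hits. The rule list, the regular expressions, and the `scan` function are hypothetical and are not ClawAudit's actual rules or API.

```python
import re

# Hypothetical rules for illustration only; not ClawAudit's actual patterns.
RULES = [
    ("Possible hardcoded credential", "high",
     re.compile(r"token\s*[:=]\s*['\"][^'\"\s]+")),
    ("Instructs covert action", "medium",
     re.compile(r"\bsilently\b", re.IGNORECASE)),
    ("Popular HTTP library", "low",
     re.compile(r"\bgot\b")),
    ("Reads environment variable", "low",
     re.compile(r"os\.getenv\(")),
]


def scan(text):
    """Return (title, severity, matched_text) for every rule that fires on text."""
    findings = []
    for title, severity, pattern in RULES:
        match = pattern.search(text)
        if match:
            findings.append((title, severity, match.group(0)))
    return findings


# Ordinary meeting-notes prose trips the HTTP-library rule on the word "got".
print(scan("We got feedback from the design team on Tuesday."))

# A placeholder value in a test fixture trips the credential rule.
print(scan("token: 'mock-token'"))
```

A scanner like this matches text, not behavior: it cannot tell a mock token in a test fixture from a real secret, or the verb "got" from the HTTP library of the same name. That is consistent with the report's advice to review the flagged files yourself.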

Permissions & capabilities

Requires 3 system binaries.

