ClawAudit verdict
agentbench
The skill provides a legitimate method for benchmarking OpenClaw agents. It includes clear instructions for usage and emphasizes security considerations.
⚠ Flagged for review — coarse, uncorroborated signal, not a confirmed exploit. Review the config yourself before installing.
Automated static analysis, not a human review. ClawAudit flags capabilities, not confirmed intent, and can produce false positives.
Findings (4)
Possible hardcoded credential
tasks/multi-step/release-preparation/setup.sh · prose · downgraded · token: 'mock-token
Instructs covert action — may act without user awareness
tasks/error-handling/cascading-failures/setup.sh · prose · downgraded · Silently
Popular HTTP library — network access
tasks/multi-step/meeting-to-tasks/inputs/meeting-notes.txt · prose · downgraded · got
Python os.getenv — reads environment variable
tasks/tool-efficiency/large-codebase-navigation/setup.sh · prose · downgraded · os.getenv(
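All four findings above are marked as prose matches and downgraded, which is consistent with simple pattern matching over file text. The sketch below shows how a naive scanner of that kind could flag ordinary words such as "got" or "Silently". It is an illustration only: the rule list, the scan function, and the sample lines are hypothetical and are not ClawAudit's actual rules or the contents of the audited files.

```python
import re

# Hypothetical rules for illustration only. ClawAudit's real rule set
# is not shown in this report.
RULES = [
    ("Possible hardcoded credential", re.compile(r"token\s*[:=]\s*['\"]")),
    ("Instructs covert action", re.compile(r"\bsilently\b", re.IGNORECASE)),
    ("Popular HTTP library", re.compile(r"\bgot\b")),
    ("Python os.getenv", re.compile(r"os\.getenv\(")),
]


def scan(text):
    """Return (line_number, label, matched_text) for every rule hit in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RULES:
            match = pattern.search(line)
            if match:
                hits.append((lineno, label, match.group(0)))
    return hits


# Made-up sample lines, not taken from the audited files.
SAMPLE = "\n".join([
    "echo \"token: 'mock-token-123'\" > config.yml",
    "We got the numbers from finance on Tuesday.",
    "# Silently skip files that are already present",
])

if __name__ == "__main__":
    for hit in scan(SAMPLE):
        print(hit)
```

A scanner like this cannot tell a placeholder token from a real secret, or the verb "got" from the HTTP library of the same name, which is why each finding still needs a manual look at the referenced file.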
Permissions & capabilities
Requires 3 system binaries.