ClawAudit verdict

self-safety-guard

sx-self-safety-guard

🟢 Trusted

Low risk — reviewed by ClawAudit, behavior matches stated purpose

Implements AI self-safety guard; monitors and protects against malicious requests.

⚠ Flagged for review — coarse, uncorroborated signal, not a confirmed exploit. Review the config yourself before installing.

Automated static analysis — not a human review. ClawAudit flags capabilities, not confirmed intent, and can produce false positives. Disagree with this verdict? Use Dispute below.

security

transparency

maintenance

Findings (4)

Pattern match high

Prompt injection — tries to override agent instructions

SKILL.md · prose · downgraded · Ignore previous instructions

Pattern match high

Raw model control tokens — prompt injection via token manipulation

references/prompt-injection-patterns.md · prose · downgraded · [INST]

Pattern match medium

Possible prompt injection — attempts to redefine agent identity

SKILL.md · prose · downgraded · You are now

Pattern match medium

Fake system prompt — attempts to inject instructions

references/prompt-injection-patterns.md · prose · downgraded · System: You are

Permissions & capabilities

No declared permissions — minimal attack surface.

Is this flag fair?

Check another skill Browse the registry Auditing your own skills or configs? Use the API