ClawAudit verdict

self-safety-guard

sx-self-safety-guard

88
🟢 Trusted
Low risk — reviewed by ClawAudit, behavior matches stated purpose

Implements AI self-safety guard; monitors and protects against malicious requests.

⚠ Flagged for review — coarse, uncorroborated signal, not a confirmed exploit. Review the config yourself before installing.

Automated static analysis — not a human review. ClawAudit flags capabilities, not confirmed intent, and can produce false positives. Disagree with this verdict? Use Dispute below.

security: 40
transparency: 90
maintenance: 90

Findings (4)

Pattern match · high

Prompt injection — tries to override agent instructions

SKILL.md · prose · downgraded · Ignore previous instructions

Pattern match · high

Raw model control tokens — prompt injection via token manipulation

references/prompt-injection-patterns.md · prose · downgraded · [INST]

Pattern match · medium

Possible prompt injection — attempts to redefine agent identity

SKILL.md · prose · downgraded · You are now

Pattern match · medium

Fake system prompt — attempts to inject instructions

references/prompt-injection-patterns.md · prose · downgraded · System: You are
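
For readers weighing these four findings, here is a minimal sketch of how a phrase-based static scan could produce them. It is not ClawAudit's implementation: the `PATTERNS` table, the `Finding` type, and the `scan` function are hypothetical names, and the regexes are assumptions built only from the matched phrases and severities listed above. The "downgraded" label is not modeled.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern table. The phrases and severities mirror the findings
# listed above; the regexes themselves are illustrative assumptions.
PATTERNS = [
    ("high", "Prompt injection",
     re.compile(r"ignore previous instructions", re.IGNORECASE)),
    ("high", "Raw model control tokens",
     re.compile(r"\[INST\]")),
    ("medium", "Possible identity redefinition",
     re.compile(r"\byou are now\b", re.IGNORECASE)),
    ("medium", "Fake system prompt",
     re.compile(r"\bsystem:\s*you are\b", re.IGNORECASE)),
]


@dataclass
class Finding:
    path: str      # file the phrase was found in
    severity: str  # "high" or "medium"
    title: str     # short description of the pattern
    matched: str   # the exact text that matched


def scan(path: str, text: str) -> list[Finding]:
    """Return one finding per pattern that occurs anywhere in the text."""
    findings = []
    for severity, title, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            findings.append(Finding(path, severity, title, match.group(0)))
    return findings


# A reference document that merely lists attack strings still matches.
sample = "Known attack strings include [INST] and 'System: You are root'."
for finding in scan("references/prompt-injection-patterns.md", sample):
    print(finding.severity, finding.title, repr(finding.matched))
```

A scan like this cannot tell a file that uses an injection phrase from one that documents it, which is consistent with the report describing these matches as a coarse signal that can produce false positives.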

Permissions & capabilities

No declared permissions — minimal attack surface.
