AI agent stacks (Claude Code, LangChain, CrewAI, anything built on MCP) all follow the same pattern: the agent outputs a structured tool call, and client code executes it. That gap between proposed and executed is a natural interception point, and almost nobody is building the control layer that sits in it.
Content guardrails (NeMo, LlamaGuard) control what models say, not what agents do. Agent sandboxes scope directories but don't back anything up. Checkpoint tools provide rollback, but the agent can delete the checkpoints. OPA evaluates policy in microseconds, but nobody has bridged it to AI agent frameworks yet.
Agent Gate sits in that gap. It classifies tool calls against pre-computed policy, enforces directory boundaries, and vault-backs every destructive target to an agent-unreachable location before the action proceeds. If the backup fails, the action is blocked.
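To make that flow concrete, here is a minimal sketch of the check order described above: boundary check, classification, then backup before action. It is not the project's actual code. The `gate` function, the `DESTRUCTIVE_TOOLS` set, and the tool names are hypothetical stand-ins for the real policy, and it only handles regular files.

```python
import shutil
import time
from pathlib import Path

# Hypothetical stand-in for a pre-computed policy; not Agent Gate's real schema.
DESTRUCTIVE_TOOLS = {"delete_file", "write_file"}


def is_inside(path: Path, root: Path) -> bool:
    """True if the resolved path is the root itself or lies below it."""
    try:
        path.resolve().relative_to(root.resolve())
        return True
    except ValueError:
        return False


def gate(tool: str, target: str, allowed_root: Path, vault: Path) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed tool call."""
    path = Path(target)

    # 1. Directory boundary: anything outside the allowed root is denied.
    if not is_inside(path, allowed_root):
        return False, "outside allowed directory"

    # 2. Classification: non-destructive calls pass straight through.
    if tool not in DESTRUCTIVE_TOOLS:
        return True, "not destructive"

    # 3. A destructive call on a path that does not exist has nothing to lose.
    if not path.exists():
        return True, "nothing to back up"

    # 4. Back up first. Any failure (including a directory target, which
    #    copy2 cannot handle) blocks the action instead of proceeding.
    try:
        vault.mkdir(parents=True, exist_ok=True)
        snapshot = vault / f"{time.time_ns()}_{path.name}"
        shutil.copy2(path, snapshot)
    except OSError as exc:
        return False, f"backup failed: {exc}"
    return True, f"backed up to {snapshot}"
```

The important part is the ordering: the copy happens before the caller is told "allowed", and an exception during the copy turns into a denial. In a real deployment the vault would live somewhere the agent's own boundary check rejects, so the agent cannot target it.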
Live tested with Claude Code in fully autonomous mode via PreToolUse hooks. 18/18 tests passing. The vault creates per-operation timestamped snapshots, so multiple overwrites of the same file produce separate recovery points.
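For anyone who has not written a PreToolUse hook, the rough shape looks like the sketch below. It assumes the hook contract as I understand it from the Claude Code hooks documentation: the hook receives a JSON payload on stdin with `tool_name`, `tool_input`, and `cwd` fields, and exiting with code 2 blocks the call and passes stderr back to the model. The `decide` function and its single boundary rule are hypothetical and far simpler than what Agent Gate does.

```python
import json
import sys
from pathlib import Path

ALLOW = 0
BLOCK = 2  # assumed: exit code 2 from a PreToolUse hook blocks the tool call


def decide(payload: dict, allowed_root: Path) -> tuple[int, str]:
    """Map a hook payload to (exit_code, reason). Illustrative rule only."""
    tool = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})

    # Only file-writing tools are checked in this sketch.
    if tool in {"Write", "Edit"}:
        target = Path(tool_input.get("file_path", ""))
        try:
            target.resolve().relative_to(allowed_root.resolve())
        except ValueError:
            return BLOCK, f"{target} is outside {allowed_root}"
    return ALLOW, ""


def main() -> int:
    """Read the payload from stdin, print any denial reason to stderr."""
    payload = json.load(sys.stdin)
    code, reason = decide(payload, Path(payload.get("cwd", ".")))
    if reason:
        print(reason, file=sys.stderr)
    return code
```

A hook script built on this would end with `sys.exit(main())`. Keeping the decision in a pure function makes it easy to unit test without spawning the agent.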
Background: I spent years in nuclear command and control where Permissive Action Links verified authorization, not judgment, before any action could proceed. Same architectural principle applied here.
Honest about the limitations: the bash parser is naive, shell expansion isn't evaluated, and this is a safety net for well-intentioned agents, not a security boundary against adversarial escape. More detail in the README.
Python, YAML policy definitions, Apache 2.0. Roadmap includes MCP proxy integration and OPA/Rego support.
Happy to answer questions about the architecture.