ClawForge Team · 6 min read

Introducing ClawForge

Why we are building an open, vendor-neutral control plane for AI agents at work — and what ships in the first cut.

launch · governance

AI assistants are already inside real engineering and operations workflows. Claude Code is writing code. OpenClaw is reaching production environments. MCP servers are exposing tool surfaces that touch source control, ticketing, cloud APIs, and customer data. The deployments are real, the value is real — and the operator model is mostly missing.

The pattern we keep seeing is the same. A team rolls an assistant out to a handful of trusted laptops. It works. Adoption grows. Someone in security asks how policies are enforced once the assistant is on a hundred laptops. Someone in operations asks where the audit trail lives. Someone in engineering asks what happens during an incident if the tool starts doing the wrong thing. The answers tend to be improvised, machine by machine, and rarely satisfying.

ClawForge is the operations layer those teams have been writing on the side — made open and vendor-neutral so it does not have to be rebuilt for every new runtime that arrives.

What ClawForge is

ClawForge is a control plane and operator console for AI agents at work. It centralizes policy, audit, and incident response across mixed agent runtimes, and stays opinionated about where enforcement actually has to happen.

There are four layers in the model:

  • Agents — the assistants themselves: Claude Code and OpenClaw today; OpenAI Agents, LangGraph, MCP servers, Microsoft AGT, and custom enterprise agents on the published roadmap.
  • Adapters and interception — runtime-specific surfaces that translate ClawForge policy into something the agent's runtime actually understands: SDK hooks, MCP proxying, AGT integration, native runtime hooks.
  • Governance runtime — local enforcement, audit emission, and heartbeat behavior, sitting close to the agent rather than waiting on a round-trip to the cloud.
  • Control plane — the operator surface: policy authoring, audit federation, approval queues, and emergency state.
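
The relationship between the governance runtime and the adapters above it can be sketched in a few lines. This is a minimal illustration, not ClawForge's actual API: every name here (`Decision`, `ToolCall`, `evaluate`, the rule shape) is hypothetical, and the default posture is shown as allow purely for brevity.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"

@dataclass
class ToolCall:
    runtime: str      # e.g. "claude-code", "openclaw"
    tool: str         # e.g. "git.push", "shell.exec"
    arguments: dict

def evaluate(call: ToolCall, rules: list[dict]) -> Decision:
    """Local policy evaluation: first matching rule wins.

    The default when nothing matches is a policy choice in its own
    right; ALLOW here is illustrative only."""
    for rule in rules:
        if call.tool.startswith(rule["tool_prefix"]):
            return Decision(rule["decision"])
    return Decision.ALLOW
```

The point of the shape is the split itself: the decision logic lives in one place (the governance runtime), while each adapter only has to translate a `Decision` into whatever its runtime understands.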

The split matters. Operators want one place to write policy and review behavior. Runtimes need enforcement that does not break when the network does. ClawForge does the operator half centrally and the enforcement half locally, and treats the connection between them as a first-class control loop, not a logging pipe.

Why a control plane, not another security tool

Most AI security products today are detection-shaped: scan prompts, watch outputs, flag risky calls, report incidents. That work is useful, but it is not the operator's missing piece. The missing piece is the operations surface above it — who gets to approve what, where policy lives, how an incident actually moves through the fleet, what evidence exists after the fact.

A control plane is a different posture. It does not try to be the only line of defense at the runtime layer. It tries to make the runtime layer governable across many agents at once, so the work an operator does in one place actually reaches every machine and every session.

That is also why ClawForge stays vendor-neutral. The runtime layer is going to keep changing. The agent that matters next year is unlikely to be the agent that matters this year. An operations layer that only works against one vendor's runtime is a liability the day the team adds a second one.

Where Microsoft AGT fits

A common question is how this overlaps with Microsoft AGT. The short version: it does not; the two compose rather than compete.

AGT does per-tool-call enforcement, MCP gateway behavior, and append-only audit for the runtimes it supports — LangChain, AutoGen, CrewAI, Semantic Kernel, OpenAI Agents SDK, Google ADK, and others. That work is good and we are not rebuilding it.

ClawForge sits above AGT for those runtimes. It writes the policy AGT enforces, surfaces AGT's approval hooks into an operator queue, and federates AGT's audit log into the cross-runtime event store. For runtimes outside AGT — Claude Code today, OpenClaw, custom agents — ClawForge handles interception itself, through SDK adapters, runtime hooks, or its own MCP proxy. The operator surface stays the same either way, which is the only thing that lets a single team manage a mixed fleet without learning a new operator model per runtime.
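
For the runtimes ClawForge intercepts itself, the core move is the same everywhere: gate each tool call on policy and emit an audit event whether or not the call goes through. A hedged sketch of that shape, with every name (`intercept`, the callback signatures, the event fields) invented for illustration rather than taken from ClawForge:

```python
import time
from typing import Callable

def intercept(tool: str, arguments: dict,
              allowed: Callable[[str, dict], bool],
              forward: Callable[[str, dict], dict],
              audit: Callable[[dict], None]) -> dict:
    """Gate a single tool call: consult policy, emit an audit event
    either way, and only reach the real runtime when policy allows."""
    verdict = allowed(tool, arguments)
    audit({"ts": time.time(), "tool": tool, "allowed": verdict})
    if not verdict:
        return {"error": f"blocked by policy: {tool}"}
    return forward(tool, arguments)
```

Whether the `allowed` check is answered by AGT, by an SDK adapter, or by an MCP proxy is exactly the detail the operator surface is meant to hide.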

What ships in the first cut

The launch posture is deliberately narrow. It is meant to be inspectable, deployable, and honest about scope.

  • Claude Code and OpenClaw runtimes — adapter-based interception with policy enforcement and audit emission today.
  • Self-hosted control plane — the operator console, policy store, audit store, and runtime registry all run in customer infrastructure.
  • Heartbeat-driven kill switch — emergency posture propagates through the same loop as policy, with a fail-secure default if the control plane goes quiet for too long.
  • SSO / OIDC — identity ties into the operator surface so org-scoped roles decide who can review, approve, and trigger emergency state.
  • Queryable audit trail — events from runtimes land in a single store that operators can actually search, not a per-machine log pile.
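
The fail-secure heartbeat behavior described above reduces to a small state machine. A minimal sketch, assuming nothing about ClawForge's actual wire protocol or timeouts; the class and field names are hypothetical:

```python
import time

class Heartbeat:
    """Fail-secure heartbeat: emergency state rides the same loop as
    policy, and silence from the control plane is treated as a stop."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable for testing
        self.last_seen = clock()
        self.kill = False

    def on_beat(self, payload: dict) -> None:
        """Called whenever the control plane is heard from."""
        self.last_seen = self.clock()
        self.kill = bool(payload.get("kill", False))

    def may_run(self) -> bool:
        """False if a kill arrived OR the control plane has gone quiet."""
        if self.clock() - self.last_seen > self.timeout_s:
            return False  # fail secure, not fail open
        return not self.kill
```

The design choice worth noticing is the default: a governance runtime that cannot hear its control plane stops the agent rather than running ungoverned.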

MCP servers, OpenAI Agents, LangGraph, and AGT-direct integration follow on a published roadmap. The runtime list will grow; the operator surface is meant not to.

Where to go next

If you want to read the technical model, the architecture page walks through the layers and the control flows. The security page covers the trust boundary and disclosure path. The docs are the implementation reference.

If you would rather talk to someone, the design partner program is the path. The first cut is being built with teams running real AI agents in production, teams with real opinions about what the operator model should look like.