OpenClawBrain
Canonical paper route

Paper, PDF, and supporting materials.

This is the human-facing home for the current OpenClawBrain paper. It packages the March 2, 2026 PDF with version metadata, the core research framing, and the links that keep the paper aligned with the live proof boundary on the site.

Evidence boundary: the deterministic workflow-proof slice is live and reproducible now. Recorded-session, shadow-mode, and narrow online proof are still future work, so the paper should be read alongside /proof/.

Version v12.2.6+ | Jonathan Gu | March 2, 2026 | Direct PDF: /openclawbrain.pdf
Current artifact

OpenClawBrain v12.2.6+

Read it honestly

Mechanism proof is live. Full product proof is not.

  • The paper covers the hot/cold path split, route signals, confidence mixing, and unified policy learning.
  • /proof/ shows what is actually proven now and what still needs stronger evidence.
  • /blog/v12.2.6-series/ explains how the product story and rollout path fit together.

What the paper covers

The paper is the longer technical framing behind the current site. These are the core claims it organizes, without presenting mechanism proof as online proof.

System split

Hot path local, cold path asynchronous

The central operating shape is a strict split: local bounded routing on live OpenClaw turns, then replay, labeling, and policy updates later off the hot path.
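The split described above can be sketched in a few lines. This is a minimal illustration, not the OpenClawBrain implementation: the class name `HotColdRouter`, the `max_candidates` bound, the `label_fn` callback, and the additive weight update are all assumptions made for the example.

```python
from collections import deque


class HotColdRouter:
    """Toy hot/cold split: bounded routing now, learning later (hypothetical)."""

    def __init__(self, max_candidates=8):
        self.max_candidates = max_candidates  # assumed bound on hot-path work
        self.weights = {}                     # route -> learned score
        self.replay_log = deque()             # turns waiting for the cold path

    def route_turn(self, query, candidates):
        """Hot path: pick a route from a bounded candidate set, no learning."""
        if not candidates:
            raise ValueError("at least one candidate route is required")
        bounded = candidates[: self.max_candidates]
        # Ties resolve to the earliest candidate because max() keeps the first.
        choice = max(bounded, key=lambda c: self.weights.get(c, 0.0))
        # Only record the turn; labeling and updates happen off the hot path.
        self.replay_log.append((query, choice))
        return choice

    def run_cold_path(self, label_fn, step=0.1):
        """Cold path: replay logged turns, label them, update the policy."""
        updated = 0
        while self.replay_log:
            query, choice = self.replay_log.popleft()
            reward = label_fn(query, choice)  # e.g. +1.0 good route, -1.0 bad
            self.weights[choice] = self.weights.get(choice, 0.0) + step * reward
            updated += 1
        return updated
```

The point of the shape is that `route_turn` never calls a labeler or touches the weights, so its cost stays bounded by `max_candidates`, while `run_cold_path` can run asynchronously at any later time.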

Route signals

graph_prior plus QTsim confidence mixing

The runtime policy combines durable structure with query-conditioned fit, then uses uncertainty features like entropy and margin to decide how much each signal should matter per decision.
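One way to picture confidence mixing is the sketch below. The exact mixing rule is defined in the paper and is not reproduced here; the softmax normalization, the equal weighting of entropy and margin inside `confidence`, and every function name are assumptions chosen only to show how uncertainty features can set a per-decision weight between `graph_prior` and QTsim scores.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; high values mean the signal is undecided."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def margin(probs):
    """Gap between the top two probabilities; large values mean a clear winner."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1] if len(top) > 1 else 1.0


def confidence(probs):
    """Hypothetical confidence in [0, 1] built from entropy and margin."""
    n = len(probs)
    if n < 2:
        return 1.0
    norm_entropy = entropy(probs) / math.log(n)
    raw = 0.5 * (1.0 - norm_entropy) + 0.5 * margin(probs)
    return max(0.0, min(1.0, raw))


def mix_route_scores(graph_prior, qtsim):
    """Blend the two signals, weighting each by its own confidence."""
    p_graph = softmax(graph_prior)
    p_query = softmax(qtsim)
    c_graph = confidence(p_graph)
    c_query = confidence(p_query)
    total = c_graph + c_query
    # Fall back to an even split when neither signal is confident.
    w_query = 0.5 if total == 0.0 else c_query / total
    mixed = [(1.0 - w_query) * g + w_query * q
             for g, q in zip(p_graph, p_query)]
    return mixed, w_query
```

With a flat `graph_prior` and a sharply peaked QTsim score vector, nearly all of the weight moves to the query-conditioned signal; when both signals are equally confident, they are blended evenly.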

Learning rule

Unified policy learning

Teacher distillation and policy-gradient updates are treated as one learning loop with an authority order: human corrections first, then self-learning outcomes, harvested signals, and async-teacher labels.
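The authority order above can be read as a conflict-resolution rule for training labels. The sketch below assumes one label per source per decision and a simple "highest authority wins" policy; the `Authority`, `Label`, and `resolve_labels` names are hypothetical, and the paper may combine sources differently (for example by weighting rather than overriding).

```python
from dataclasses import dataclass
from enum import IntEnum


class Authority(IntEnum):
    """Lower value means higher authority, following the order in the text."""
    HUMAN_CORRECTION = 0
    SELF_LEARNING = 1
    HARVESTED = 2
    ASYNC_TEACHER = 3


@dataclass(frozen=True)
class Label:
    decision_id: str   # which routing decision the label refers to
    source: Authority  # where the label came from
    target_route: str  # the route this source says was correct


def resolve_labels(labels):
    """Keep, for each decision, the label from the highest-authority source."""
    best = {}
    for label in labels:
        current = best.get(label.decision_id)
        if current is None or label.source < current.source:
            best[label.decision_id] = label
    return best
```

For example, if an async-teacher label and a human correction disagree about the same decision, `resolve_labels` keeps the human correction and the teacher label is ignored for that decision.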

Maintenance path

RL-native graph maintenance

The paper also frames graph maintenance as a control problem. Phase 2a hooks are shipped now; connect, split, and merge actions are future work and will sit behind conservative guardrails.

Read next

The paper is easier to evaluate when it is paired with the packaged proof, the current narrative, and the underlying materials.

Canonical human-facing paper route: /paper/. Direct artifact route: /openclawbrain.pdf.