Routing, learning, and proof for OpenClawBrain.
This blog explains OpenClawBrain as it actually ships: tight OpenClaw integration, fast local startup, continuous background learning, and a learned runtime route_fn trained with Ultimate Policy Gradient on scanner/harvester labels, async teacher labels, and user feedback. The proof boundary is stated explicitly throughout: simulations prove mechanism, not full product performance.
Read first
The shortest path from first impression to actual rollout and verification.
What is proven now vs what still needs stronger evidence
One operator-facing hub for mechanism proof, the evidence ladder, and the artifact checklist behind each claim.
The current paper, easy to find
The canonical paper route now packages the current PDF, paper metadata, and the supporting links that keep it aligned with the live proof boundary.
OpenClaw integration
Brain-first mode, fail-open wiring, bounded prompt context, and the continuous training loop.
Use the brain first, train it second
Start with a workspace build and daemon query, then add replay, harvest, and maintenance in the background.
One OpenClaw turn, fully traced
Inbound summary, stable chat_id, bounded [BRAIN_CONTEXT], same-turn correction capture, and what only changes after replay and redeploy.
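The stable chat_id and bounded [BRAIN_CONTEXT] pieces of that turn can be sketched roughly as follows. This is an illustrative assumption, not the shipped implementation: the hash scheme, the field names, and the character budget are all invented for the example; only the `chat_id` and `[BRAIN_CONTEXT]` names come from the post.

```python
import hashlib

def stable_chat_id(channel: str, thread: str) -> str:
    # Deterministic id (assumed scheme): every turn in the same thread
    # maps to the same brain session, across restarts.
    return hashlib.sha256(f"{channel}:{thread}".encode()).hexdigest()[:16]

def bounded_context(snippets, budget_chars=400):
    # Pack retrieved snippets into a [BRAIN_CONTEXT] block, stopping
    # before the character budget is exceeded so the prompt stays bounded.
    block, used = [], 0
    for s in snippets:
        if used + len(s) > budget_chars:
            break
        block.append(s)
        used += len(s)
    return "[BRAIN_CONTEXT]\n" + "\n".join(block) + "\n[/BRAIN_CONTEXT]"
```

The key property is that the context injection is budgeted up front, so a large memory store can never blow out a single turn's prompt.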
Reproduce-eval command path
The underlying markdown contract for simulations, fixed-session eval, shadow comparisons, and figure generation.
Current series: v12.2.6+
The current series explains the product direction end to end, with the learned runtime router and proof boundary stated explicitly.
Shadow routing + Ultimate Policy Gradient
Why the live path stays local while the learning path collects labels and updates the router asynchronously.
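The split described above can be sketched with a minimal queue-and-worker pattern. All names here are assumptions for illustration: the live path answers synchronously from the local router, while label collection for the learner happens on a background thread that the hot path never waits on.

```python
import queue
import threading

label_queue: "queue.Queue" = queue.Queue()
collected = []  # stand-in for the async teacher-labeling sink

def live_route(query: str) -> str:
    route = "local"  # fast local decision; the live path stays local
    # Hand the shadow label off without blocking; if the queue were
    # bounded and full, we would drop the label rather than slow the turn.
    try:
        label_queue.put_nowait({"query": query, "route": route})
    except queue.Full:
        pass
    return route

def label_worker():
    # Drains labels off the hot path; in the real system this is where
    # teacher labeling and router updates would run asynchronously.
    while True:
        item = label_queue.get()
        if item is None:
            break
        collected.append(item)

worker = threading.Thread(target=label_worker, daemon=True)
worker.start()
live_route("hello")
label_queue.put(None)  # shutdown sentinel
worker.join()
```

The design point is the direction of coupling: the learning path observes the live path, never the reverse.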
The learned runtime route_fn
How `graph_prior` and query-conditioned signals combine into the policy that actually runs during a live query.
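One plausible shape for such a route_fn is a softmax policy whose logits add a log of `graph_prior` to a query-conditioned score. This is a hedged sketch under assumed names: the feature weighting, temperature, and route labels are invented here; only `graph_prior` and the idea of combining it with query-conditioned signals come from the post.

```python
import math

def route_fn(graph_prior, query_features, weights, temperature=1.0):
    """Combine a per-route prior with query-conditioned signals into a
    softmax policy over routes (illustrative, not the shipped policy)."""
    scores = {}
    for route, prior in graph_prior.items():
        # Query-conditioned signal: a simple weighted feature sum.
        signal = sum(weights[route].get(f, 0.0) * v
                     for f, v in query_features.items())
        scores[route] = math.log(prior) + signal / temperature
    # Numerically stable softmax over the route scores.
    z = max(scores.values())
    exp = {r: math.exp(s - z) for r, s in scores.items()}
    total = sum(exp.values())
    return {r: e / total for r, e in exp.items()}

policy = route_fn(
    graph_prior={"local": 0.7, "teacher": 0.3},
    query_features={"len": 0.5},
    weights={"local": {"len": -1.0}, "teacher": {"len": 1.0}},
)
```

With this shape, the prior dominates when query signals are weak, and a strong query-conditioned signal can override it, which is the behavior the post attributes to the learned runtime router.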
Evaluation: prove it without pretending
What simulations prove, what they do not, and what stronger evidence should look like.
Local-first rollout
Default stack, fast startup, and how to keep replay and teacher labeling off the hot path.
Baselines and the comparison contract
The right baseline set for OpenClawBrain and the artifact discipline required for any claim.
Brain-first OpenClaw integration
What changes when the brain is on, what stays fail-open, and how feedback ties back to the fired route.
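The fail-open wiring and the feedback-to-fired-route link can be sketched together. Everything here is an assumed interface for illustration: the real integration's function names and storage are not shown in the post; the sketch only demonstrates the two properties it names, namely that a brain failure never blocks the turn, and that later feedback is attributed to the route that actually fired.

```python
fired = {}  # turn_id -> route that actually ran (assumed in-memory store)

def brain_turn(turn_id: str, router, fallback: str = "no_brain") -> str:
    # Fail open: if the brain-side router raises, the turn proceeds
    # on the fallback path instead of surfacing an error to the user.
    try:
        route = router()
    except Exception:
        route = fallback
    fired[turn_id] = route  # remember what fired for this turn
    return route

def feedback(turn_id: str, score: float) -> dict:
    # Tie user feedback back to the route that fired on that turn,
    # producing a label the background learner can consume.
    return {"route": fired.get(turn_id), "score": score}
```

The point of recording the fired route at turn time is that feedback often arrives much later; without that link, the label cannot credit or penalize the right routing decision.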
Legacy notes
Earlier posts are still useful for release history, but the current positioning is the v12.2.6+ series plus the operator docs.