Why useful agent memory is about judgment, not recall.
OpenClawBrain began as a graph-first memory idea. The hard lesson was that retrieval is not enough. The system became a memory authority layer: local evidence, scoped memories, learned routing, stale-memory handling, and proof that explains why memory was used or kept quiet.
Memory quality is an authority problem before it is a storage problem.
The graph matters. Search matters. But the product behavior comes from deciding whether remembered context still has permission to guide the current turn.
A local learning loop around an AI agent.
OpenClawBrain records scoped evidence, builds a local graph, learns compact route policies, and injects only the memories that clear routing, relevance, and safety gates.
Local evidence graph
SQLite stores memory nodes, typed edges, full-text search, injection rows, route decisions, proof events, and route-policy-v3 learning tables.
Learned route function
For each turn, the active route_fn decides whether memory should be searched, injected, or omitted, or whether the system should abstain entirely.
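A minimal sketch of what such a route function might look like. The field names, thresholds, and decision labels here are illustrative assumptions, not OpenClawBrain's actual API; the point is that the gate is cheap, local code, not a model call.

```python
from dataclasses import dataclass

# Decision labels follow the article; the scoring model is a guess.
ROUTE_SEARCH, ROUTE_INJECT, ROUTE_OMIT, ROUTE_ABSTAIN = (
    "search", "inject", "omit", "abstain")

@dataclass
class RouteFrame:
    relevance: float   # learned relevance score for the current turn
    staleness: float   # how stale the matched memory looks (0 = fresh)
    scope_match: bool  # memory scope matches the current project/session

def route_fn(frame: RouteFrame,
             inject_threshold: float = 0.75,
             search_threshold: float = 0.4) -> str:
    """Decide what memory is allowed to do on this turn."""
    if not frame.scope_match:
        return ROUTE_ABSTAIN   # wrong scope: correct silence
    if frame.staleness > 0.8:
        return ROUTE_OMIT      # found, but no longer trusted
    if frame.relevance >= inject_threshold:
        return ROUTE_INJECT    # clears routing, relevance, and safety gates
    if frame.relevance >= search_threshold:
        return ROUTE_SEARCH    # worth searching, not yet worth injecting
    return ROUTE_ABSTAIN
```

Note that abstention is a first-class return value, not an error path.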
Proof-first operations
Captures, rejections, retrievals, injections, teacher critiques, candidate reports, and rollbacks leave inspectable evidence.
The architecture changed because the first versions taught the wrong lessons.
The repo history moved from Python simulation and eval scaffolding into a native OpenClaw plugin, then into SQLite graph memory, route learning, and route-policy-v3 production serving.
Brain Ground Zero tested graph memory, route functions, recurring workflows, relational drift, sparse feedback, and RAG baselines.
Trace admission, judges, ledgers, thresholds, and result pages made the evidence rigorous, but the system was not yet installable.
OpenClaw hooks, status routes, and static injection proved the runtime path. A classifier plus notes was not enough.
Memory nodes, edges, FTS, route decisions, injection rows, audits, and proof events gave the system a durable substrate.
Shadow decisions, replay, calibration, candidate reports, champion/challenger promotion, abstention, and rollback became production routing.
The v3 loop separates runtime serving from slower learning.
Prompt-time behavior stays compact. Heavier semantic work happens after the turn, where it can be audited, replayed, and promoted without making every prompt slower.
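The split between serving and learning can be sketched as a deferral pattern: the prompt path only enqueues work, and a background worker picks it up later. This is an assumed shape, not the real implementation; `on_turn` and `drain_once` are hypothetical names.

```python
import queue

# Background work queue standing in for distillation, teacher critique,
# replay, and calibration jobs.
background: "queue.Queue[tuple[str, str]]" = queue.Queue()

def on_turn(turn_id: str, decision: str) -> str:
    """Prompt-time path: record the turn for later analysis, return fast."""
    background.put((turn_id, decision))  # defer all heavy semantic work
    return decision                      # serving stays compact

def drain_once() -> list[tuple[str, str]]:
    """Background path: process queued turns where they can be audited."""
    processed = []
    while not background.empty():
        processed.append(background.get())
    return processed
```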
The rejected mental models are the real guide.
OpenClawBrain improved when the design stopped treating memory as a bigger notes file and started treating it as a gated, measured route decision.
The graph is the product
Wrong shape. The graph is the evidence substrate. The route function creates product behavior by deciding when the graph should matter.
More context is safer
Wrong incentive. More context can distract, go stale, add latency, or conflict with the current task. The best memory injection is tiny and well timed.
Capture every useful thing
Too loose. Automatic capture needs source authority, scope, redaction, dedupe, proof rows, and rejection records.
The LLM can own memory
Too risky. The LLM proposes meaning; code validates, scopes, stores, replays, promotes, and rolls back.
A clean plan is enough
Not in production. The system only advanced when each plan produced a route, table, proof, test, package, install, endpoint, or live page.
A benchmark ships the product
No. Simulation proved the mechanism. The native plugin still had to solve hooks, latency, packaging, scanning, install paths, and inspection.
SQLite stores the graph and the evidence needed to debug it.
The local database is not a hosted graph service. It is an inspectable runtime file with tables for memory, search, route decisions, proof, and learning.
| Plane | Artifacts | Why it exists |
|---|---|---|
| Memory graph | memory_nodes, memory_edges, memory_search | Scoped facts, corrections, workflows, supersession, typed relationships, and FTS retrieval. |
| Serving proof | memory_injections, route_decisions, proof_events | Records what was retrieved, injected, omitted, accepted, or rejected, and the turns where the system abstained. |
| Distillation | distillation_runs, audit rows, validated operations | Keeps LLM proposals separate from code-owned memory writes. |
| v3 learning | route frames, shadow decisions, calibration examples, eval cases, family stats, candidate reports | Lets candidate policies learn from positive examples, negative examples, and correct silence before promotion. |
| Rollback | policy snapshots, lineage, fallback status | Allows v3 to serve first while v2 and heuristics remain rollback paths. |
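To make the planes above concrete, here is an illustrative-only SQLite sketch. The table names come from the article, but every column is a guess at what such a schema could look like, not OpenClawBrain's real DDL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE memory_nodes (
    id            INTEGER PRIMARY KEY,
    scope         TEXT NOT NULL,   -- e.g. project path or session id
    kind          TEXT NOT NULL,   -- fact, correction, workflow, ...
    body          TEXT NOT NULL,
    superseded_by INTEGER REFERENCES memory_nodes(id)
);
CREATE TABLE memory_edges (
    src  INTEGER REFERENCES memory_nodes(id),
    dst  INTEGER REFERENCES memory_nodes(id),
    kind TEXT NOT NULL               -- typed relationship
);
CREATE VIRTUAL TABLE memory_search USING fts5(body);  -- FTS retrieval
CREATE TABLE route_decisions (
    turn_id  TEXT NOT NULL,
    action   TEXT NOT NULL,          -- search / inject / omit / abstain
    node_ids TEXT                    -- selected memory ids
);
CREATE TABLE proof_events (
    turn_id TEXT NOT NULL,
    event   TEXT NOT NULL,           -- capture, rejection, injection, ...
    detail  TEXT
);
"""

conn = sqlite3.connect(":memory:")   # the real system uses a runtime file
conn.executescript(SCHEMA)
```

Because it is a single inspectable file, any of these tables can be queried directly with the `sqlite3` CLI while debugging.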
The visible memory should be small. The evidence behind it should be rich.
A durable correction should become a scoped memory, not a transcript paste. Later it should appear only when the route function expects it to help.
User correction
Actually, use pnpm in this repo.
OpenClawBrain detects a high-confidence user correction, redacts and scopes it to the project, stores a memory node, updates FTS, and records proof.
Later prompt-time injection
```
<openclawbrain_context>
Relevant memory:
- Must follow: Use pnpm instead of npm in this repo.
</openclawbrain_context>
```
The user sees only the useful part. The system keeps route decisions, selected IDs, omitted IDs, proof rows, outcome resolution, and learning examples.
Prompt-time memory stays boring on purpose.
Memory should not tax every turn. The route function keeps serving compact while background work handles deeper labeling, replay, and promotion.
Tier 0
Local route decision only. No model call. Often the answer is no memory.
Tier 1
Cached route plus SQLite search. Still local and bounded.
Tier 2
One limited planner call for ambiguous high-signal turns.
Tier 3
Background distillation, teacher critique, replay, calibration, and promotion.
Fallback
v3 serves first, v2 can roll back, heuristics are last resort.
Abstention
Correct silence is a production behavior, not a crash path.
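The fallback chain can be sketched as ordered policies with abstention as the floor. This is an assumed control flow, not the shipped code; `serve_turn` and the policy callables are illustrative.

```python
from typing import Callable, Optional

Policy = Callable[[str], Optional[str]]

def serve_turn(turn: str, route_v3: Policy, route_v2: Policy,
               heuristic: Policy) -> str:
    """Try v3 first, then v2, then heuristics; abstention is always valid."""
    for policy in (route_v3, route_v2, heuristic):
        try:
            decision = policy(turn)
        except Exception:
            continue              # roll back to the next policy in line
        if decision is not None:
            return decision       # may legitimately be "abstain"
    return "abstain"              # correct silence, never a crash path
```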
How to ask agents to build systems like this.
The project made progress when prompts demanded evidence surfaces, install-path proof, and explicit failure handling instead of abstract smartness.
Ask for the first proof row
Do not start with "ultimate memory." Start with the first hook, route, schema, proof row, test, install, and live verification.
Preserve one invariant
For OpenClawBrain: LLM decides semantic meaning; code enforces trust boundaries; SQLite stores graph and evidence.
Request failure tables
Name over-capture, under-capture, stale retrieval, scope leak, prompt pollution, latency, scan, install, and rollback failure.
Separate public claim from ambition
The public claim should lag the private dream until install, proof, and rollback are real.
Force repo-history honesty
Use git log, tags, closeouts, tests, package versions, and live endpoints. Separate what shipped from what was planned.
Define done as live proof
Done means tests pass, package works, temp install succeeds, runtime loads, site deploys, and live URLs contain the new copy.
Current public status is deliberately narrow.
OpenClawBrain is published as package openclawbrain, version 0.2.33, with route-policy-v3 as the production route brain, Memory Authority as the resolver between retrieval and injection, Memory Graph Maintenance as the long-term curator, and an OpenClawBrain-owned Codex Telegram bridge for recent messages, watches, handoffs, trusted bound-thread replies, and active-turn steering.
```shell
openclaw plugins install clawhub:openclawbrain@0.2.33 --force
openclaw plugins enable openclawbrain
openclaw gateway restart
openclaw plugins inspect openclawbrain --runtime
openclaw doctor
```

```
/brain graph health
/brain graph dry-run
/brain graph proposals
```
