Architecture (Design)
openclaw-mem is a local-first context supply chain for OpenClaw.
It does not force-replace OpenClaw’s canonical memory slot/backends. Instead, it:
- captures high-signal tool outcomes to durable local storage,
- makes recall cheap and auditable,
- adds governance (importance, receipts, redaction),
- provides clean, minimal context bundles for each request,
- and keeps large/raw payloads observable without stuffing them into the prompt.
Status tags used below: DONE / PARTIAL / ROADMAP.
Philosophy (small, but non-negotiable)
Agent “self/continuity” is largely a narrative constructed from memory. For systems, that continuity should be treated as a designed interface: allow narrative, but anchor it in auditable evidence (provenance + trust tiers + citations + receipts) so we don’t amplify confident self-deception or poisoned inputs.
Operating doctrine
Treat the product as three cooperating surfaces:
- Store: capture, ingest, local recall, optional mem-engine slot ownership.
- Pack: bounded ContextPack assembly with citations and trace receipts.
- Observe: artifacts, receipts, and local files that make rollback and audits honest.
Recall quality alone is not enough. If a result cannot be packed into a bounded, explainable bundle, the runtime still drowns.
Non-goals
- Forcing openclaw-mem itself to become the global “memory core” for OpenClaw (slot ownership remains optional via a separate backend plugin; see docs/mem-engine.md).
- Auto-writing over operator-authored fields.
- Forcing embeddings/LLM dependence (LLM-assisted features must be opt-in).
Data flow (today) — DONE
OpenClaw tool results
→ JSONL capture
→ harvest/ingest
→ SQLite ledger (FTS + structured columns)
→ progressive recall (search → timeline → get)
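The progressive recall step (search → timeline → get) can be illustrated with a minimal sketch. The table layout, column names, and function names below are assumptions made for illustration, not the project's actual schema or API, and a plain LIKE query stands in for the FTS index so the example runs on any SQLite build.

```python
import sqlite3

# Hypothetical, simplified ledger; the real schema is not described here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations ("
    "id INTEGER PRIMARY KEY, ts TEXT, tool TEXT, summary TEXT, detail_json TEXT)"
)
conn.executemany(
    "INSERT INTO observations (ts, tool, summary, detail_json) VALUES (?, ?, ?, ?)",
    [
        ("2026-01-01T10:00:00Z", "shell", "ran unit tests: 2 failures", "{}"),
        ("2026-01-01T10:05:00Z", "editor", "patched parser bug", "{}"),
        ("2026-01-01T10:09:00Z", "shell", "ran unit tests: all green", "{}"),
    ],
)

def search(term, limit=5):
    """Step 1: cheap hits (id + summary only). LIKE stands in for FTS."""
    return conn.execute(
        "SELECT id, summary FROM observations "
        "WHERE summary LIKE ? ORDER BY id LIMIT ?",
        (f"%{term}%", limit),
    ).fetchall()

def timeline(record_id, window=1):
    """Step 2: neighbours around one hit, still summaries only."""
    return conn.execute(
        "SELECT id, ts, summary FROM observations "
        "WHERE id BETWEEN ? AND ? ORDER BY id",
        (record_id - window, record_id + window),
    ).fetchall()

def get(record_id):
    """Step 3: the full row, fetched only for records worth the cost."""
    return conn.execute(
        "SELECT * FROM observations WHERE id = ?", (record_id,)
    ).fetchone()

hits = search("unit tests")
context = timeline(hits[0][0])
full = get(hits[0][0])
```

Each step returns more data than the previous one, so a caller pays for full rows only after the cheaper steps have narrowed the candidates.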
Key artifacts
- Raw observations JSONL: append-only “what just happened” events (tool outcomes).
- SQLite DB ledger: curated, queryable memory with audit metadata.
- (Planned) Observational log: a compact, timestamped derived layer (“observations about observations”) designed to be:
  - stable-prefix / cache-friendly,
  - importance-scored (log-levels),
  - easy to diff + debug (text-first).
- Archive + index: optional rotation + stable pointers for operators.
Modules
1) Capture — PARTIAL
- Source: OpenClaw tool results (and optionally message events).
- Output: JSONL observations (append-only).
- Design constraints:
  - safe-by-default redaction
  - bounded payloads
  - stable schemas (upgrade-safe)
2) Ingest / Harvest — DONE
- Converts JSONL observations into a queryable SQLite ledger.
- Must be fail-open: a broken scorer or malformed record must not break ingest.
- Must be non-destructive: never overwrite existing operator fields.
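A minimal sketch of the two ingest guarantees above, under stated assumptions: the ledger is modelled as an in-memory dict rather than SQLite, and the record shape (an id key plus an optional detail_json object) and the function names are hypothetical.

```python
import json

def ingest(lines, ledger, scorer):
    """Ingest JSONL lines into `ledger` (a dict keyed by record id).

    Fail-open: bad lines and scorer errors are counted, never raised.
    Non-destructive: an importance already present on a record is kept.
    """
    stats = {"ingested": 0, "skipped": 0, "scorer_errors": 0}
    for line in lines:
        try:
            record = json.loads(line)
            record_id = record["id"]
        except (json.JSONDecodeError, KeyError, TypeError):
            stats["skipped"] += 1  # malformed record: skip it and keep going
            continue
        detail = record.setdefault("detail_json", {})
        if isinstance(detail, dict) and "importance" not in detail:
            try:
                detail["importance"] = scorer(record)
            except Exception:
                stats["scorer_errors"] += 1  # leave importance unset
        ledger[record_id] = record
        stats["ingested"] += 1
    return stats

def flaky_scorer(record):
    """Stand-in scorer that fails on one record to exercise the fail-open path."""
    if record["id"] == "obs:2":
        raise ValueError("scorer bug")
    return 0.5

ledger = {}
stats = ingest(
    [
        '{"id": "obs:1", "detail_json": {"importance": 0.9}}',
        '{"id": "obs:2"}',
        "not json",
        '{"id": "obs:3"}',
    ],
    ledger,
    flaky_scorer,
)
```

In this run the operator-set importance on obs:1 survives, obs:2 is stored without a score, the malformed line is skipped, and obs:3 is scored normally.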
3) Importance grading (MVP v1) — PARTIAL
- Goal: governance, not “smart recall”.
- Deterministic scorer (heuristic-v1) fills missing detail_json.importance.
- Thresholds (current MVP v1):
  - >= 0.80: must_remember
  - >= 0.50: nice_to_have
  - else: ignore
- Unset/legacy importance is treated as unknown (do not auto-filter by default).
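The threshold mapping can be written as a small pure function. This is a sketch only: the function name and the "unknown" return value are illustrative choices for the unset/legacy case, not confirmed identifiers from the codebase.

```python
from typing import Optional

def importance_label(score: Optional[float]) -> str:
    """Map an importance score to its MVP v1 label."""
    if score is None:
        # Unset/legacy importance: unknown, so callers should not
        # auto-filter on it by default.
        return "unknown"
    if score >= 0.80:
        return "must_remember"
    if score >= 0.50:
        return "nice_to_have"
    return "ignore"

labels = [importance_label(s) for s in (0.95, 0.80, 0.79, 0.50, 0.10, None)]
```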
4) Lifecycle manager (retain / decay / archive / revive) — ROADMAP
Goal: keep memory high-signal over long horizons via use-based retention.
Core mechanism:
- Each durable record maintains a reference timestamp (last_used_at) updated only when the record is actually used.
- Default “used” signal (cheap + auditable): the record is selected into the final Context Packer bundle (cited as recordRef).
- A scheduled lifecycle job applies soft archive (reversible) instead of hard delete.
Design notes:
- Do not count bulk preload as “use” (otherwise everything stays forever).
- Consider tracking two signals later:
  - last_retrieved_at (candidate hit)
  - last_included_at (final bundle inclusion; default)
- Decay policy should be tiered (example): P0 never auto-archive, P1 90d, P2 30d — but thresholds are tunable.
- Trust is independent: frequent use does not promote untrusted → trusted.
Receipts:
- pack --trace should report which records were “refreshed” this run.
- Lifecycle job should emit an aggregate-only summary (archived counts by tier/trust/importance + reasons).
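Since this module is still on the roadmap, the following is only a sketch of how use-based retention with soft archive might look. The record fields (tier, last_used_at, archived), the function names, and the tier windows (taken from the example values in the design notes) are all assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Example tiers from the design notes; None means "never auto-archive".
MAX_IDLE = {"P0": None, "P1": timedelta(days=90), "P2": timedelta(days=30)}

def mark_used(record, now):
    """Call only when the record lands in the final bundle, not on bulk preload."""
    record["last_used_at"] = now

def run_lifecycle(records, now):
    """Soft-archive idle records and return an aggregate-only summary."""
    archived = Counter()
    for record in records:
        limit = MAX_IDLE.get(record["tier"])  # unknown tiers are never archived
        if limit is None or record.get("archived"):
            continue
        if now - record["last_used_at"] > limit:
            record["archived"] = True  # reversible flag, not a delete
            archived[record["tier"]] += 1
    return dict(archived)

now = datetime(2026, 4, 1, tzinfo=timezone.utc)
records = [
    {"id": "obs:1", "tier": "P0", "last_used_at": now - timedelta(days=400)},
    {"id": "obs:2", "tier": "P1", "last_used_at": now - timedelta(days=100)},
    {"id": "obs:3", "tier": "P2", "last_used_at": now - timedelta(days=10)},
    {"id": "obs:4", "tier": "P2", "last_used_at": now - timedelta(days=45)},
]
mark_used(records[2], now)  # obs:3 was cited in a bundle during this run
summary = run_lifecycle(records, now)
```

The summary contains counts only, in line with the aggregate-only receipt requirement. Trust level is deliberately absent from the retention decision.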
5) Context Packer — ROADMAP
Problem: multi-project operation tends to send too much irrelevant context to the LLM.
Goal: given a request, produce a small, clean context bundle that is:
- relevant to the request,
- biased toward high-importance durable facts,
- auditable (includes citations),
- and cheap (bounded size).
Proposed input signals:
- request text + optional scope tag
- recent session snippets (hot)
- SQLite facts/tasks (warm)
- (optional) graph neighborhood (see next section)
Proposed output (to the LLM):
- 5–15 lines of “relevant state”
- up to N short summaries (not raw logs)
- 1–3 citations (record IDs / URLs), no private paths
Balanced recall (future policy hook):
- when meaningful contradictory evidence exists, packing should be able to preserve a tiny explicit counterexample / dissent slot instead of returning only the highest-scoring supporting rows
- this is a selection-policy refinement, not a new storage substrate
Preferred encoding (hybrid):
- bundle_text for direct injection (human/LLM readable)
- a shallow ContextPack JSON object for deterministic anchors (keys/arrays, provenance)
See: Context packing (ContextPack) →
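To make the hybrid encoding concrete, here is a rough sketch that emits both a bundle_text string and a shallow JSON-serializable object from the same selection. The item fields and the keys inside the pack object are hypothetical; the actual ContextPack schema is defined in the linked page, not here.

```python
import json

def build_pack(items, max_lines=15, max_citations=3):
    """items: dicts with 'recordRef', 'summary', 'importance' (hypothetical fields)."""
    # Bias toward high-importance facts, then cap the bundle size.
    ranked = sorted(items, key=lambda it: it["importance"], reverse=True)
    chosen = ranked[:max_lines]
    bundle_text = "\n".join(
        f"- {it['summary']} [{it['recordRef']}]" for it in chosen
    )
    pack = {
        "items": [
            {"recordRef": it["recordRef"], "summary": it["summary"]}
            for it in chosen
        ],
        "citations": [it["recordRef"] for it in chosen[:max_citations]],
    }
    return bundle_text, pack

bundle_text, pack = build_pack(
    [
        {"recordRef": "obs:7", "summary": "CI uses Python 3.12", "importance": 0.6},
        {"recordRef": "obs:3", "summary": "Never push to main directly", "importance": 0.9},
    ],
    max_lines=1,
)
pack_json = json.dumps(pack)
```

The text form is what would be injected into the prompt, while the JSON form gives tooling stable keys to anchor on.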
Layered context contract (L0/L1/L2) — design hook
Borrow the pattern (not the implementation) of layered loading:
- L0 (abstract): 1 line, cheap filtering ("what is this?")
- L1 (overview): short summary + navigation hints ("what’s inside + where to look next")
- L2 (detail): raw record / full tool output / original artifact
The packer should prefer:
1) retrieve/filter by L0/L1,
2) include mostly L1 in the bundle,
3) only pull L2 when strictly necessary (and still bounded + redaction-safe).
This keeps bundles small, reduces token noise, and makes results easier to debug.
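One way to picture the layered preference is a record type that carries all three layers, with assembly that filters on L0, emits L1, and pulls a capped slice of L2 only on request. The class, its field names, and the keyword filter are illustrative assumptions. Redaction is not modelled.

```python
from dataclasses import dataclass

@dataclass
class Layered:
    record_ref: str
    l0: str  # one-line abstract
    l1: str  # short overview + navigation hints
    l2: str  # raw record / full tool output

def assemble(records, keyword, need_detail=(), l2_cap=200):
    """Filter by L0, include L1, and add bounded L2 only where asked."""
    lines = []
    for rec in records:
        if keyword.lower() not in rec.l0.lower():
            continue  # cheap L0 filter
        lines.append(f"{rec.record_ref}: {rec.l1}")
        if rec.record_ref in need_detail:
            lines.append(f"  detail: {rec.l2[:l2_cap]}")  # still bounded
    return lines

records = [
    Layered(
        "obs:1",
        "deploy failure on staging",
        "Deploy failed: missing env var; see rollback notes.",
        "ERROR env var DATABASE_URL not set; stack trace follows ...",
    ),
    Layered("obs:2", "lunch order", "Ordered lunch for the team.", "receipt text"),
]
lines = assemble(records, "deploy", need_detail={"obs:1"}, l2_cap=10)
```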
Fresh-tail protection (design hook):
- Keep a small protected tail of the most recent raw turns (continuity).
- Treat older packed items/summaries as an evictable prefix under strict budgets.
- This assembly policy is inspired by LCM-style context engines, but can be applied even when openclaw-mem stays a sidecar.
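The fresh-tail policy can be sketched as follows, assuming a simple size function (character count here, as a stand-in for token counting) and hypothetical names throughout. The tail is always kept, and older prefix items are evicted oldest-first until the window fits the budget.

```python
def assemble_window(prefix_items, raw_turns, budget, tail_size=2, cost=len):
    """Keep the last `tail_size` raw turns; fit prefix items into what is left."""
    tail = raw_turns[-tail_size:] if tail_size else []
    remaining = budget - sum(cost(turn) for turn in tail)
    kept = []
    for item in reversed(prefix_items):  # consider the newest prefix items first
        if cost(item) > remaining:
            break  # everything older than this point is evicted
        kept.append(item)
        remaining -= cost(item)
    return list(reversed(kept)) + tail

window = assemble_window(
    prefix_items=["aaaa", "bbbb", "cccc"],
    raw_turns=["t1", "t2", "t3"],
    budget=12,
)
```

Note that in this sketch the protected tail is kept even if it alone exceeds the budget. A stricter policy would need an explicit rule for that case.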
Verbatim semantic lane (episodic evidence) — DONE (first production slice)
A bounded verbatim semantic lane now exists for episodic recall.
It is deliberately narrow:
- corpus = episodic_events.search_text (redacted/bounded)
- read path = episodes search --mode lexical|hybrid|vector
- maintenance path = episodes embed
- role = retrieve raw evidence without changing durable-memory truth
Key boundary: this is a retrieval lane, not a new memory type.
That means:
- durable memory may cite it as evidence, but should not auto-promote hits into truth
- working set may later consume it, but should not become its source corpus
- docs cold lane remains a separate substrate with its own retrieval behavior
Retrieval trajectory receipts (trace) — non-negotiable for ops
Packing must be observable. Every pack run should be able to emit a trace that answers:
- what lanes were searched (hot/warm/cold; and "resources/memory/skills" when applicable)
- which candidates were considered
- why each item was included/excluded (importance, trust tier, score, cap, recency, scope)
Proposed interface (draft):
- openclaw-mem pack --query "..." --budget-tokens <n> --json --trace
  - --json returns the bundle
  - --trace returns a machine-readable receipt for audits/debugging (redaction-safe; no raw content)
- Receipt should include per-candidate include/exclude rationale and a stable recordRef (e.g. obs:123) citation trail.
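A sketch of what a pack run with a trace receipt might produce. The interface above is a draft, so the candidate fields, reason strings, and receipt layout below are assumptions. The property being illustrated is that the receipt records a recordRef and a rationale for every candidate while carrying no raw content.

```python
def pack_with_trace(candidates, budget_tokens):
    """candidates: dicts with 'recordRef', 'text', 'score', 'tokens' (hypothetical)."""
    bundle, receipt, used = [], [], 0
    for cand in sorted(candidates, key=lambda c: c["score"], reverse=True):
        if used + cand["tokens"] <= budget_tokens:
            bundle.append(cand["text"])
            used += cand["tokens"]
            decision, reason = "include", "within_budget"
        else:
            decision, reason = "exclude", "budget_cap"
        # The receipt carries refs and rationale only, never the text itself.
        receipt.append(
            {
                "recordRef": cand["recordRef"],
                "decision": decision,
                "reason": reason,
                "score": cand["score"],
            }
        )
    return {"bundle": bundle, "tokens_used": used}, {"candidates": receipt}

result, trace = pack_with_trace(
    [
        {"recordRef": "obs:1", "text": "fact A", "score": 0.9, "tokens": 60},
        {"recordRef": "obs:2", "text": "fact B", "score": 0.7, "tokens": 50},
        {"recordRef": "obs:3", "text": "fact C", "score": 0.4, "tokens": 30},
    ],
    budget_tokens=100,
)
```

Here obs:2 is excluded by the budget cap while the smaller obs:3 still fits, and the trace shows that decision per candidate.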
Observational-memory mode (derived, cache-friendly)
A promising variant of Context Packer is to keep a stable two-block context window:
1) OBSERVATIONS: a compact, timestamped, importance-scored observation log (text-first).
2) RAW BUFFER: the most recent uncompressed turns.
An “observer” process periodically compresses RAW BUFFER → OBSERVATIONS once the buffer crosses a size threshold; an infrequent “reflector” prunes low-value observations.
This structure is designed to keep the prompt prefix stable (better caching) while still allowing continuous operation.
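The observer/reflector loop might look roughly like this. It is a sketch under assumptions: the turn and observation shapes, the thresholds, and the fixed importance value are hypothetical, and the summarizer is a plain truncation placeholder rather than a real compression step.

```python
def observer_step(observations, raw_buffer, max_raw=4, keep_raw=2):
    """Compress older raw turns into one observation once the buffer is too long."""
    if len(raw_buffer) <= max_raw:
        return observations, raw_buffer
    old, recent = raw_buffer[:-keep_raw], raw_buffer[-keep_raw:]
    # Placeholder summarizer: first 40 chars of each turn. A real observer
    # might use a heuristic or an opt-in LLM.
    note = {
        "ts": old[-1]["ts"],
        "importance": 0.5,
        "text": " | ".join(turn["text"][:40] for turn in old),
    }
    # Appending (never rewriting) keeps the existing prefix byte-stable.
    return observations + [note], recent

def reflector_step(observations, min_importance=0.3):
    """Infrequent pruning pass over low-value observations."""
    return [o for o in observations if o["importance"] >= min_importance]

turns = [{"ts": f"2026-01-01T10:0{i}:00Z", "text": f"turn {i}"} for i in range(5)]
observations, raw_buffer = observer_step([], turns)
```

Because the observer only appends, earlier observation lines stay unchanged between requests, which is what makes the prefix cache-friendly. The reflector does rewrite the log, which is one reason to run it infrequently.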
See also: Thought-links →
This module is the bridge between “memory governance” and “prompt cleanliness”.
Forward-looking schema note:
- today the system already distinguishes episodic capture and stable durable records such as facts/preferences/decisions.
- a likely next hardening step is an explicit prospective / plan-shaped durable kind so future-triggered intent does not masquerade as a fact.
6) Graph semantic memory — ROADMAP
Goal: support idea → project matching and path-justified recommendations.
We represent durable knowledge as typed entities and edges, e.g.:
- entities: Project, Repo, Decision, Concept, Tool, Person (redacted)
- edges: USES, BLOCKED_BY, RELATES_TO, EVALUATES, DECIDES, MENTIONS
Key requirements:
- local-first
- typed edges + traversal
- path justification in outputs (why a recommendation was made)
Storage posture (today):
- prefer portable / derived graph artifacts first
- SQLite adjacency / receipts / optional Markdown materialization are enough for v0/v1
- defer dedicated graph-store evaluation until compiled synthesis + query quality prove the need
The graph is optional: the Context Packer should degrade gracefully without it.
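Path justification over typed edges can be sketched with a breadth-first search that returns the hops it took. The edge list and node identifiers below are invented for illustration, and edges are treated as undirected for matching purposes, which is a simplifying assumption.

```python
from collections import deque

# Hypothetical edges: (source, edge type, target).
EDGES = [
    ("idea:fts-tuning", "RELATES_TO", "concept:full-text-search"),
    ("project:openclaw-mem", "USES", "concept:full-text-search"),
]

def find_path(edges, start, goal):
    """BFS over typed edges; returns the list of hops, or None if unreachable."""
    adjacency = {}
    for src, kind, dst in edges:
        adjacency.setdefault(src, []).append((kind, dst))
        adjacency.setdefault(dst, []).append((kind, src))
    queue, seen = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for kind, nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, kind, nxt)]))
    return None

path = find_path(EDGES, "idea:fts-tuning", "project:openclaw-mem")
missing = find_path(EDGES, "idea:fts-tuning", "project:unknown")
```

The returned hops are the justification: each one names the edge type that connects the idea to the project. A None result is the signal for the packer to fall back to non-graph inputs.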
Temporal fact materialized view — SHIPPED v1.9.26
The graph-facing current-truth lane is neither a graph DB migration nor a “wiki as truth” layer. It is a derived temporal fact view:
Store sources -> temporal fact view -> Pack output -> Observe receipts
The view answers operator questions such as "what is currently true about this subject?", "when did that change?", and "which receipts support it?" while preserving the existing ownership split:
- Store owns source evidence.
- Pack owns bounded context assembly.
- Observe owns receipts and auditability.
- The temporal fact view is rebuildable, linted, and never authoritative when its sources are missing or stale.
V0 is explicit-assertion only: controlled predicates, single-subject current
truth, timeline, lint, and ContextPack-compatible fact packing. Automatic
extraction apply and multi-hop inference are later, gated work. The shipped
extraction surface is review-only and reports writes_performed=false.
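As a rough illustration of a rebuildable, explicit-assertion-only view, the sketch below derives single-subject current truth and a timeline from a list of assertions. The assertion fields, the example subject and predicate, and the function name are assumptions. Lint, staleness checks, and controlled-predicate validation are not modelled.

```python
def build_fact_view(assertions):
    """Derive current truth + timeline per (subject, predicate) from assertions."""
    view = {}
    for a in sorted(assertions, key=lambda a: a["ts"]):
        key = (a["subject"], a["predicate"])
        entry = view.setdefault(key, {"current": None, "timeline": []})
        # Every change keeps a pointer to the receipt that supports it.
        entry["timeline"].append((a["ts"], a["value"], a["receipt"]))
        entry["current"] = a["value"]  # latest assertion wins
    return view

# Hypothetical source evidence owned by Store; the view holds nothing of its own.
assertions = [
    {"subject": "service:api", "predicate": "owner", "value": "team-b",
     "ts": "2026-02-01T00:00:00Z", "receipt": "obs:7"},
    {"subject": "service:api", "predicate": "owner", "value": "team-a",
     "ts": "2026-01-01T00:00:00Z", "receipt": "obs:1"},
]
view = build_fact_view(assertions)
fact = view[("service:api", "owner")]
```

Because the view is computed entirely from source assertions, it can be dropped and rebuilt at any time.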
Public docs: Temporal facts
Source spec: Temporal fact materialized view v0
Integration with OpenClaw memory backends
OpenClaw’s memory backend (e.g. memory-lancedb) and openclaw-mem solve different problems:
- backend auto-recall: semantic relevance right now
- openclaw-mem: durable governance + auditable receipts + operator workflows
Recommendation (today)
- Keep OpenClaw memory backend as canonical.
- Use openclaw-mem for capture + harvest + importance + triage.
Recommendation (future, after Context Packer exists)
- Consider disabling backend autoRecall and instead feed the agent a small, deterministic openclaw-mem pack bundle per request.
- This improves noise control and reduces token waste, but requires explicit integration (hook/tool) so recall doesn’t disappear.
Research tracks (delegate to GitHub scouting)
- Compiled synthesis card lifecycle (selection / stale detection / lint thresholds)
- Extraction strategies for entities/edges (heuristic first; LLM optional)
- Prompt cleanliness patterns: relevance filtering, bounded summaries, citations
- Recall-balance policies: contradiction detection, counterexample quotas, and when dissent should be reserved in a pack bundle
Related guarded-adoption note:
- docs/archive/notes/2026-03-17_tradememory-guarded-adoption.md