Architecture (Design)¶
openclaw-mem is a local-first memory sidecar for OpenClaw.
It does not force-replace OpenClaw’s canonical memory slot/backends. Instead, it: - captures high-signal tool outcomes to durable local storage, - makes recall cheap and auditable, - adds governance (importance, receipts, redaction), - and (planned) provides clean, minimal context bundles for each request.
Status tags used below: DONE / PARTIAL / ROADMAP.
Philosophy (small, but non-negotiable)¶
Agent “self/continuity” is largely a narrative constructed from memory. For systems, that continuity should be treated as a designed interface: allow narrative, but anchor it in auditable evidence (provenance + trust tiers + citations + receipts) so we don’t amplify confident self-deception or poisoned inputs.
Non-goals¶
- Forcing
openclaw-memitself to become the global “memory core” for OpenClaw (slot ownership remains optional via a separate backend plugin; seedocs/mem-engine.md). - Auto-writing over operator-authored fields.
- Forcing embeddings/LLM dependence (LLM-assisted features must be opt-in).
Data flow (today) — DONE¶
OpenClaw tool results
→ JSONL capture
→ harvest/ingest
→ SQLite ledger (FTS + structured columns)
→ progressive recall (search → timeline → get)
Key artifacts¶
- Raw observations JSONL: append-only “what just happened” events (tool outcomes).
- SQLite DB ledger: curated, queryable memory with audit metadata.
- (Planned) Observational log: a compact, timestamped derived layer (“observations about observations”) designed to be:
- stable-prefix / cache-friendly,
- importance-scored (log-levels),
- easy to diff + debug (text-first).
- Archive + index: optional rotation + stable pointers for operators.
Modules¶
1) Capture — PARTIAL¶
- Source: OpenClaw tool results (and optionally message events).
- Output: JSONL observations (append-only).
- Design constraints:
- safe-by-default redaction
- bounded payloads
- stable schemas (upgrade-safe)
2) Ingest / Harvest — DONE¶
- Converts JSONL observations into a queryable SQLite ledger.
- Must be fail-open: a broken scorer or malformed record must not break ingest.
- Must be non-destructive: never overwrite existing operator fields.
3) Importance grading (MVP v1) — PARTIAL¶
- Goal: governance, not “smart recall”.
- Deterministic scorer (
heuristic-v1) fills missingdetail_json.importance. - Thresholds (current MVP v1):
>= 0.80must_remember>= 0.50nice_to_have- else ignore
- Unset/legacy importance is treated as unknown (do not auto-filter by default).
4) Lifecycle manager (retain / decay / archive / revive) — ROADMAP¶
Goal: keep memory high-signal over long horizons via use-based retention.
Core mechanism:
- Each durable record maintains a reference timestamp (last_used_at) updated only when the record is actually used.
- Default “used” signal (cheap + auditable): the record is selected into the final Context Packer bundle (cited as recordRef).
- A scheduled lifecycle job applies soft archive (reversible) instead of hard delete.
Design notes:
- Do not count bulk preload as “use” (otherwise everything stays forever).
- Consider tracking two signals later:
- last_retrieved_at (candidate hit)
- last_included_at (final bundle inclusion; default)
- Decay policy should be tiered (example): P0 never auto-archive, P1 90d, P2 30d — but thresholds are tunable.
- Trust is independent: frequent use does not promote untrusted → trusted.
Receipts:
- pack --trace should report which records were “refreshed” this run.
- Lifecycle job should emit an aggregate-only summary (archived counts by tier/trust/importance + reasons).
5) Context Packer — ROADMAP¶
Problem: multi-project operation tends to send too much irrelevant context to the LLM.
Goal: given a request, produce a small, clean context bundle that is: - relevant to the request, - biased toward high-importance durable facts, - auditable (includes citations), - and cheap (bounded size).
Proposed input signals: - request text + optional scope tag - recent session snippets (hot) - SQLite facts/tasks (warm) - (optional) graph neighborhood (see next section)
Proposed output (to the LLM): - 5–15 lines of “relevant state” - up to N short summaries (not raw logs) - 1–3 citations (record IDs / URLs), no private paths
Balanced recall (future policy hook): - when meaningful contradictory evidence exists, packing should be able to preserve a tiny explicit counterexample / dissent slot instead of returning only the highest-scoring supporting rows - this is a selection-policy refinement, not a new storage substrate
Preferred encoding (hybrid): - bundle_text for direct injection (human/LLM readable) - a shallow ContextPack JSON object for deterministic anchors (keys/arrays, provenance)
See: Context packing (ContextPack) →
Layered context contract (L0/L1/L2) — design hook¶
Borrow the pattern (not the implementation) of layered loading:
- L0 (abstract): 1 line, cheap filtering ("what is this?")
- L1 (overview): short summary + navigation hints ("what’s inside + where to look next")
- L2 (detail): raw record / full tool output / original artifact
The packer should prefer: 1) retrieve/filter by L0/L1, 2) include mostly L1 in the bundle, 3) only pull L2 when strictly necessary (and still bounded + redaction-safe).
This keeps bundles small, reduces token noise, and makes results easier to debug.
Fresh-tail protection (design hook):
- Keep a small protected tail of the most recent raw turns (continuity).
- Treat older packed items/summaries as an evictable prefix under strict budgets.
- This assembly policy is inspired by LCM-style context engines, but can be applied even when openclaw-mem stays a sidecar.
Retrieval trajectory receipts (trace) — non-negotiable for ops¶
Packing must be observable. Every pack run should be able to emit a trace that answers:
- what lanes were searched (hot/warm/cold; and "resources/memory/skills" when applicable)
- which candidates were considered
- why each item was included/excluded (importance, trust tier, score, cap, recency, scope)
Proposed interface (draft):
- openclaw-mem pack --query "..." --budget-tokens <n> --json --trace
- --json returns the bundle
- --trace returns a machine-readable receipt for audits/debugging (redaction-safe; no raw content)
- Receipt should include per-candidate include/exclude rationale and a stable recordRef (e.g. obs:123) citation trail.
Observational-memory mode (derived, cache-friendly)¶
A promising variant of Context Packer is to keep a stable two-block context window:
1) OBSERVATIONS: a compact, timestamped, importance-scored observation log (text-first). 2) RAW BUFFER: the most recent uncompressed turns.
An “observer” process periodically compresses RAW BUFFER → OBSERVATIONS once the buffer crosses a size threshold; an infrequent “reflector” prunes low-value observations.
This structure is designed to keep the prompt prefix stable (better caching) while still allowing continuous operation.
See also: Thought-links →
This module is the bridge between “memory governance” and “prompt cleanliness”.
Forward-looking schema note: - today the system already distinguishes episodic capture and stable durable records such as facts/preferences/decisions. - a likely next hardening step is an explicit prospective / plan-shaped durable kind so future-triggered intent does not masquerade as a fact.
6) Graph semantic memory — ROADMAP¶
Goal: support idea → project matching and path-justified recommendations.
We represent durable knowledge as typed entities and edges, e.g.:
- entities: Project, Repo, Decision, Concept, Tool, Person (redacted)
- edges: USES, BLOCKED_BY, RELATES_TO, EVALUATES, DECIDES, MENTIONS
Key requirements: - local-first - typed edges + traversal - path justification in outputs (why a recommendation was made)
Storage options (behind an interface): - Kuzu (fast + typed graph; evaluate longevity risk) - SQLite adjacency lists (simpler; less expressive) - other graph stores (future)
The graph is optional: the Context Packer should degrade gracefully without it.
Integration with OpenClaw memory backends¶
OpenClaw’s memory backend (e.g. memory-lancedb) and openclaw-mem solve different problems:
- backend auto-recall: semantic relevance right now
- openclaw-mem: durable governance + auditable receipts + operator workflows
Recommendation (today)¶
- Keep OpenClaw memory backend as canonical.
- Use
openclaw-memfor capture + harvest + importance + triage.
Recommendation (future, after Context Packer exists)¶
- Consider disabling backend autoRecall and instead feed the agent a small, deterministic
openclaw-mem packbundle per request. - This improves noise control and reduces token waste, but requires explicit integration (hook/tool) so recall doesn’t disappear.
Research tracks (delegate to GitHub scouting)¶
- Graph store choice & longevity (Kuzu and alternatives)
- Extraction strategies for entities/edges (heuristic first; LLM optional)
- Prompt cleanliness patterns: relevance filtering, bounded summaries, citations
- Recall-balance policies: contradiction detection, counterexample quotas, and when dissent should be reserved in a pack bundle
Related guarded-adoption note:
- docs/archive/notes/2026-03-17_tradememory-guarded-adoption.md