Docs memory (decisions/roadmaps/specs) — Hybrid search v0 (no local LLM)

Goal: make operator-authored repo docs (especially DECISIONS, roadmaps, and specs) reliably retrievable without prompt hints, using hybrid retrieval (BM25-scored FTS plus embeddings) and auditable receipts.

Terminology note:

  • FTS = full-text search (lexical / keyword retrieval).
  • BM25 = a relevance scoring/ranking function commonly used inside FTS systems.
  • When we say "FTS/BM25" in this repo, we mean FTS with BM25-style scoring, not a separate engine.

This spec is intentionally LLM-minimal: no local query-expansion models and no local rerank models. If reranking is needed later, we use a remote API over a bounded candidate set.


Problem

We frequently “do the right thing” operationally (write decisions, commit & push), but recall still fails:

  • The agent’s memory backend often can’t retrieve a decision unless CK supplies a specific keyword.
  • Pushing to git does not automatically make a document searchable by the agent.
  • Vector-only recall is brittle on:
      • dates / commit SHAs / paths
      • short identifiers / error codes
      • partial quotes

Non-goals

  • Running any local LLM (query expansion, rerank, summarization).
  • Replacing OpenClaw’s canonical memory slot by default.
  • Automatically treating all repo text as trusted (we will tag trust by source).

Corpus (initial)

We treat these as the high-value “docs memory” sources:

1) lyria-working-ledger/DECISIONS/**
2) lyria-working-ledger/REPORTS/captain-log/**
3) openclaw-async-coding-playbook project docs:
     • projects/** (roadmaps/specs/TECH_NOTES)
     • **/ROADMAP*.md, **/*SPEC*.md, **/*PRD*.md, **/TECH_NOTES/**

Notes:

  • We start narrow to keep index cost + risk low.
  • Expand later via an explicit allowlist (paths/globs).


Storage model (v0)

SQLite ledger (sidecar)

Add a new modality to the sidecar DB (schema sketch below):

  • docs_chunks table (canonical chunk store)
  • docs_chunks_fts (FTS5) for lexical search
  • docs_embeddings (optional) for vector search

Chunk identity must be stable:

  • doc_id = stable file identity (e.g., repo + relpath)
  • chunk_id = stable within a file (e.g., heading anchor + ordinal)
  • recordRef should be derivable (e.g., doc:<repo>:<path>#<chunk_id>)
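A minimal schema sketch covering the three tables and the stable identifiers above, assuming SQLite with FTS5 available. Column types, the external-content FTS wiring, and the float32 vector packing are illustrative; the sync triggers an external-content FTS5 table needs are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect("sidecar.db")  # path is illustrative
conn.executescript("""
CREATE TABLE IF NOT EXISTS docs_chunks (
  doc_id       TEXT NOT NULL,   -- stable file identity (repo + relpath)
  chunk_id     TEXT NOT NULL,   -- stable within a file (heading anchor + ordinal)
  repo         TEXT NOT NULL,
  path         TEXT NOT NULL,
  doc_kind     TEXT,            -- decision | roadmap | spec | log
  title        TEXT,
  heading_path TEXT,
  text         TEXT NOT NULL,
  ts_hint      TEXT,            -- best-effort date from filename/frontmatter
  source_kind  TEXT NOT NULL,   -- operator | tool | web | import | system
  source_ref   TEXT NOT NULL,   -- repo/path at minimum
  PRIMARY KEY (doc_id, chunk_id)
);

-- Lexical index: external-content FTS5 table over the canonical chunk store.
CREATE VIRTUAL TABLE IF NOT EXISTS docs_chunks_fts USING fts5(
  text, title, heading_path,
  content='docs_chunks', content_rowid='rowid'
);

-- Optional vector store; absent rows simply mean FTS-only retrieval.
CREATE TABLE IF NOT EXISTS docs_embeddings (
  doc_id   TEXT NOT NULL,
  chunk_id TEXT NOT NULL,
  model    TEXT NOT NULL,       -- e.g., text-embedding-3-small
  vector   BLOB NOT NULL,       -- packed float32
  PRIMARY KEY (doc_id, chunk_id, model)
);
""")

# recordRef is derivable from stored fields, e.g. f"doc:{repo}:{path}#{chunk_id}".
```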

Embeddings provider

Use the configured OpenAI embeddings model (default: text-embedding-3-small).

Constraints:

  • Embeddings are optional; v0 must work FTS-only.
  • Embedding failures are fail-open (they must not break ingest/search).
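A sketch of the fail-open contract, assuming the official openai Python client; embed_or_none is a hypothetical helper name.

```python
import logging
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_or_none(text: str, model: str = "text-embedding-3-small"):
    """Return an embedding, or None on any failure so ingest/search continue."""
    try:
        resp = client.embeddings.create(model=model, input=text)
        return resp.data[0].embedding
    except Exception as exc:  # network, auth, rate limits, ...
        logging.warning("embedding failed; continuing FTS-only: %s", exc)
        return None
```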


Chunking (v0)

  • Prefer Markdown structure:
      • split on headings; keep short sections together up to a token/char cap
  • Store, per chunk:
      • text (the chunk)
      • title / heading_path (context)
      • path, repo, doc_kind (decision/roadmap/spec/log)
      • ts_hint (best-effort parse of dates from filename/frontmatter)

Keep a strict upper bound per chunk (so retrieval snippets stay safe).
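A minimal splitter sketch under those rules; the character cap and the merge heuristic are illustrative (a token-based cap would work the same way).

```python
import re

MAX_CHARS = 2000  # strict per-chunk upper bound (value is illustrative)

def chunk_markdown(md: str) -> list[str]:
    """Split on headings, merge short sections up to the cap, never exceed it."""
    sections, current = [], []
    for line in md.splitlines(keepends=True):
        if re.match(r"#{1,6} ", line) and current:  # a heading starts a new section
            sections.append("".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("".join(current))

    chunks, buf = [], ""
    for sec in sections:
        if buf and len(buf) + len(sec) > MAX_CHARS:
            chunks.append(buf)  # flush before the merged chunk would exceed the cap
            buf = ""
        buf += sec
        while len(buf) > MAX_CHARS:  # hard cap: split oversized single sections
            chunks.append(buf[:MAX_CHARS])
            buf = buf[MAX_CHARS:]
    if buf:
        chunks.append(buf)
    return chunks
```

The chunk ordinal within the file (plus the nearest heading anchor) then yields chunk_id.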


Retrieval pipeline (v0)

1) Lexical retrieval (FTS; BM25-scored)

  • docs_search_fts(query, top_k) → lexical candidates, ranked by the FTS engine’s score (typically BM25).
  • Implementation detail (if using SQLite FTS5): ranking is typically BM25 via the built-in bm25() helper.
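An illustrative query against the docs_chunks_fts table sketched earlier. In FTS5, bm25() returns lower-is-better values (better matches are more negative), so we order ascending.

```python
def docs_search_fts(conn, query: str, top_k: int = 20):
    """Lexical candidates ranked by BM25 (FTS5's bm25() is lower-is-better)."""
    return conn.execute(
        """
        SELECT c.doc_id, c.chunk_id, bm25(docs_chunks_fts) AS score
        FROM docs_chunks_fts
        JOIN docs_chunks AS c ON c.rowid = docs_chunks_fts.rowid
        WHERE docs_chunks_fts MATCH ?
        ORDER BY score ASC
        LIMIT ?
        """,
        (query, top_k),
    ).fetchall()
```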

2) Vector retrieval (optional)

  • docs_search_vec(query, top_k, model) → candidates by cosine similarity.
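A brute-force cosine sketch, assuming vectors are stored as packed float32 (as in the schema sketch above); at v0 corpus size a linear scan is adequate, and an ANN index can replace it later without changing the interface.

```python
import math
import struct

def docs_search_vec(conn, query_vec, top_k=20, model="text-embedding-3-small"):
    """Candidates ranked by cosine similarity over stored embeddings."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    rows = conn.execute(
        "SELECT doc_id, chunk_id, vector FROM docs_embeddings WHERE model = ?",
        (model,),
    ).fetchall()
    scored = [
        (doc_id, chunk_id,
         cosine(query_vec, struct.unpack(f"{len(blob) // 4}f", blob)))
        for doc_id, chunk_id, blob in rows
    ]
    scored.sort(key=lambda t: -t[2])
    return scored[:top_k]
```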

3) Fusion

  • Deterministic RRF fusion of the two ranked lists.
  • Keep a bounded final candidate set (e.g., top 20).
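A deterministic RRF sketch: k = 60 is the conventional constant from the RRF literature, and breaking ties on the chunk id keeps repeated runs reproducible.

```python
def rrf_fuse(fts_ids, vec_ids, k: int = 60, top_n: int = 20):
    """Reciprocal Rank Fusion over two ranked lists of chunk ids."""
    scores = {}
    for ranked in (fts_ids, vec_ids):
        for rank, cid in enumerate(ranked, start=1):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), then by id so ties break deterministically.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return fused[:top_n]
```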

4) Optional remote rerank (later)

If needed for quality, rerank only the fused top-N using a remote model.


Trust & provenance

Default trust policy for this corpus:

  • DECISIONS/** and operator-authored roadmaps/specs: trusted
  • Anything imported from tools/web: untrusted unless explicitly promoted

All docs chunks must carry:

  • source_kind (operator|tool|web|import|system)
  • source_ref (repo/path at minimum)


Receipts & trace (non-negotiable)

Every docs query should be debuggable.

A docs recall/pack trace should include:

  • query text
  • FTS top-k (ids + scores)
  • vec top-k (ids + scores)
  • fused ranking (ids + score components)
  • final selected chunks (bounded text + recordRef)
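One possible shape for that trace record; the field names mirror the list above, while the ids and scores shown are placeholders.

```python
import json
import time

trace = {
    "ts": time.time(),
    "query": "neoapitest status code decision",          # placeholder query
    "fts_top_k": [{"id": "doc:ledger:DECISIONS/x.md#s1-0", "score": -4.2}],
    "vec_top_k": [{"id": "doc:ledger:DECISIONS/x.md#s1-0", "score": 0.83}],
    "fused": [{"id": "doc:ledger:DECISIONS/x.md#s1-0",
               "rrf": 0.0325, "fts_rank": 1, "vec_rank": 1}],
    "selected": [{"recordRef": "doc:ledger:DECISIONS/x.md#s1-0",
                  "text": "bounded snippet ..."}],
}
print(json.dumps(trace, indent=2))
```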


Benchmark / measurable acceptance

Define a fixed query set (20–30 queries), each with an expected target doc.

Metrics:

  • Hit@5
  • Hit@1
  • Hint-rate (fraction of queries needing follow-up keyword hints)
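A sketch of the Hit@k computation over the fixed query set; the in-memory dict format is an assumption.

```python
def hit_at_k(results: dict, expected: dict, k: int) -> float:
    """Fraction of queries whose expected target doc is in the top-k results.

    results:  query -> ranked list of doc ids from the pipeline
    expected: query -> the expected target doc id
    """
    hits = sum(expected[q] in results[q][:k] for q in expected)
    return hits / len(expected)

# Hit@1 and Hit@5 are hit_at_k(results, expected, 1) and hit_at_k(results, expected, 5).
```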

Acceptance:

  • For decision-style queries (e.g., “neoapitest 現股當沖 [cash day-trading] status code”), Hit@5 should materially improve over the current baseline.


Slow-cook lane (automation)

Add a slow-cook lane to:

1) refresh the docs index (incremental)
2) run the benchmark query set
3) emit a single aggregate receipt + top failure cases

Notify only on regression or when new high-importance missing-coverage cases appear.