Mission Brain

Open-source second-brain template.

Refusal is the feature.

Status: Public | Pattern: Three-layer LLM-Wiki | Invariants: Four, enforced

Mission Brain is an open-source second-brain template. RayBrain is Ray's private instance, built over a private personal corpus. Mission Brain surfaces cited passages from prior user output. It does not synthesize. It does not rephrase. In the query path it does not generate text at all. It retrieves, cites, and stops.

Most second-brain tools conflate two jobs: finding what you already wrote and generating new text on top of it. The second job contaminates the first: once the tool starts writing, the user stops reading. Mission Brain separates them architecturally. The query engine cannot import a generative LLM client. That restriction is enforced by a module-boundary AST scan, not a prompt.
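A module-boundary scan of this kind can be sketched in a few lines, assuming the query engine is Python source. This is an illustration, not the repository's actual check: the function name `forbidden_imports` and the deny-list contents are hypothetical, and the real enforcement site may differ.

```python
import ast

# Hypothetical deny-list of generative-client packages. The project's real
# list is not given in the text; these names are placeholders.
FORBIDDEN_ROOTS = {"openai", "anthropic"}


def forbidden_imports(source: str, forbidden=FORBIDDEN_ROOTS) -> list[str]:
    """Return the forbidden module names imported anywhere in `source`."""
    hits = []
    # ast.walk visits nested scopes too, so imports hidden inside
    # functions or conditionals are still found.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) have no absolute root to check.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        hits.extend(n for n in names if n.split(".")[0] in forbidden)
    return sorted(set(hits))
```

A test would read every file under the query package, pass its text to `forbidden_imports`, and fail if any list comes back non-empty. Because the scan parses source rather than executing it, it does not depend on the forbidden packages being installed.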

The architecture

Mission Brain follows Andrej Karpathy's three-layer LLM-Wiki pattern:

Raw sources. A folder of documents. The authoritative corpus. Nothing is ever rewritten here.
Wiki. LLM-maintained markdown pages that synthesize across the sources. Every paragraph carries a citation marker pointing back to a specific passage.
Schema. A single file the ingest LLM reads as in-context instruction before emitting any wiki page. The schema encodes the retrieval-only and citation-first invariants as rules the model cannot bypass.

Ingest is the only phase where the LLM generates. Query does not generate. Lint does not generate.

The four invariants

Each invariant maps to a code-enforced rule plus a named test. A failing test is the verification gate rejecting the change. Prompts can be ignored; Boolean gates cannot.

Citation floor. Every non-empty paragraph in every wiki page contains at least one well-formed citation marker. Headings and code blocks are exempt because they do not make claims. A citation-less paragraph is a hallucination by definition and will not be written.
Idempotent regeneration. Running ingest twice over the same corpus and config produces byte-identical output. There is no hidden state in the synthesis layer; the raw corpus remains the single source of truth.
User-editable, user-rejectable. Hand edits to wiki pages live in a sibling overlay file. The next ingest never clobbers them. Hand edits are the source of truth on synthesis quality; the LLM defers.
Query-time co-visibility. The query API always returns both the synthesized wiki page and its raw citations. There is no endpoint that surfaces a page without its supporting anchors. A result containing a page but an empty citation list is broken by definition.
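Two of these invariants lend themselves to a compact illustration. The sketch below shows one way co-visibility and idempotent regeneration could be expressed as Boolean gates. All names here (`QueryResult`, `digest`, `is_idempotent`) and the shape of `ingest` as a function returning a path-to-content mapping are assumptions made for the example, not the project's API.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryResult:
    """A wiki page together with the raw citations that support it."""
    page: str
    citations: tuple[str, ...]

    def __post_init__(self):
        # Co-visibility: a page without supporting anchors cannot be constructed.
        if not self.citations:
            raise ValueError("query result has a page but no citations")


def digest(pages: dict[str, str]) -> str:
    """Hash a {path: content} mapping in sorted-path order for byte-level comparison."""
    h = hashlib.sha256()
    for path in sorted(pages):
        h.update(path.encode("utf-8") + b"\0" + pages[path].encode("utf-8") + b"\0")
    return h.hexdigest()


def is_idempotent(ingest, corpus) -> bool:
    """Run a hypothetical `ingest(corpus) -> {path: content}` twice and compare digests."""
    return digest(ingest(corpus)) == digest(ingest(corpus))
```

Putting the co-visibility check in the constructor means no code path can build a result that lacks citations, which matches the claim that such a result is broken by definition.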

Refusal is the feature

If a query asks Mission Brain to synthesize, for example "summarize my three takes on X", the right response is:

Retrieval returned N passages across M sources. Here they are, each with its citation. Synthesis is out of scope by design. Read the passages and decide.
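As a rough illustration, a response of that shape could be assembled without any generative step, using only counting and string formatting. The `Passage` type, its fields, and the function name are hypothetical; the locator values used with it are placeholders, since the text does not fix a concrete locator syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """A retrieved passage; field names are assumptions for this sketch."""
    source_id: str
    locator: str
    text: str


def refusal_response(passages: list[Passage]) -> str:
    """Build the retrieval-only reply: counts, the fixed message, then cited passages."""
    n = len(passages)
    m = len({p.source_id for p in passages})  # number of distinct sources
    lines = [
        f"Retrieval returned {n} passages across {m} sources. "
        "Here they are, each with its citation. "
        "Synthesis is out of scope by design. Read the passages and decide."
    ]
    # Each passage is emitted verbatim, prefixed by its citation marker.
    lines.extend(f"[ref:{p.source_id}:{p.locator}] {p.text}" for p in passages)
    return "\n".join(lines)
```

Nothing in the function rewrites passage text: the only strings it creates are the fixed message and the citation prefixes.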

Refusal preserves the thing the tool is for: surfacing what the user actually wrote, so the user can read it and think. The moment the tool starts paraphrasing, the user stops doing the thinking. That failure mode is serious enough to justify building refusal into the architecture rather than offering it as a configurable setting.

Citation grammar

Every claim in every wiki page carries a marker of the form [ref:<source_id>:<locator>], where the locator is one of a line range, a page number, an audio timestamp, or a structured paragraph anchor. Multiple citations per paragraph are allowed and encouraged. One anchor per paragraph is the floor, not the target.
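The marker form and the citation floor can be checked together with a regular expression. The sketch below makes several assumptions that the text does not settle: that a source id contains no colon, bracket, or whitespace; that a locator is any non-empty run without brackets (so a timestamp with colons is accepted); and that wiki pages use `#` headings and triple-backtick fences. The function names are hypothetical.

```python
import re

# Assumed concrete syntax. Only the outer [ref:<source_id>:<locator>] shape
# comes from the text; the character classes here are illustrative.
CITATION = re.compile(r"\[ref:(?P<source_id>[^\[\]:\s]+):(?P<locator>[^\[\]]+)\]")

# Fenced code blocks are exempt from the floor, so they are removed first.
FENCED_BLOCK = re.compile(r"^```.*?^```[ \t]*$", re.DOTALL | re.MULTILINE)


def citations(text: str) -> list[tuple[str, str]]:
    """Return (source_id, locator) pairs for every marker found in `text`."""
    return [(m["source_id"], m["locator"]) for m in CITATION.finditer(text)]


def uncited_paragraphs(page: str) -> list[str]:
    """Return the paragraphs of a markdown page that carry no citation marker."""
    bad = []
    for para in re.split(r"\n\s*\n", FENCED_BLOCK.sub("", page)):
        # Heading lines are exempt; drop them before checking the rest.
        body = "\n".join(
            line for line in para.splitlines() if not line.lstrip().startswith("#")
        ).strip()
        if body and not CITATION.search(body):
            bad.append(body)
    return bad
```

A citation-floor test would assert that `uncited_paragraphs` returns an empty list for every generated page. Validating each locator kind (line range, page number, timestamp, paragraph anchor) would need the real locator syntax, which this sketch deliberately leaves opaque.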

Relationship to GeneralStaff

Mission Brain and GeneralStaff are two sides of the same infrastructure. GeneralStaff handles execution: autonomous coding cycles with a verification gate that catches false completions. Mission Brain handles memory: retrieval that refuses to hallucinate on top of the corpus. Both answer the same Hammerstein question from different angles: how do you prevent industrious-without-judgment failure when the industrious agent is a language model?

Full framework: Von Hammerstein's Ghost, In Daily Use. Why the refusal is architectural, not prompt-based: Boolean Gates, Not Prompts.

How to follow

The RayBrain corpus is private and will remain so. Mission Brain is the open-source template: schema, invariants, enforcement sites, test harness. The repository is public at github.com/lerugray/mission-brain. The fastest way to reach me is lerugray@gmail.com.