v1.4.0 · Technical Architecture

How Monobrain Thinks

A self-learning Claude Code orchestration layer. Every interaction intercepted, routed, remembered, and improved. Here's the full picture.

8 CJS Runtime Modules
14 Hook Types
138 MCP Tools
4 Memory Tiers
26 CLI Commands

Architecture

System Overview

All components wired through hook-handler.cjs — the central orchestrator that every Claude Code event flows through.

CLAUDE CODE (IDE / CLI): User submits prompt → Hook events fire (JSON via stdin/stdout) → Engineered response returned
↓ Hook events
hook-handler.cjs: Runtime Coordinator · ~1000 lines
memory-palace.cjs: 4-Tier Memory · BM25 + KG (drawers.jsonl · closets.jsonl · kg.json)
router.cjs: 4-Tier Waterfall Routing (Skills · RouteLayer · Agents)
intelligence.cjs: Jaccard Scoring · Patterns (auto-memory-store.json · insights)
token-tracker.cjs: JSONL · Cost Dashboard
.monobrain/palace/: drawers · closets · kg.json · identity.md
60+ Agent Types · 26 Skills: Regex · Semantic · Specialist routing
.monobrain/data/: auto-memory-store · ranked-context
TypeScript Packages: @monobrain/cli · @monobrain/memory · @monobrain/graph · @monobrain/hooks · @monobrain/security · @monobrain/guidance

Modules

Runtime Components

Eight CJS helpers form the actual running system — no compilation, no npm deps, pure Node.js built-ins.

Runtime Coordinator

hook-handler.cjs

Central dispatcher handling all 14 hook types. Wires every sub-system. On session-restore alone, it runs 8 sequential phases to construct the full prompt context Claude receives.

~1000 lines · 14 hook types · runWithTimeout · safeRequire
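The two helper names above suggest a defensive pattern: load each sub-system optionally, and cap how long any handler may run. The following is a minimal sketch of that idea, not the actual hook-handler.cjs code. The signatures, the null return value, and the fallback argument are assumptions made for illustration.

```javascript
// Hypothetical sketch: load an optional module, returning null instead of throwing.
function safeRequire(modulePath) {
  try {
    return require(modulePath);
  } catch (err) {
    return null; // a missing or broken sub-system should not crash the hook
  }
}

// Hypothetical sketch: resolve with `fallback` if `fn` throws or exceeds `ms`.
function runWithTimeout(fn, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  const work = Promise.resolve().then(fn).catch(() => fallback);
  // Clear the timer either way so the process can exit promptly.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage: run one phase under a 3s cap and carry on with whatever comes back.
const pathModule = safeRequire('node:path');
runWithTimeout(() => pathModule.basename('/tmp/demo.txt'), 3000, null)
  .then((result) => console.log(result));
```

Resolving with a fallback rather than rejecting keeps a slow or failing phase from blocking the rest of the hook, which fits a handler that must always answer Claude Code.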
🏛

Persistent Memory

memory-palace.cjs

Cross-session memory with 4-tier retrieval hierarchy. L0 identity, L1 top-5 scored drawers, L2 namespace recall, L3 Okapi BM25 full-text. Zero AI calls — 100% local and deterministic.

~400 lines · BM25 K1=1.5 · drawers.jsonl · KG triples · closet boost
🧠

Pattern Intelligence

intelligence.cjs

Jaccard-scored context retrieval from learned patterns. At session-end, consolidates pending-insights.jsonl. Guarded by a 10 MB file-size limit and a 5000-node limit.

~250 lines · Jaccard score · confidence-weighted · 13 categories
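Jaccard scoring compares two token sets by dividing the size of their intersection by the size of their union. The sketch below shows one way confidence-weighted retrieval could work. The tokenizer, the entry shape `{ text, confidence }`, and the function names are assumptions, not the contents of intelligence.cjs.

```javascript
// Split text into a set of lowercase alphanumeric tokens.
function tokenSet(text) {
  return new Set(String(text).toLowerCase().match(/[a-z0-9]+/g) || []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 when both sets are empty.
function jaccard(a, b) {
  const setA = tokenSet(a);
  const setB = tokenSet(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const token of setA) if (setB.has(token)) shared += 1;
  return shared / (setA.size + setB.size - shared);
}

// Assumed entry shape: { text, confidence } with confidence in [0, 1].
function rankPatterns(prompt, entries, limit = 3) {
  return entries
    .map((entry) => ({ ...entry, score: jaccard(prompt, entry.text) * entry.confidence }))
    .filter((entry) => entry.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

console.log(jaccard('fix auth bug', 'fix login bug')); // 2 shared of 4 distinct tokens
```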
🔀

Agent Router

router.cjs

Multi-tier waterfall routing from natural language to optimal agent. Non-dev detection → regex patterns → semantic RouteLayer → keyword fallback. Writes last-route.json for statusline.

~275 lines · 4-tier waterfall · 60+ agents · 0.85 confidence
💰

Cost Tracking

token-tracker.cjs

Full codeburn pipeline port. Parses JSONL sessions, deduplicates subagent chains, calculates API costs with per-model pricing, renders ANSI dashboard with 6 panels. Auto-injected at session-restore.

~1100 lines · JSONL parser · UTC-safe · 13-category · ANSI dashboard
🔒

Safety Layer

pre-bash validation

Built into hook-handler. Blocks destructive shell patterns before Claude can execute them: rm -rf /, fork bombs, dd zero-fill. Returns {action:"block"} to Claude Code, which then prevents execution.

rm -rf / · fork bombs · dd /dev/zero · PreToolUse
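A pattern-based validator of this kind can be sketched in a few lines. The regular expressions, the `reason` field, and the `{ action: 'allow' }` result are illustrative assumptions. Only the `{action:"block"}` response and the three pattern families come from the description above.

```javascript
// Illustrative patterns only; the real list in hook-handler.cjs may differ.
const DESTRUCTIVE_PATTERNS = [
  { name: 'rm -rf /', regex: /\brm\s+-rf\s+\/(?:\s|$)/ },
  { name: 'fork bomb', regex: /:\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/ },
  { name: 'dd zero-fill', regex: /\bdd\s+if=\/dev\/zero\b/ },
];

function validateBashCommand(command) {
  for (const { name, regex } of DESTRUCTIVE_PATTERNS) {
    if (regex.test(command)) {
      return { action: 'block', reason: `Matched destructive pattern: ${name}` };
    }
  }
  return { action: 'allow' }; // assumed shape for the non-blocking case
}

console.log(validateBashCommand('rm -rf /'));
console.log(validateBashCommand('rm -rf ./build'));
```

A denylist like this only catches the literal forms it names, so it works as a last-line guard rather than a complete sandbox.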
💾

Session State

session.cjs

Manages .monobrain/sessions/current.json lifecycle. restore(), end(), and metric() calls track task counts and session duration. Archives on end to session-{id}.json.

current.json · archive · duration · task counter
🗂

Key-Value Store

memory.cjs

Simple flat-file key-value memory store. Backed by .monobrain/data/memory.json. Used for lightweight cross-session state that doesn't need full Memory Palace indexing.

memory.json · get/set/del · namespace

Prompt Engineering

How a Prompt Gets Built

From raw user input to fully-engineered context — every step that fires before Claude sees your message.

1

Session Restore — Identity Injection

On every new conversation, session.restore() loads current.json. Memory Palace injects [MEMORY_PALACE_L0] — the static identity.md containing project name, stack, key packages, git remote, and working style. This is always the first thing Claude reads.

2

Essential Story — Top-5 Scored Drawers

From drawers.jsonl, the top-5 entries by score (within last 30 days) are injected as [MEMORY_PALACE_L1]. Score rises on retrieval frequency — frequently-useful memories bubble up automatically. No manual curation needed.

score bumped on every recall → high-frequency memories promoted to L1
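The selection and promotion logic described in this step can be sketched as a filter, a sort, and a score bump. The drawer fields (`id`, `score`, `ts`) and the bump size of 1 are assumptions, since the source does not show the drawers.jsonl schema.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed drawer shape: { id, text, score, ts } where ts is an ISO timestamp.
function topDrawers(drawers, now = Date.now(), limit = 5, maxAgeDays = 30) {
  const cutoff = now - maxAgeDays * DAY_MS;
  return drawers
    .filter((drawer) => Date.parse(drawer.ts) >= cutoff) // keep the last 30 days
    .sort((a, b) => b.score - a.score)                   // highest score first
    .slice(0, limit);
}

// Recalling a drawer bumps its score so frequently used memories rise over time.
function recordRecall(drawer, bump = 1) {
  return { ...drawer, score: drawer.score + bump };
}

const now = Date.parse('2024-06-30T00:00:00Z');
const drawers = [
  { id: 'a', score: 9, ts: '2024-01-01T00:00:00Z' }, // too old, excluded
  { id: 'b', score: 2, ts: '2024-06-20T00:00:00Z' },
  { id: 'c', score: 5, ts: '2024-06-25T00:00:00Z' },
];
console.log(topDrawers(drawers, now).map((d) => d.id));
```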
3

Knowledge Base Preload

CLAUDE.md + docs/*.md are scanned, chunked, and keyword-indexed. Most relevant excerpts for this session are injected as [KNOWLEDGE_PRELOADED]. Shared agent instructions from .agents/shared_instructions.md are added as [SHARED_INSTRUCTIONS] (1500 char cap).

4

Token Cost Awareness

token-tracker.quickSummary() parses current month's JSONL session files, aggregates today + monthly spend, and injects as [TOKEN_USAGE]. Claude sees actual API spend before starting work. All timestamps handled in UTC to avoid timezone drift.

[TOKEN_USAGE] Today: $17.69 (414 calls) | Month: $2249.47 (39330 calls)
5

User Prompt Submitted → Route Hook Fires

Every UserPromptSubmit triggers the route hook. Intelligence scans auto-memory-store.json for Jaccard-scored relevant past patterns → injected as [INTELLIGENCE]. Router runs its 4-tier waterfall producing the routing panel Claude and the user see in terminal.

6

Task Complexity Scoring → Model Recommendation

Pre-task hook scores the task description 0–100 based on word count + keywords. Score <30 → Haiku recommended. Score >70 → Opus recommended. Scores in between default to Sonnet. Injected as [TASK_MODEL_RECOMMENDATION]. This informs agent spawning decisions.

[TASK_MODEL_RECOMMENDATION] Use model="haiku" (score: 18/100)
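As a rough illustration of how such a score might be computed and mapped to a model, consider the sketch below. The keyword list, the point weights, and the Sonnet middle band are assumptions. Only the 0 to 100 range, the 30 and 70 thresholds, and the output tag come from the text above.

```javascript
// Hypothetical keyword list and weights, chosen only for illustration.
const COMPLEX_KEYWORDS = ['architecture', 'security', 'refactor', 'migrate', 'distributed'];

function scoreComplexity(description) {
  const words = description.toLowerCase().match(/[a-z0-9-]+/g) || [];
  const lengthPoints = Math.min(words.length, 50); // up to 50 points for length
  const keywordHits = COMPLEX_KEYWORDS.filter((k) => words.includes(k)).length;
  const keywordPoints = Math.min(keywordHits * 20, 50); // up to 50 points for keywords
  return lengthPoints + keywordPoints;
}

function recommendModel(score) {
  if (score < 30) return 'haiku';
  if (score > 70) return 'opus';
  return 'sonnet'; // assumed middle band
}

function formatRecommendation(description) {
  const score = scoreComplexity(description);
  return `[TASK_MODEL_RECOMMENDATION] Use model="${recommendModel(score)}" (score: ${score}/100)`;
}

console.log(formatRecommendation('rename a variable'));
```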
7

Response + Post-Task Memory Storage

After each task completes, post-task fires. Task content is chunked into 800-char segments (100-char overlap) and stored in drawers.jsonl. Closet terms extracted via regex. KG triple added. Future sessions can recall what was done here.

memory-palace.storeVerbatim(cwd, taskContent, {wing:'tasks', room:'active'})
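The 800-character chunking with a 100-character overlap can be sketched as a sliding window that advances 700 characters at a time. The function name `chunkText` is hypothetical, and the real storeVerbatim() may split on different boundaries.

```javascript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one.
function chunkText(text, size = 800, overlap = 100) {
  const chunks = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the final chunk reached the end
  }
  return chunks;
}

// 1700 characters of repeating letters, used only as sample input.
const sample = Array.from({ length: 1700 }, (_, i) => String.fromCharCode(97 + (i % 26))).join('');
console.log(chunkText(sample).map((chunk) => chunk.length));
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.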
8

Session End → Consolidation + Archive

session-end triggers intelligence.consolidate() (clears pending-insights.jsonl), session.end() archives current.json, and Memory Palace stores a session-end temporal triple in kg.json with a valid_from timestamp.
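A temporal triple can be pictured as a subject, predicate, object record stamped with the time it became valid. In the sketch below only the `valid_from` field is taken from the description. The other field names, the helper names, and the example values are hypothetical, and the actual kg.json layout may differ.

```javascript
// Assumed triple shape: { subject, predicate, object, valid_from }.
function makeTriple(subject, predicate, object, at = new Date()) {
  return { subject, predicate, object, valid_from: at.toISOString() };
}

// Return triples about `subject` that were already valid at time `asOf`.
function triplesAsOf(triples, subject, asOf) {
  const limit = Date.parse(asOf);
  return triples.filter(
    (t) => t.subject === subject && Date.parse(t.valid_from) <= limit
  );
}

const kg = [
  makeTriple('session-1', 'ended_at', 'boundary', new Date('2024-06-01T10:00:00Z')),
  makeTriple('session-1', 'ended_at', 'boundary', new Date('2024-06-05T10:00:00Z')),
];
console.log(triplesAsOf(kg, 'session-1', '2024-06-02T00:00:00Z').length);
```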

Memory

Memory Palace — 4-Tier Hierarchy

Zero AI calls. Entirely deterministic and local. Inspired by MemPalace arXiv architecture.

L0

Identity — Always Loaded

Static .monobrain/palace/identity.md — project name, stack, key packages, working style, git remote. Injected verbatim as [MEMORY_PALACE_L0] on every session. Never auto-overwritten.

Always injected · ~500 chars
L1

Essential Story — Top-5 Scored Drawers

Reads drawers.jsonl, scores entries within last 30 days, picks top-5 by retrieval score. Auto-injected as [MEMORY_PALACE_L1]. Scores rise with every recall — high-frequency memories stay visible.

On session-restore · top-5 by score
L2

On-Demand Namespace Recall

recall(wing, room, limit) retrieves top-scored drawers from a specific namespace. Called explicitly by hook code or agents needing focused context. Wings: tasks, sessions, architecture, debugging, general.

On-demand · namespace-scoped
L3

Deep BM25 Full-Text Search

search(query, wing?, room?, limit?) runs Okapi BM25 (K1=1.5, B=0.75) across all drawers + closet term boost (+0.5 per matching topic). Most expensive but most comprehensive. Supports temporal KG queries via kgQuery().

Explicit call only · full corpus · BM25 + closet boost

BM25 Formula (K1=1.5, B=0.75, closet boost=+0.5)

score(d,q) = Σt∈q IDF(t) × (f(t,d) × 2.5) / (f(t,d) + 1.5 × (0.25 + 0.75 × |d|/avgdl))
IDF(t) = log((N − df(t) + 0.5) / (df(t) + 0.5) + 1)
final_score = bm25_score + Σ closet_boost(term) × 0.5
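The formula above translates fairly directly into code. The sketch below is an assumption-laden illustration rather than the memory-palace.cjs implementation: the drawer shape `{ text, closet }`, the tokenizer, and the function name `bm25Search` are all hypothetical.

```javascript
const K1 = 1.5;
const B = 0.75;
const CLOSET_BOOST = 0.5;

function tokenize(text) {
  return String(text).toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Assumed drawer shape: { text, closet } where closet is an array of topic terms.
function bm25Search(query, drawers) {
  const docs = drawers.map((drawer) => tokenize(drawer.text));
  const N = docs.length;
  const avgdl = docs.reduce((sum, doc) => sum + doc.length, 0) / (N || 1);

  // Document frequency: number of drawers containing each term.
  const df = new Map();
  for (const doc of docs) {
    for (const term of new Set(doc)) df.set(term, (df.get(term) || 0) + 1);
  }

  const queryTerms = [...new Set(tokenize(query))];
  return drawers
    .map((drawer, i) => {
      let score = 0;
      for (const term of queryTerms) {
        const f = docs[i].filter((t) => t === term).length; // term frequency f(t,d)
        if (f > 0) {
          const n = df.get(term);
          const idf = Math.log((N - n + 0.5) / (n + 0.5) + 1);
          score += (idf * f * (K1 + 1)) / (f + K1 * (1 - B + (B * docs[i].length) / avgdl));
        }
        // Closet boost: +0.5 for each query term listed as a topic of this drawer.
        if ((drawer.closet || []).includes(term)) score += CLOSET_BOOST;
      }
      return { drawer, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score);
}

const sampleDrawers = [
  { text: 'fixed the router regex bug', closet: ['router'] },
  { text: 'updated billing dashboard colors', closet: [] },
  { text: 'router notes', closet: [] },
];
console.log(bm25Search('router bug', sampleDrawers).map((hit) => hit.drawer.text));
```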

Routing

4-Tier Agent Router

Every user prompt traverses this waterfall from top to bottom — first match wins.

TIER 0
Non-Dev Specialist Detection

If prompt matches marketing · sales · product · UI design · blockchain · legal → routes to "extras" with confidence 0.85. Returns top-8 specialist agents immediately. Skips all remaining tiers.

0.85 conf
TIER 1
Task Pattern Regex Matching

TASK_PATTERNS checked in priority order: implement|create|build → coder (0.8). fix|debug|error → reviewer (0.85). test|spec → tester (0.8). document|explain → researcher (0.75). security|auth → security-architect (0.9).

0.75–0.9
TIER 2
Semantic RouteLayer

TF-IDF weighted keyword matching against agent capability profiles. Computes cosine similarity between prompt token vector and each agent's skill vector. Returns ranked list with similarity scores. Handles nuanced multi-concept prompts.

cosine sim
TIER 3
Keyword Fallback

Simple token overlap between prompt and agent descriptions. Last resort — always produces a result. Default confidence 0.5 with "Default routing" reason. Prevents routing failure for any input.

0.50 conf
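The first-match-wins waterfall can be sketched as a chain of early returns. The patterns and confidences below are copied from the tier descriptions, while the word-boundary anchors, the `reason` strings for Tiers 0 and 1, and the default agent in Tier 3 are assumptions. Tier 2 is left out to keep the sketch short.

```javascript
// Tier 0: non-dev topics named in the description above.
const NON_DEV = /\b(marketing|sales|product|ui design|blockchain|legal)\b/i;

// Tier 1: checked in priority order, so earlier entries win.
const TASK_PATTERNS = [
  { regex: /\b(implement|create|build)\b/i, agent: 'coder', confidence: 0.8 },
  { regex: /\b(fix|debug|error)\b/i, agent: 'reviewer', confidence: 0.85 },
  { regex: /\b(test|spec)\b/i, agent: 'tester', confidence: 0.8 },
  { regex: /\b(document|explain)\b/i, agent: 'researcher', confidence: 0.75 },
  { regex: /\b(security|auth)\b/i, agent: 'security-architect', confidence: 0.9 },
];

function route(prompt) {
  if (NON_DEV.test(prompt)) {
    return { agent: 'extras', confidence: 0.85, reason: 'Non-dev specialist' };
  }
  for (const { regex, agent, confidence } of TASK_PATTERNS) {
    if (regex.test(prompt)) return { agent, confidence, reason: `Matched ${regex.source}` };
  }
  // Tier 2 (semantic RouteLayer) is omitted from this sketch.
  // Tier 3: always return something; the default agent name is an assumption.
  return { agent: 'coder', confidence: 0.5, reason: 'Default routing' };
}

console.log(route('fix the login error'));
console.log(route('hello there'));
```

Because Tier 1 is ordered, a prompt such as "implement auth middleware" goes to the coder even though it also matches the security pattern.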

Cost Tracking

Token Tracker Pipeline

Full codeburn port — parses native Claude Code JSONL sessions, deduplicates subagent chains, calculates real API cost.

📂
JSONL Discovery

~/.claude/projects/**/*.jsonl recursive scan, including subagent dirs

🔍
Parse & Dedup

Read assistant entries, deduplicate by msg.id across subagent chains

🏷
Classify Turn

13-category classifier: tools first, then keyword patterns

💵
Calculate Cost

multiplier × (in×rate + out×rate + cache reads/writes + webSearch)

📊
Render Dashboard

6 ANSI panels: Overview, Projects, Models, Daily chart, Tools, MCP
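The parse and dedup stage of this pipeline might look like the sketch below. The entry shape `{ type: 'assistant', message: { id, usage } }` is an assumption about the session files, and `parseAndDedup` is a hypothetical name rather than a token-tracker.cjs export.

```javascript
// Parse JSONL text, keep assistant entries, and drop repeats of the same message id.
// Passing one shared `seen` set across files deduplicates across subagent chains.
function parseAndDedup(jsonlText, seen = new Set()) {
  const entries = [];
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // skip truncated or corrupt lines
    }
    if (!entry || entry.type !== 'assistant') continue;
    const id = entry.message && entry.message.id;
    if (!id || seen.has(id)) continue;
    seen.add(id);
    entries.push(entry);
  }
  return entries;
}

const sessionText = [
  JSON.stringify({ type: 'user', message: { id: 'u1' } }),
  JSON.stringify({ type: 'assistant', message: { id: 'm1', usage: { input_tokens: 10 } } }),
  JSON.stringify({ type: 'assistant', message: { id: 'm1', usage: { input_tokens: 10 } } }),
  '{"type":"assistant","message":', // truncated line
  JSON.stringify({ type: 'assistant', message: { id: 'm2', usage: { input_tokens: 5 } } }),
].join('\n');
console.log(parseAndDedup(sessionText).map((entry) => entry.message.id));
```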

Cost Formula

cost = multiplier ×
  (inputTokens × in_rate
  + outputTokens × out_rate
  + cacheWrite × cw_rate
  + cacheRead × cr_rate
  + webSearch × $0.01)
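The cost formula maps onto a single function. In this sketch the rates are assumed to be expressed in USD per million tokens, and the rate values in the usage example are placeholders chosen for easy arithmetic, not real model pricing.

```javascript
// Assumption: each rate is in USD per million tokens.
function calculateCost(usage, rates, multiplier = 1) {
  const tokenCost = (count, rate) => ((count || 0) * rate) / 1e6;
  const subtotal =
    tokenCost(usage.inputTokens, rates.in) +
    tokenCost(usage.outputTokens, rates.out) +
    tokenCost(usage.cacheWrite, rates.cacheWrite) +
    tokenCost(usage.cacheRead, rates.cacheRead) +
    (usage.webSearch || 0) * 0.01; // $0.01 per web search, as in the formula
  return multiplier * subtotal;
}

// Placeholder rates for illustration only.
const placeholderRates = { in: 1, out: 2, cacheWrite: 3, cacheRead: 4 };
const usage = { inputTokens: 1000000, outputTokens: 500000, cacheWrite: 0, cacheRead: 0, webSearch: 2 };
console.log(calculateCost(usage, placeholderRates).toFixed(2));
```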

Fast Mode Multipliers

Opus 4.6 (fast)
Sonnet 4.6
Haiku 4.5
GPT-4o

UTC Timezone Rule

JSONL timestamps are UTC ISO strings. Never use new Date(y,m,d) for date math — it's local time. Always use Date.UTC() or now.toISOString().slice(0,10).
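The rule can be made concrete with a few helpers built only on `Date.UTC()`, the `getUTC*` accessors, and `toISOString()`. The helper names are hypothetical.

```javascript
// Day key in UTC, matching the date part of a JSONL ISO timestamp.
function utcDayKey(date) {
  return date.toISOString().slice(0, 10);
}

// Millisecond timestamp for the first instant of the date's UTC month.
function startOfUtcMonth(date) {
  return Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), 1);
}

// True when an ISO timestamp falls in the same UTC month as `now`.
function isInCurrentUtcMonth(isoTimestamp, now = new Date()) {
  const ts = Date.parse(isoTimestamp);
  const start = startOfUtcMonth(now);
  // Date.UTC rolls month 12 over into January of the next year.
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1);
  return ts >= start && ts < next;
}

console.log(utcDayKey(new Date('2024-03-01T23:30:00Z')));
```

With `new Date(y, m, d)` the same session could land on a different day, or even a different month, depending on the machine's timezone, which would skew the daily and monthly totals.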

Hook System

14 Hook Types

Claude Code fires these events at precise lifecycle moments — hook-handler.cjs handles every one.

session-restore

Fires on every conversation start. Runs 8 phases: restore → intelligence → workers → knowledge → instructions → memory palace → tokens → microagent index. Most complex handler.

session-end

Fires on conversation end. Consolidates insights, archives session JSON, stores temporal KG triple for the session boundary.

route (UserPromptSubmit)

Every user message. Intelligence context retrieval → semantic routing → MicroAgent scan → prints routing panel → writes last-route.json.

pre-task

Before task execution. Increments task counter. Scores complexity 0–100 → outputs [TASK_MODEL_RECOMMENDATION] for Haiku/Sonnet/Opus selection.

post-task

After task completion. Stores task content in Memory Palace (800-char chunks). Extracts closet terms. Saves routing pattern for future intelligence.

pre-edit / post-edit

pre-edit: reads file, suggests specialist agent. post-edit: records edit to pending-insights.jsonl via intelligence.recordEdit().

pre-bash (PreToolUse)

Safety validator. Blocks: rm -rf /, format c:, dd if=/dev/zero, fork bombs. Returns {action:"block"} — Claude Code prevents command execution.

load-agent

Reads .claude/agents/{slug}.md and prints full content — activates an agent's identity and capabilities for the current conversation turn.

notify

Sends system notifications for important events. Used by workers and the routing system to surface alerts without interrupting the main flow.

worker hooks

worker-dispatch, worker-status, worker-list, worker-cancel, worker-detect: manage 12 background intelligence workers (ultralearn, optimize, consolidate, predict, audit, etc.)

intelligence hooks

trajectory-start/step/end, pattern-store/search, attention, learn, stats: inner workings of the intelligence learning loop. Records edit patterns and trajectories.

transfer

Handles cross-session context transfer. Packages the current session's learned state for injection into a new conversation.

Performance

Speed & Scale

All numbers measured on macOS, Node.js 20, SSD-backed filesystem.

| Operation | Component | Speed | Note |
|---|---|---|---|
| L0 Identity injection | memory-palace.cjs | <1ms | Single file read |
| L1 Top-5 drawers | memory-palace.cjs | 5–15ms | JSONL parse + sort |
| L3 BM25 full search | memory-palace.cjs | 20–80ms | Per 1000 drawers |
| Jaccard context retrieval | intelligence.cjs | <5ms | Per 500 entries |
| 4-tier routing | router.cjs | <3ms | Sync regex waterfall |
| Token quickSummary | token-tracker.cjs | 200–500ms | Per month of sessions |
| Full dashboard render | token-tracker.cjs | 1–3s | All-time JSONL parse |
| Hook timeout guard | hook-handler.cjs | 3s max | runWithTimeout cap |

ADR-026 — 3-Tier Model Routing

TIER 1

Agent Booster

WASM · <1ms · $0

Simple transforms that need no LLM: var→const conversions, type additions.

TIER 2

Haiku 4.5

~500ms · $0.0002/req

Simple tasks, complexity score <30%. Most routine work.

TIER 3

Sonnet / Opus 4.6

2–5s · $0.003–0.015

Architecture, security, complex reasoning. Score >30%.