v1.10.54 · Technical Architecture

How Monomind Thinks

A self-learning Claude Code orchestration layer. Every interaction intercepted, routed, remembered, and improved. Here's the full picture.

17+ Packages
22 Hook Events
230+ Agent Types
4 Memory Tiers
53+ CLI Commands

Architecture

System Overview

All components are wired through hook-handler.cjs, the central orchestrator that every Claude Code event flows through.

CLAUDE CODE (IDE / CLI)
User submits prompt → Hook events fire (JSON via stdin/stdout) → Engineered response returned

Hook events → hook-handler.cjs (Runtime Coordinator · 22 Hook Events)

memory.cjs: 4-Tier Memory · BM25 + HNSW · drawers.jsonl · AgentDB · PartitionedHNSW
router.cjs: 4-Tier Waterfall · 230+ Agents · Skills · RouteLayer · Specialists
sona.cjs: LoRA · EWC++ · 6 RL Algorithms · PPO · DQN · A2C · Q-Learning
token-tracker.cjs: JSONL · Cost · Dashboard

.monomind/palace/: drawers · closets · kg.json · identity.md
230+ Agent Types · 110 Skills: Regex · Semantic · Specialist routing
.monomind/neural/: LoRA weights · EWC++ state · RL models

TypeScript Packages (17+): @monomind/cli · @monomind/memory · @monomind/hooks · @monomind/swarm · @monomind/neural · monograph

Modules

Runtime Components

17+ TypeScript packages form the running system — each managing a distinct functional domain.

Runtime Coordinator

hook-handler.cjs

Central dispatcher handling all 22 hook events. Wires every sub-system. On session-restore alone, it runs 8 sequential phases to construct the full prompt context Claude receives.

22 hook events · runWithTimeout · safeRequire · session phases
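
The two guard helpers named in the tags can be pictured roughly as follows. This is a minimal sketch, not the code in hook-handler.cjs: the signatures, the fallback argument, and the null return are assumptions, and the 3000 ms default simply mirrors the "3s max" cap quoted in the performance table.

```javascript
// Hypothetical sketch: race a handler against a timer so one slow phase
// cannot stall the whole hook. Resolves with `fallback` on timeout or error.
function runWithTimeout(fn, ms = 3000, fallback = null) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  const work = Promise.resolve()
    .then(fn)
    .catch(() => fallback); // a throwing handler degrades to the fallback
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical sketch: load an optional sub-system without crashing the hook.
// Returns null when the module is missing or fails to load.
function safeRequire(id) {
  try {
    return require(id);
  } catch {
    return null;
  }
}
```

With helpers like these, a missing or slow sub-system costs one phase of context rather than the whole hook response.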
🏛

Memory Palace

memory.cjs

Four-tier memory hierarchy (L0–L3) combining BM25 full-text with HNSW vector indexing. Persistent context across sessions via AgentDB backends. Cross-agent namespace sharing through PartitionedHNSW.

L0–L3 tiers · BM25 K1=1.5 · HNSW vectors · AgentDB · PartitionedHNSW
🧠

Neural Learning

sona.cjs

SONA adaptation engine with LoRA fine-tuning (rank 1–16) and EWC++ memory preservation preventing catastrophic forgetting. Six RL algorithms: PPO, DQN, A2C, DecisionTransformer, Q-Learning, SARSA.

LoRA rank 1–16 · EWC++ · PPO/DQN/A2C · SARSA · <0.05ms
🔀

Agent Router

router.cjs

Multi-tier waterfall routing from natural language to optimal agent across 230+ types. Non-dev detection → regex patterns → semantic RouteLayer → keyword fallback. Writes last-route.json for statusline.

4-tier waterfall · 230+ agents · 0.85 confidence · RouteLayer
🗺

Knowledge Graph

monograph

SQLite-backed code dependency graph with 30 MCP tools for impact analysis, path finding, and community detection. Queried automatically before every task — no manual invocation needed.

SQLite · 30 MCP tools · Louvain communities · god nodes · BFD chunking
💰

Cost Tracking

token-tracker.cjs

Full codeburn pipeline port. Parses JSONL sessions, deduplicates subagent chains, calculates API costs with per-model pricing, renders ANSI dashboard with 6 panels. Auto-injected at session-restore.

JSONL parser · UTC-safe · 13-category · ANSI dashboard
🔒

Safety Layer

pre-bash validation

Built into hook-handler. Blocks destructive shell patterns before Claude can execute them: rm -rf /, fork bombs, dd zero-fill. Returns {action:"block"} to Claude Code which prevents execution.

rm -rf / · fork bombs · dd /dev/zero · PreToolUse
💾

Session State

session.cjs

Manages .monomind/sessions/current.json lifecycle. restore(), end(), and metric() calls track task counts and session duration. Archives on end to session-{id}.json.

current.json · archive · duration · task counter

Prompt Engineering

How a Prompt Gets Built

From raw user input to fully-engineered context — every step that fires before Claude sees your message.

1

Session Restore — Identity Injection

On every new conversation, session.restore() loads current.json. Memory Palace injects [MEMORY_PALACE_L0] — the static identity.md containing project name, stack, key packages, git remote, and working style. This is always the first thing Claude reads.

2

Essential Story — Top-5 Scored Drawers

From drawers.jsonl, the top-5 entries by score (within last 30 days) are injected as [MEMORY_PALACE_L1]. Score rises on retrieval frequency — frequently-useful memories bubble up automatically. No manual curation needed.

score bumped on every recall → high-frequency memories promoted to L1
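
A minimal sketch of the L1 selection described in this step: filter to the last 30 days, sort by score, keep five. The field names `score` and `createdAt` are assumptions; the real drawers.jsonl schema is not shown here.

```javascript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Parse JSONL text into objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Pick the highest-scored drawers created within the last 30 days.
// `score` and `createdAt` are hypothetical field names.
function essentialStory(drawers, now = Date.now(), limit = 5) {
  return drawers
    .filter((d) => now - Date.parse(d.createdAt) <= THIRTY_DAYS_MS)
    .sort((a, b) => b.score - a.score) // filter() returned a copy, input is untouched
    .slice(0, limit);
}

// Recall bumps the score, so frequently used memories rise over time.
function bumpScore(drawer) {
  return { ...drawer, score: drawer.score + 1 };
}
```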
3

Knowledge Base Preload + Monograph

CLAUDE.md + docs/*.md are scanned, chunked, and keyword-indexed. Monograph automatically queries the code dependency graph — relevant files, god nodes, and impact paths are surfaced before Claude starts work.

4

Token Cost Awareness

token-tracker.quickSummary() parses current month's JSONL session files, aggregates today + monthly spend, and injects as [TOKEN_USAGE]. Claude sees actual API spend before starting work. All timestamps handled in UTC to avoid timezone drift.

[TOKEN_USAGE] Today: $17.69 (414 calls) | Month: $2249.47 (39330 calls)
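
The aggregation behind that line can be sketched as below. The entry shape (`timestamp`, `costUsd`) and the function body are assumptions made for illustration; only the output format is taken from the example above.

```javascript
// Hypothetical sketch of a quickSummary-style aggregation.
// Day and month keys are derived from ISO strings, so all comparisons are in UTC.
function quickSummary(entries, now = new Date()) {
  const today = now.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const month = today.slice(0, 7);              // "YYYY-MM"
  const totals = {
    today: { cost: 0, calls: 0 },
    month: { cost: 0, calls: 0 },
  };
  for (const entry of entries) {
    const day = new Date(entry.timestamp).toISOString().slice(0, 10);
    if (day.slice(0, 7) !== month) continue; // outside the current UTC month
    totals.month.cost += entry.costUsd;
    totals.month.calls += 1;
    if (day === today) {
      totals.today.cost += entry.costUsd;
      totals.today.calls += 1;
    }
  }
  const fmt = (t) => `$${t.cost.toFixed(2)} (${t.calls} calls)`;
  return `[TOKEN_USAGE] Today: ${fmt(totals.today)} | Month: ${fmt(totals.month)}`;
}
```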
5

User Prompt Submitted → Route Hook Fires

Every UserPromptSubmit triggers the route hook. Intelligence scans auto-memory-store.json for Jaccard-scored relevant past patterns → injected as [INTELLIGENCE]. Router runs its 4-tier waterfall producing the routing panel Claude and the user see in terminal.
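
Jaccard scoring is simple enough to show in a few lines. The sketch below assumes each stored pattern has a `text` field and that tokens are lowercase alphanumeric runs; neither detail is confirmed by this page.

```javascript
// Lowercase alphanumeric tokens as a set (tokenization rule is an assumption).
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 for two empty sets.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  return intersection / (a.size + b.size - intersection);
}

// Rank stored patterns against the prompt and keep the closest few.
// `entries[].text` is a hypothetical field name.
function relevantPatterns(prompt, entries, limit = 3) {
  const query = tokenize(prompt);
  return entries
    .map((e) => ({ ...e, similarity: jaccard(query, tokenize(e.text)) }))
    .filter((e) => e.similarity > 0)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}
```

For example, "fix auth bug" against "fix login bug" shares two of four distinct tokens, giving a similarity of 0.5.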

6

Task Complexity Scoring → Model Recommendation

Pre-task hook scores the task description 0–100 based on word count + keywords. Score <30 → Haiku recommended. Score >70 → Opus recommended. Injected as [TASK_MODEL_RECOMMENDATION]. This informs agent spawning decisions.

[TASK_MODEL_RECOMMENDATION] Use model="haiku" (score: 18/100)
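
One way such a scorer could look, under stated assumptions: the keyword list and point weights are invented for illustration, and treating the 30–70 band as Sonnet is an inference from the Haiku/Sonnet/Opus selection mentioned elsewhere on this page. Only the thresholds and the output format come from the text above.

```javascript
// Hypothetical keyword list; the real one is not shown on this page.
const COMPLEX_KEYWORDS = ["architecture", "security", "refactor", "migrate", "distributed"];

// Score 0-100 from word count plus keyword hits (weights are assumptions).
function scoreComplexity(description) {
  const words = description.trim().split(/\s+/).filter(Boolean);
  const lengthPoints = Math.min(50, words.length); // 1 point per word, capped at 50
  const lower = description.toLowerCase();
  const hits = COMPLEX_KEYWORDS.filter((k) => lower.includes(k)).length;
  const keywordPoints = Math.min(50, hits * 25);   // 25 points per keyword, capped at 50
  return Math.min(100, lengthPoints + keywordPoints);
}

// Map the score to a model: below 30 is Haiku, above 70 is Opus.
function recommendModel(score) {
  if (score < 30) return "haiku";
  if (score > 70) return "opus";
  return "sonnet"; // assumption: the middle band falls back to Sonnet
}

function formatRecommendation(description) {
  const score = scoreComplexity(description);
  return `[TASK_MODEL_RECOMMENDATION] Use model="${recommendModel(score)}" (score: ${score}/100)`;
}
```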
7

Response + Post-Task Memory Storage

After each task completes, post-task fires. Task content is chunked into 800-char segments (100-char overlap) and stored in drawers.jsonl. Closet terms extracted via regex. KG triple added. Future sessions can recall what was done here.

memory.storeVerbatim(cwd, taskContent, {wing:'tasks', room:'active'})
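
The chunking step is a sliding window: 800 characters wide, advancing 700 at a time so that consecutive chunks share 100 characters. The function name and signature below are illustrative, not the ones used in memory.cjs.

```javascript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one.
function chunkText(text, size = 800, overlap = 100) {
  if (overlap >= size) throw new RangeError("overlap must be smaller than size");
  const chunks = [];
  const step = size - overlap; // 700 with the defaults
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk reached the end of the text
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a boundary searchable from either side, at the cost of storing roughly 14% more text.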
8

Session End → Consolidation + Archive

session-end triggers intelligence.consolidate() (clears pending-insights.jsonl), session.end() archives current.json, and Memory Palace stores a session-end temporal triple in kg.json with a valid_from timestamp.

Memory

Memory Palace — 4-Tier Hierarchy

BM25 precision + HNSW vector recall. Entirely local and deterministic. Inspired by MemPalace arXiv architecture.

L0

Identity — Always Loaded

Static .monomind/palace/identity.md — project name, stack, key packages, working style, git remote. Injected verbatim as [MEMORY_PALACE_L0] on every session. Never auto-overwritten.

Always injected · ~500 chars
L1

Essential Story — Top-5 Scored Drawers

Reads drawers.jsonl, scores entries within last 30 days, picks top-5 by retrieval score. Auto-injected as [MEMORY_PALACE_L1]. Scores rise with every recall — high-frequency memories stay visible.

On session-restore · top-5 by score
L2

On-Demand Namespace Recall

recall(wing, room, limit) retrieves top-scored drawers from a specific namespace. Called explicitly by hook code or agents needing focused context. Wings: tasks, sessions, architecture, debugging, general.

On-demand · namespace-scoped
L3

Deep BM25 + HNSW Full Search

search(query, wing?, room?, limit?) runs Okapi BM25 (K1=1.5, B=0.75) across all drawers and combines it with an HNSW vector index for semantic recall. The HNSW lookup is 150×–12,500× faster than naive search. Supports temporal KG queries via kgQuery().

Explicit call only · full corpus · BM25 + HNSW

BM25 Formula (K1=1.5, B=0.75) + HNSW Vector Index

score(d,q) = Σt∈q IDF(t) × (f(t,d) × 2.5) / (f(t,d) + 1.5 × (0.25 + 0.75 × |d|/avgdl))
IDF(t) = log((N − df(t) + 0.5) / (df(t) + 0.5) + 1)
final_score = bm25_score + hnsw_cosine_similarity + Σ closet_boost(term) × 0.5
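
The BM25 part of the formula translates directly into code. The sketch below covers only the bm25_score term with K1=1.5 and B=0.75; the HNSW cosine term and closet boosts are left out, and the tokenizer is an assumption.

```javascript
const K1 = 1.5;
const B = 0.75;

// Lowercase alphanumeric tokens (tokenization rule is an assumption).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Returns one Okapi BM25 score per document, in input order.
function bm25Scores(query, docs) {
  const tokenized = docs.map(tokenize);
  const N = tokenized.length;
  const avgdl = tokenized.reduce((sum, d) => sum + d.length, 0) / (N || 1);

  // df(t): number of documents containing term t
  const df = new Map();
  for (const doc of tokenized) {
    for (const t of new Set(doc)) df.set(t, (df.get(t) || 0) + 1);
  }

  const queryTerms = [...new Set(tokenize(query))];
  return tokenized.map((doc) => {
    const tf = new Map();
    for (const t of doc) tf.set(t, (tf.get(t) || 0) + 1);
    let score = 0;
    for (const t of queryTerms) {
      const f = tf.get(t) || 0;
      if (f === 0) continue; // term absent from this document
      const n = df.get(t);
      const idf = Math.log((N - n + 0.5) / (n + 0.5) + 1);
      score += (idf * (f * (K1 + 1))) / (f + K1 * (1 - B + (B * doc.length) / avgdl));
    }
    return score;
  });
}
```

With K1=1.5 the numerator factor K1+1 is the 2.5 in the formula, and 1−B is the 0.25.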

Routing

4-Tier Agent Router

Every user prompt traverses this waterfall from top to bottom — first match wins across 230+ agent types.

TIER 0
Non-Dev Specialist Detection

If prompt matches marketing · sales · product · UI design · blockchain · legal → routes to "extras" with confidence 0.85. Returns top-8 specialist agents immediately. Skips all remaining tiers.

0.85 conf
TIER 1
Task Pattern Regex Matching

TASK_PATTERNS checked in priority order: implement|create|build → coder (0.8). fix|debug|error → reviewer (0.85). test|spec → tester (0.8). document|explain → researcher (0.75). security|auth → security-architect (0.9).

0.75–0.9
TIER 2
Semantic RouteLayer

TF-IDF weighted keyword matching against agent capability profiles. Computes cosine similarity between prompt token vector and each agent's skill vector. Returns ranked list with similarity scores. Handles nuanced multi-concept prompts.

cosine sim
TIER 3
Keyword Fallback

Simple token overlap between prompt and agent descriptions. Last resort — always produces a result. Default confidence 0.5 with "Default routing" reason. Prevents routing failure for any input.

0.50 conf
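
The waterfall above can be sketched as a single function that returns on the first match. The patterns and confidences are the ones listed; the Tier 2 layer is passed in as a stub, and the fallback agent name is an assumption, since Tier 3 actually picks by token overlap with agent descriptions.

```javascript
// Tier 0: non-dev topics listed above.
const NON_DEV = /\b(marketing|sales|product|ui design|blockchain|legal)\b/i;

// Tier 1: task patterns in priority order, with the confidences quoted above.
const TASK_PATTERNS = [
  { re: /\b(implement|create|build)\b/i, agent: "coder", confidence: 0.8 },
  { re: /\b(fix|debug|error)\b/i, agent: "reviewer", confidence: 0.85 },
  { re: /\b(test|spec)\b/i, agent: "tester", confidence: 0.8 },
  { re: /\b(document|explain)\b/i, agent: "researcher", confidence: 0.75 },
  { re: /\b(security|auth)\b/i, agent: "security-architect", confidence: 0.9 },
];

// `semanticLayer` stands in for Tier 2; it returns {agent, confidence} or null.
function route(prompt, semanticLayer = () => null) {
  if (NON_DEV.test(prompt)) {
    return { agent: "extras", confidence: 0.85, tier: 0 };
  }
  for (const p of TASK_PATTERNS) {
    if (p.re.test(prompt)) {
      return { agent: p.agent, confidence: p.confidence, tier: 1 };
    }
  }
  const semantic = semanticLayer(prompt);
  if (semantic) return { ...semantic, tier: 2 };
  // Tier 3 always answers. "coder" is a placeholder default (assumption).
  return { agent: "coder", confidence: 0.5, tier: 3, reason: "Default routing" };
}
```

Because the first match wins, "fix the build" lands on coder rather than reviewer: the implement|create|build pattern is checked first.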

Cost Tracking

Token Tracker Pipeline

Full codeburn port — parses native Claude Code JSONL sessions, deduplicates subagent chains, calculates real API cost.

📂
JSONL Discovery

~/.claude/projects/**/*.jsonl recursive scan, including subagent dirs

🔍
Parse & Dedup

Read assistant entries, deduplicate by msg.id across subagent chains

🏷
Classify Turn

13-category classifier: tools first, then keyword patterns

💵
Calculate Cost

multiplier × (in×rate + out×rate + cache reads/writes + webSearch)

📊
Render Dashboard

6 ANSI panels: Overview, Projects, Models, Daily chart, Tools, MCP
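
The parse-and-dedup stage might look like the sketch below. It assumes each JSONL line carries a `type` field and a `message.id`; this page only says that assistant entries are deduplicated by msg.id, so treat the exact field paths as hypothetical.

```javascript
// Keep the first assistant entry per message id; skip blank or malformed lines.
// Field paths (`type`, `message.id`) are assumptions about the JSONL shape.
function dedupeAssistantEntries(lines) {
  const seen = new Set();
  const unique = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // tolerate partially written lines
    }
    if (entry.type !== "assistant") continue;
    const id = entry.message && entry.message.id;
    if (!id || seen.has(id)) continue; // already counted in another chain
    seen.add(id);
    unique.push(entry);
  }
  return unique;
}
```

Without this step, a message replayed into a subagent transcript would be billed twice in the totals.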

Cost Formula

cost = multiplier ×
  (inputTokens × in_rate
  + outputTokens × out_rate
  + cacheWrite × cw_rate
  + cacheRead × cr_rate
  + webSearch × $0.01)
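
As a worked example, the formula can be written as a small function. Rates are assumed here to be USD per million tokens, and the numbers in the usage note are placeholders chosen for easy arithmetic, not real pricing; the field names are also hypothetical.

```javascript
const WEB_SEARCH_USD = 0.01; // per web search request, from the formula above

// cost = multiplier × (input + output + cache write + cache read + web search)
// Assumption: `rates` are USD per million tokens.
function calculateCost(usage, rates, multiplier = 1) {
  const perMillion = (tokens, rate) => ((tokens || 0) * rate) / 1e6;
  const base =
    perMillion(usage.inputTokens, rates.input) +
    perMillion(usage.outputTokens, rates.output) +
    perMillion(usage.cacheWrite, rates.cacheWrite) +
    perMillion(usage.cacheRead, rates.cacheRead) +
    (usage.webSearch || 0) * WEB_SEARCH_USD;
  return multiplier * base;
}
```

With placeholder rates of 3 (input) and 15 (output), one million input tokens, 200,000 output tokens, and two web searches come to 3 + 3 + 0.02 = 6.02 before the multiplier is applied.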

Fast Mode Multipliers

Opus 4.8 (fast)
Sonnet 4.6
Haiku 4.5
GPT-4o

UTC Timezone Rule

JSONL timestamps are UTC ISO strings. Never use new Date(y,m,d) for date math — it's local time. Always use Date.UTC() or now.toISOString().slice(0,10).
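
A short illustration of the rule, using only standard Date APIs. The helper names are made up for this sketch.

```javascript
// "YYYY-MM-DD" in UTC, regardless of the machine's timezone.
function utcDayKey(date) {
  return date.toISOString().slice(0, 10);
}

// Half-open [start, end) range covering the UTC month that contains `date`.
// Date.UTC rolls month 12 over into January of the next year.
function utcMonthRange(date) {
  const y = date.getUTCFullYear();
  const m = date.getUTCMonth();
  return {
    start: new Date(Date.UTC(y, m, 1)),
    end: new Date(Date.UTC(y, m + 1, 1)),
  };
}

// True when an ISO timestamp falls inside the current UTC month.
function inUtcMonth(isoTimestamp, now) {
  const t = Date.parse(isoTimestamp);
  const { start, end } = utcMonthRange(now);
  return t >= start.getTime() && t < end.getTime();
}
```

By contrast, new Date(2025, 11, 1) means midnight local time, which on a machine east or west of UTC shifts the month boundary by hours and can misfile sessions near the start or end of a month.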

Hook System

22 Hook Events

Claude Code fires these events at precise lifecycle moments — hook-handler.cjs handles every one.

session-restore

Fires on every conversation start. Runs 8 phases: restore → intelligence → workers → knowledge → instructions → memory palace → tokens → monograph index. Most complex handler.

session-end

Fires on conversation end. Consolidates insights, archives session JSON, stores temporal KG triple for the session boundary.

route (UserPromptSubmit)

Every user message. Intelligence context retrieval → semantic routing → MicroAgent scan → prints routing panel → writes last-route.json.

pre-task

Before task execution. Increments task counter. Scores complexity 0–100 → outputs [TASK_MODEL_RECOMMENDATION] for Haiku/Sonnet/Opus selection.

post-task

After task completion. Stores task content in Memory Palace (800-char chunks). Extracts closet terms. Saves routing pattern for future intelligence.

pre-edit / post-edit

pre-edit: reads file, suggests specialist agent. post-edit: records edit to pending-insights.jsonl via intelligence.recordEdit().

pre-bash (PreToolUse)

Safety validator. Blocks: rm -rf /, format c:, dd if=/dev/zero, fork bombs. Returns {action:"block"} — Claude Code prevents command execution.
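
A rough sketch of such a validator. The regular expressions are illustrative approximations of the four listed patterns, not the ones in hook-handler.cjs, and the `reason` field and the "allow" action are assumptions; only {action:"block"} is stated above.

```javascript
// Illustrative patterns only; the real list is not shown on this page.
const DESTRUCTIVE_PATTERNS = [
  { name: "rm -rf /", re: /\brm\s+-(?:rf|fr)\s+\/(?:\s|$)/ },
  { name: "format c:", re: /\bformat\s+c:/i },
  { name: "dd zero-fill", re: /\bdd\s+.*\bif=\/dev\/zero\b/ },
  { name: "fork bomb", re: /:\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/ },
];

// Returns {action: "block"} for a destructive command, {action: "allow"} otherwise.
function validateBash(command) {
  const hit = DESTRUCTIVE_PATTERNS.find((p) => p.re.test(command));
  return hit
    ? { action: "block", reason: `matched ${hit.name}` }
    : { action: "allow" };
}
```

Note that the first pattern only matches the bare root path, so `rm -rf /tmp/build` passes while `rm -rf /` is blocked.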

load-agent

Reads .claude/agents/{slug}.md and prints full content — activates an agent's identity and capabilities for the current conversation turn.

notify

Sends system notifications for important events. Used by workers and the routing system to surface alerts without interrupting the main flow.

worker hooks

worker-dispatch, worker-status, worker-list, worker-cancel, worker-detect: manage 10 background intelligence workers (security, health, swarm, performance, patterns, learning, git, DDD, ADR, cache).

intelligence hooks

trajectory-start/step/end, pattern-store/search, attention, learn, stats: inner workings of the intelligence learning loop. Records edit patterns and trajectories.

transfer

Handles cross-session context transfer. Packages the current session's learned state for injection into a new conversation.

Performance

Speed & Scale

All numbers measured on macOS, Node.js 20, SSD-backed filesystem.

Operation | Component | Speed | Note
L0 Identity injection | memory.cjs | <1ms | Single file read
L1 Top-5 drawers | memory.cjs | 5–15ms | JSONL parse + sort
L3 HNSW vector search | memory.cjs | <1ms | 150×–12,500× vs naive
L3 BM25 full search | memory.cjs | 20–80ms | Per 1000 drawers
Jaccard context retrieval | intelligence.cjs | <5ms | Per 500 entries
4-tier routing | router.cjs | <3ms | Sync regex waterfall
Token quickSummary | token-tracker.cjs | 200–500ms | Per month of sessions
Hook timeout guard | hook-handler.cjs | 3s max | runWithTimeout cap

ADR-026 — 3-Tier Model Routing

TIER 1

Agent Booster

WASM · <1ms · $0

Simple transforms — no LLM needed. Var→const, type additions.

TIER 2

Haiku 4.5

~500ms · $0.0002/req

Simple tasks, complexity score <30 (on the 0–100 scale). Most routine work.

TIER 3

Sonnet / Opus 4.8

2–5s · $0.003–0.015

Architecture, security, complex reasoning. Score >30 (on the 0–100 scale).