v1.10.54 · Technical Architecture

How Monomind Thinks

A self-learning Claude Code orchestration layer. Every interaction intercepted, routed, remembered, and improved. Here's the full picture.

17+ Packages

22 Hook Events

230+ Agent Types

4 Memory Tiers

53+ CLI Commands

Architecture

System Overview

All components wired through hook-handler.cjs — the central orchestrator that every Claude Code event flows through.

Modules

Runtime Components

17+ TypeScript packages form the running system — each managing a distinct functional domain.

⬡

Runtime Coordinator

hook-handler.cjs

Central dispatcher handling all 22 hook events. Wires every sub-system. On session-restore alone, it runs 8 sequential phases to construct the full prompt context Claude receives.

22 hook events · runWithTimeout · safeRequire · session phases
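The two helper names in the tags above, runWithTimeout and safeRequire, suggest a guard pattern that could look roughly like the sketch below. The signatures, the fallback argument, and the 3-second default are assumptions for illustration, not the actual hook-handler.cjs code.

```javascript
// Hypothetical sketch: resolve with a fallback when a handler is too slow or throws.
function runWithTimeout(task, ms = 3000, fallback = null) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Promise.resolve().then(task) turns synchronous throws into rejections too.
  const guarded = Promise.resolve()
    .then(task)
    .catch(() => fallback); // a failing hook should not break the session
  return Promise.race([guarded, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical sketch: load an optional sub-system, or return null if unavailable.
function safeRequire(id) {
  try {
    return require(id);
  } catch {
    return null;
  }
}
```

With this shape a dispatcher can call `await runWithTimeout(() => handler(event))` for every event and treat a null result as "skip this phase".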

🏛

Memory Palace

memory.cjs

Four-tier memory hierarchy (L0–L3) combining BM25 full-text with HNSW vector indexing. Persistent context across sessions via AgentDB backends. Cross-agent namespace sharing through PartitionedHNSW.

L0–L3 tiers · BM25 K1=1.5 · HNSW vectors · AgentDB · PartitionedHNSW

🧠

Neural Learning

sona.cjs

SONA adaptation engine with LoRA fine-tuning (rank 1–16) and EWC++ memory preservation preventing catastrophic forgetting. Six RL algorithms: PPO, DQN, A2C, DecisionTransformer, Q-Learning, SARSA.

LoRA rank 1–16 · EWC++ · PPO/DQN/A2C · SARSA · <0.05ms

🔀

Agent Router

router.cjs

Multi-tier waterfall routing from natural language to optimal agent across 230+ types. Non-dev detection → regex patterns → semantic RouteLayer → keyword fallback. Writes last-route.json for statusline.

4-tier waterfall · 230+ agents · 0.85 confidence · RouteLayer

🗺

Knowledge Graph

monograph

SQLite-backed code dependency graph with 30 MCP tools for impact analysis, path finding, and community detection. Queried automatically before every task — no manual invocation needed.

SQLite · 30 MCP tools · Louvain communities · god nodes · BFD chunking

💰

Cost Tracking

token-tracker.cjs

Full codeburn pipeline port. Parses JSONL sessions, deduplicates subagent chains, calculates API costs with per-model pricing, renders ANSI dashboard with 6 panels. Auto-injected at session-restore.

JSONL parser · UTC-safe · 13-category · ANSI dashboard

🔒

Safety Layer

pre-bash validation

Built into hook-handler. Blocks destructive shell patterns before Claude can execute them: rm -rf /, fork bombs, dd zero-fill. Returns {action:"block"} to Claude Code which prevents execution.

rm -rf / · fork bombs · dd /dev/zero · PreToolUse
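A minimal sketch of how such a pre-bash check might be written. The regular expressions, the validateBash name, and the {action:"allow"} pass-through value are assumptions; only the {action:"block"} result and the three blocked patterns come from the description above.

```javascript
// Illustrative patterns only; the real list in hook-handler.cjs may differ.
const DESTRUCTIVE_PATTERNS = [
  // rm with both -r and -f flags, targeting the filesystem root itself
  { name: 'rm -rf /', re: /\brm\s+-(?=[a-z]*r)(?=[a-z]*f)[a-z]+\s+\/(?:\s|$)/i },
  // the classic shell fork bomb  :(){ :|:& };:
  { name: 'fork bomb', re: /:\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/ },
  // dd reading from /dev/zero (zero-fill)
  { name: 'dd zero-fill', re: /\bdd\s+.*\bif=\/dev\/zero\b/ },
];

function validateBash(command) {
  const hit = DESTRUCTIVE_PATTERNS.find(({ re }) => re.test(command));
  return hit
    ? { action: 'block', reason: `matched destructive pattern: ${hit.name}` }
    : { action: 'allow' }; // "allow" is an assumed name for the pass-through result
}
```

Note that this sketch deliberately lets `rm -rf ./build` through: only a bare `/` target is treated as destructive.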

💾

Session State

session.cjs

Manages .monomind/sessions/current.json lifecycle. restore(), end(), and metric() calls track task counts and session duration. Archives on end to session-{id}.json.

current.json · archive · duration · task counter

Prompt Engineering

How a Prompt Gets Built

From raw user input to fully-engineered context — every step that fires before Claude sees your message.

Session Restore — Identity Injection

On every new conversation, session.restore() loads current.json. Memory Palace injects [MEMORY_PALACE_L0] — the static identity.md containing project name, stack, key packages, git remote, and working style. This is always the first thing Claude reads.

Essential Story — Top-5 Scored Drawers

From drawers.jsonl, the top-5 entries by score (within last 30 days) are injected as [MEMORY_PALACE_L1]. Score rises on retrieval frequency — frequently-useful memories bubble up automatically. No manual curation needed.

score bumped on every recall → high-frequency memories promoted to L1
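The selection rule can be sketched in a few lines. The drawer shape ({ text, score, ts }) and the function names here are assumptions made for illustration; the 30-day window and the top-5 cut come from the description above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse a JSONL string into objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Assumed drawer shape: { text, score, ts } with ts as an ISO-8601 string.
function topDrawers(drawers, now = Date.now(), limit = 5, windowDays = 30) {
  const cutoff = now - windowDays * DAY_MS;
  return drawers
    .filter((d) => Date.parse(d.ts) >= cutoff) // keep only recent entries
    .sort((a, b) => b.score - a.score)         // highest score first
    .slice(0, limit);
}

// A recall bumps the score, so frequently used drawers rise over time.
function recordRecall(drawer, bump = 1) {
  return { ...drawer, score: drawer.score + bump };
}
```

An old drawer with a very high score is still excluded once it falls outside the window, which keeps L1 biased toward recent work.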

Knowledge Base Preload + Monograph

CLAUDE.md + docs/*.md are scanned, chunked, and keyword-indexed. Monograph automatically queries the code dependency graph — relevant files, god nodes, and impact paths are surfaced before Claude starts work.

Token Cost Awareness

token-tracker.quickSummary() parses the current month's JSONL session files, aggregates today's and the month's spend, and injects the result as [TOKEN_USAGE]. Claude sees actual API spend before starting work. All timestamps are handled in UTC to avoid timezone drift.

[TOKEN_USAGE] Today: $17.69 (414 calls) | Month: $2249.47 (39330 calls)

User Prompt Submitted → Route Hook Fires

Every UserPromptSubmit triggers the route hook. Intelligence scans auto-memory-store.json for Jaccard-scored relevant past patterns → injected as [INTELLIGENCE]. Router runs its 4-tier waterfall producing the routing panel Claude and the user see in terminal.
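Jaccard scoring compares two token sets as |A ∩ B| / |A ∪ B|. The sketch below shows one way the retrieval step could work; the tokenizer, the pattern shape ({ text }), and the limit and minScore defaults are assumptions, not values taken from intelligence.cjs.

```javascript
// Lowercased alphanumeric tokens, de-duplicated.
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: shared tokens divided by the size of the union.
function jaccard(a, b) {
  const A = tokenize(a);
  const B = tokenize(b);
  if (A.size === 0 && B.size === 0) return 0;
  let shared = 0;
  for (const t of A) if (B.has(t)) shared += 1;
  return shared / (A.size + B.size - shared);
}

// Rank stored patterns against the prompt and keep the best few.
function relevantPatterns(prompt, patterns, limit = 3, minScore = 0.1) {
  return patterns
    .map((pattern) => ({ pattern, score: jaccard(prompt, pattern.text) }))
    .filter((entry) => entry.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Because the measure ignores word order and frequency, it is cheap enough to run on every prompt, at the cost of missing synonyms.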

Task Complexity Scoring → Model Recommendation

Pre-task hook scores the task description 0–100 based on word count + keywords. Score <30 → Haiku recommended. Score >70 → Opus recommended. Injected as [TASK_MODEL_RECOMMENDATION]. This informs agent spawning decisions.

[TASK_MODEL_RECOMMENDATION] Use model="haiku" (score: 18/100)
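As an illustration of word-count-plus-keyword scoring, a sketch might look like this. The keyword list and the point weights are invented for the example, and mapping the middle band to Sonnet is an assumption; the text only fixes the 0–100 range and the two thresholds.

```javascript
// Hypothetical keyword list and weights, chosen only for illustration.
const COMPLEX_KEYWORDS = ['architecture', 'security', 'refactor', 'migrate', 'concurrency'];

function scoreTask(description) {
  const words = description.toLowerCase().match(/[a-z0-9-]+/g) ?? [];
  const lengthPoints = Math.min(40, words.length); // up to 40 points for length
  const hits = COMPLEX_KEYWORDS.filter((k) => words.includes(k)).length;
  const keywordPoints = Math.min(60, hits * 20);   // up to 60 points for keywords
  return lengthPoints + keywordPoints;             // always within 0..100
}

function recommendModel(score) {
  if (score < 30) return 'haiku';
  if (score > 70) return 'opus';
  return 'sonnet'; // assumed middle band; the text names only the two thresholds
}

function recommendationLine(description) {
  const score = scoreTask(description);
  return `[TASK_MODEL_RECOMMENDATION] Use model="${recommendModel(score)}" (score: ${score}/100)`;
}
```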

Response + Post-Task Memory Storage

After each task completes, post-task fires. Task content is chunked into 800-char segments (100-char overlap) and stored in drawers.jsonl. Closet terms extracted via regex. KG triple added. Future sessions can recall what was done here.

memory.storeVerbatim(cwd, taskContent, {wing:'tasks', room:'active'})
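The 800-character chunking with a 100-character overlap can be sketched as a sliding window. The chunkText name and its parameters are hypothetical; how storeVerbatim splits text internally is not shown in the source.

```javascript
// Split text into fixed-size windows where each window repeats the
// last `overlap` characters of the previous one.
function chunkText(text, size = 800, overlap = 100) {
  if (overlap >= size) throw new RangeError('overlap must be smaller than size');
  const chunks = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk already reached the end
  }
  return chunks;
}
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, which helps keyword search at the cost of some duplicated storage.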

Session End → Consolidation + Archive

session-end triggers intelligence.consolidate() (clears pending-insights.jsonl), session.end() archives current.json, and Memory Palace stores a session-end temporal triple in kg.json with a valid_from timestamp.

Memory

Memory Palace — 4-Tier Hierarchy

BM25 precision + HNSW vector recall. Entirely local and deterministic. Inspired by MemPalace arXiv architecture.

Identity — Always Loaded

Static .monomind/palace/identity.md — project name, stack, key packages, working style, git remote. Injected verbatim as [MEMORY_PALACE_L0] on every session. Never auto-overwritten.

Always injected · ~500 chars

Essential Story — Top-5 Scored Drawers

Reads drawers.jsonl, scores entries within last 30 days, picks top-5 by retrieval score. Auto-injected as [MEMORY_PALACE_L1]. Scores rise with every recall — high-frequency memories stay visible.

On session-restore · top-5 by score

On-Demand Namespace Recall

recall(wing, room, limit) retrieves top-scored drawers from a specific namespace. Called explicitly by hook code or agents needing focused context. Wings: tasks, sessions, architecture, debugging, general.

On-demand · namespace-scoped

Deep BM25 + HNSW Full Search

search(query, wing?, room?, limit?) runs Okapi BM25 (K1=1.5, B=0.75) across all drawers and adds HNSW vector search for semantic recall. The HNSW lookup is quoted at 150×–12,500× faster than a naive scan. Supports temporal KG queries via kgQuery().

Explicit call only · full corpus · BM25 + HNSW

BM25 Formula (K1=1.5, B=0.75) + HNSW Vector Index

score(d,q) = Σ_t∈q IDF(t) × (f(t,d) × 2.5) / (f(t,d) + 1.5 × (0.25 + 0.75 × |d|/avgdl))
IDF(t) = log((N − df(t) + 0.5) / (df(t) + 0.5) + 1)
final_score = bm25_score + hnsw_cosine_similarity + Σ closet_boost(term) × 0.5
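To make the formula concrete, here is a small sketch that evaluates it over pre-tokenized documents. The function names and the array-of-tokens input are assumptions; the constants and both equations follow the lines above.

```javascript
const K1 = 1.5;
const B = 0.75;

// docs: array of token arrays; query: array of terms. Returns one score per doc.
function bm25Scores(docs, query) {
  const N = docs.length;
  const avgdl = docs.reduce((sum, d) => sum + d.length, 0) / (N || 1);
  const df = (term) => docs.filter((d) => d.includes(term)).length;
  return docs.map((doc) =>
    query.reduce((score, term) => {
      const f = doc.filter((t) => t === term).length; // term frequency f(t,d)
      if (f === 0) return score;
      const n = df(term);
      const idf = Math.log((N - n + 0.5) / (n + 0.5) + 1);
      const norm = f + K1 * (1 - B + B * (doc.length / avgdl));
      return score + idf * ((f * (K1 + 1)) / norm);
    }, 0)
  );
}

// final_score = bm25 + cosine + 0.5 per unit of closet boost.
function finalScore(bm25, cosineSimilarity, closetBoosts = []) {
  return bm25 + cosineSimilarity + closetBoosts.reduce((s, b) => s + b * 0.5, 0);
}
```

For two equal-length documents where only the first contains the query term, the first scores ln 2 ≈ 0.693 and the second scores 0.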

Routing

4-Tier Agent Router

Every user prompt traverses this waterfall from top to bottom — first match wins across 230+ agent types.

TIER 0

Non-Dev Specialist Detection

If prompt matches marketing · sales · product · UI design · blockchain · legal → routes to "extras" with confidence 0.85. Returns top-8 specialist agents immediately. Skips all remaining tiers.

0.85 conf

↓

TIER 1

Task Pattern Regex Matching

0.75–0.90 conf

↓

TIER 2

Semantic RouteLayer

TF-IDF weighted keyword matching against agent capability profiles. Computes cosine similarity between prompt token vector and each agent's skill vector. Returns ranked list with similarity scores. Handles nuanced multi-concept prompts.

cosine sim

↓

TIER 3

Keyword Fallback

Simple token overlap between prompt and agent descriptions. Last resort — always produces a result. Default confidence 0.5 with "Default routing" reason. Prevents routing failure for any input.

0.50 conf
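The first-match-wins waterfall above can be sketched as an ordered list of tier functions. Everything concrete here is hypothetical: the agent names, the regexes, the 0.3 similarity cutoff, and the default agent. Tier 2 uses plain term-count cosine similarity rather than TF-IDF weighting, and Tier 3 returns a fixed default instead of computing token overlap, to keep the example short.

```javascript
const tokens = (text) => text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Cosine similarity between two term-count vectors.
function cosine(aTokens, bTokens) {
  const count = (list) => list.reduce((m, t) => m.set(t, (m.get(t) ?? 0) + 1), new Map());
  const a = count(aTokens);
  const b = count(bTokens);
  let dot = 0;
  for (const [t, w] of a) dot += w * (b.get(t) ?? 0);
  const norm = (m) => Math.sqrt([...m.values()].reduce((s, w) => s + w * w, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical agent profiles and patterns, for illustration only.
const AGENTS = [
  { name: 'database-specialist', skills: 'sql database query index migration' },
  { name: 'frontend-developer', skills: 'react css component layout' },
];
const NON_DEV = /\b(marketing|sales|product|legal|blockchain)\b/i;
const TASK_PATTERNS = [
  { re: /\b(fix|bug|debug)\b/i, agent: 'debugger', confidence: 0.8 },
  { re: /\b(test|coverage)\b/i, agent: 'tester', confidence: 0.75 },
];

const TIERS = [
  // Tier 0: non-dev specialist detection
  (p) => (NON_DEV.test(p) ? { agent: 'extras', confidence: 0.85, tier: 0 } : null),
  // Tier 1: task pattern regex matching
  (p) => {
    const hit = TASK_PATTERNS.find(({ re }) => re.test(p));
    return hit ? { agent: hit.agent, confidence: hit.confidence, tier: 1 } : null;
  },
  // Tier 2: semantic match against agent skill profiles
  (p) => {
    const best = AGENTS
      .map((a) => ({ agent: a.name, confidence: cosine(tokens(p), tokens(a.skills)) }))
      .sort((x, y) => y.confidence - x.confidence)[0];
    return best.confidence >= 0.3 ? { ...best, tier: 2 } : null;
  },
  // Tier 3: fallback that always produces a result
  () => ({ agent: 'coder', confidence: 0.5, tier: 3, reason: 'Default routing' }),
];

function route(prompt) {
  for (const tier of TIERS) {
    const result = tier(prompt);
    if (result) return result; // first match wins, later tiers are skipped
  }
}
```

Ordering the cheap regex tiers before the similarity computation is what keeps the common case fast.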

Cost Tracking

Token Tracker Pipeline

Full codeburn port — parses native Claude Code JSONL sessions, deduplicates subagent chains, calculates real API cost.

📂

JSONL Discovery

~/.claude/projects/**/*.jsonl recursive scan, including subagent dirs

→

🔍

Parse & Dedup

Read assistant entries, deduplicate by msg.id across subagent chains

→

🏷

Classify Turn

13-category classifier: tools first, then keyword patterns

→

💵

Calculate Cost

multiplier × (in×rate + out×rate + cache reads/writes + webSearch)

→

📊

Render Dashboard

6 ANSI panels: Overview, Projects, Models, Daily chart, Tools, MCP
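The Parse & Dedup step in the pipeline above might be sketched as follows. The entry shape ({ type: 'assistant', message: { id, ... } }) and both function names are assumptions about the session format, used here only to illustrate deduplication by message id.

```javascript
// Assumed entry shape: { type: 'assistant', message: { id, usage } }.
function parseAssistantEntries(jsonlText) {
  const entries = [];
  for (const line of jsonlText.split('\n')) {
    if (!line.trim()) continue;
    try {
      const entry = JSON.parse(line);
      if (entry.type === 'assistant' && entry.message?.id) entries.push(entry);
    } catch {
      // skip malformed or partially written lines
    }
  }
  return entries;
}

// Keep the first entry seen for each message id, so a turn that is
// logged in both a parent session and a subagent file is counted once.
function dedupeById(entries) {
  const seen = new Map();
  for (const entry of entries) {
    if (!seen.has(entry.message.id)) seen.set(entry.message.id, entry);
  }
  return [...seen.values()];
}
```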

Cost Formula

cost = multiplier ×
  (inputTokens × in_rate
  + outputTokens × out_rate
  + cacheWrite × cw_rate
  + cacheRead × cr_rate
  + webSearch × $0.01)
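Written as a function, the formula could look like the sketch below. The field names and the per-token rate object are assumptions, and the rates used in the usage example are placeholders rather than real model prices.

```javascript
const WEB_SEARCH_USD = 0.01; // per web search request, from the formula above

// usage: token and request counts; rates: USD per token for each token class.
function computeCost(usage, rates, multiplier = 1) {
  const {
    inputTokens = 0,
    outputTokens = 0,
    cacheWrite = 0,
    cacheRead = 0,
    webSearch = 0,
  } = usage;
  return multiplier * (
    inputTokens * rates.inRate +
    outputTokens * rates.outRate +
    cacheWrite * rates.cwRate +
    cacheRead * rates.crRate +
    webSearch * WEB_SEARCH_USD
  );
}

// Placeholder rates, not real prices.
const exampleRates = { inRate: 0.001, outRate: 0.002, cwRate: 0, crRate: 0 };
const exampleCost = computeCost(
  { inputTokens: 1000, outputTokens: 500, webSearch: 2 },
  exampleRates,
  6 // fast-mode multiplier
);
```

The multiplier scales the whole sum, so web search charges are multiplied in fast mode as well under this reading of the formula.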

Fast Mode Multipliers

Opus 4.8 (fast): 6×

Sonnet 4.6: 1×

Haiku 4.5: 1×

GPT-4o: 1×

UTC Timezone Rule

JSONL timestamps are UTC ISO strings. Never use new Date(y,m,d) for date math — it's local time. Always use Date.UTC() or now.toISOString().slice(0,10).
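A short sketch of the rule in practice. The helper names are hypothetical; Date.UTC(), getUTCFullYear(), getUTCMonth(), and toISOString() are standard JavaScript.

```javascript
// "YYYY-MM-DD" for the UTC calendar day of a Date.
function utcDayKey(date) {
  return date.toISOString().slice(0, 10);
}

// First instant of the UTC month containing `date`.
function utcMonthStart(date) {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), 1));
}

// Compare an ISO timestamp string against "now" by UTC year and month.
function isSameUtcMonth(isoTimestamp, now) {
  return isoTimestamp.slice(0, 7) === now.toISOString().slice(0, 7);
}
```

By contrast, `new Date(2024, 2, 1)` means midnight on 1 March in the machine's local zone, which in zones ahead of UTC is still 29 February in UTC, so a session could be filed under the wrong day or month.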

Hook System

22 Hook Events

Claude Code fires these events at precise lifecycle moments — hook-handler.cjs handles every one.

session-restore

Fires on every conversation start. Runs 8 phases: restore → intelligence → workers → knowledge → instructions → memory palace → tokens → monograph index. Most complex handler.

session-end

Fires on conversation end. Consolidates insights, archives session JSON, stores temporal KG triple for the session boundary.

route (UserPromptSubmit)

Every user message. Intelligence context retrieval → semantic routing → MicroAgent scan → prints routing panel → writes last-route.json.

pre-task

Before task execution. Increments task counter. Scores complexity 0–100 → outputs [TASK_MODEL_RECOMMENDATION] for Haiku/Sonnet/Opus selection.

post-task

After task completion. Stores task content in Memory Palace (800-char chunks). Extracts closet terms. Saves routing pattern for future intelligence.

pre-edit / post-edit

pre-edit: reads file, suggests specialist agent. post-edit: records edit to pending-insights.jsonl via intelligence.recordEdit().

pre-bash (PreToolUse)

Safety validator. Blocks: rm -rf /, format c:, dd if=/dev/zero, fork bombs. Returns {action:"block"} — Claude Code prevents command execution.

load-agent

Reads .claude/agents/{slug}.md and prints full content — activates an agent's identity and capabilities for the current conversation turn.

notify

Sends system notifications for important events. Used by workers and the routing system to surface alerts without interrupting the main flow.

worker hooks

worker-dispatch, worker-status, worker-list, worker-cancel, worker-detect: manage 10 background intelligence workers (security, health, swarm, performance, patterns, learning, git, DDD, ADR, cache).

intelligence hooks

trajectory-start/step/end, pattern-store/search, attention, learn, stats: inner workings of the intelligence learning loop. Records edit patterns and trajectories.

transfer

Handles cross-session context transfer. Packages the current session's learned state for injection into a new conversation.

Performance

Speed & Scale

All numbers measured on macOS, Node.js 20, SSD-backed filesystem.

Operation	Component	Speed	Note
L0 Identity injection	memory.cjs	<1ms	Single file read
L1 Top-5 drawers	memory.cjs	5–15ms	JSONL parse + sort
L3 HNSW vector search	memory.cjs	<1ms	150×–12,500× vs naive
L3 BM25 full search	memory.cjs	20–80ms	Per 1000 drawers
Jaccard context retrieval	intelligence.cjs	<5ms	Per 500 entries
4-tier routing	router.cjs	<3ms	Sync regex waterfall
Token quickSummary	token-tracker.cjs	200–500ms	Per month of sessions
Hook timeout guard	hook-handler.cjs	3s max	runWithTimeout cap

ADR-026 — 3-Tier Model Routing

TIER 1

Agent Booster

WASM · <1ms · $0

Simple transforms — no LLM needed. Var→const, type additions.

TIER 2

Haiku 4.5

~500ms · $0.0002/req

Simple tasks with a complexity score <30 (out of 100). Covers most routine work.

TIER 3

Sonnet / Opus 4.8

2–5s · $0.003–0.015

Architecture, security, and complex reasoning. Complexity score >30 (out of 100).
