Go 1.25 · Rod CDP · DAG Engine

How Mono Agent Orchestrates Workflows

52K+ lines of Go. A DAG workflow engine with 73 node types, Rod browser automation across 6 social platforms, multi-provider AI, and a Wails desktop UI.

73 Node Types

52K+ Lines of Go

6 Social Platforms

40+ DB Tables

20 Max Workers

200+ AI Models

Architecture

System Overview

WorkflowEngine is the central coordinator. Every trigger flows through the DAG executor, dispatching to the node registry. Browser, AI, and service nodes all share a single SQLite store.

Internal Packages

8 Core Modules

52,000+ lines of Go across 251 files. Clean separation between engine, execution, browser, and storage layers.

⚡

Workflow Engine

internal/workflow/

DAG-based execution orchestrator. Kahn's algorithm for topological sort and cycle detection. BFS stack execution — branches run sequentially in a single goroutine, not parallel goroutines. Manages execution queue (default depth 1,000), worker pool (default 3, max 20), and trigger registry (manual, cron, webhook).

5,269 LOC · DAG · Kahn's algo · BFS stack · worker pool · webhook
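The queue and worker pool described above can be sketched as a buffered channel drained by a fixed set of goroutines. This is a minimal illustration, not the engine's code: the limits (depth 1,000, default 3, max 20) come from the text, while `clampWorkers`, `runPool`, and the workflow IDs are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Limits taken from the text: queue depth 1,000, 3 workers by default, 20 at most.
const (
	queueDepth     = 1000
	defaultWorkers = 3
	maxWorkers     = 20
)

// clampWorkers applies the default and the upper bound to a requested pool size.
func clampWorkers(n int) int {
	if n <= 0 {
		return defaultWorkers
	}
	if n > maxWorkers {
		return maxWorkers
	}
	return n
}

// runPool drains the queued workflow IDs with a fixed number of workers and
// returns the IDs that were processed (completion order is not guaranteed).
func runPool(ids []string, workers int, run func(id string)) []string {
	queue := make(chan string, queueDepth)
	var (
		mu   sync.Mutex
		done []string
		wg   sync.WaitGroup
	)
	for i := 0; i < clampWorkers(workers); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range queue {
				run(id) // one workflow runs start to finish on this goroutine
				mu.Lock()
				done = append(done, id)
				mu.Unlock()
			}
		}()
	}
	for _, id := range ids {
		queue <- id
	}
	close(queue)
	wg.Wait()
	return done
}

func main() {
	got := runPool([]string{"wf-1", "wf-2", "wf-3", "wf-4"}, 0, func(string) {})
	fmt.Println(len(got), "workflows processed")
}
```

Concurrency here is across workflows only; each workflow still runs on a single worker goroutine, which matches the sequential-branch design described above.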

🎬

Action Executor

internal/action/

Interprets 29 embedded JSON action definitions across 4 platforms (Instagram 8, LinkedIn 7, X 7, TikTok 7). Dispatches 17 step types: navigate, click, type, extract_text, extract_multiple, condition, loop, call_bot_method, save_data, mark_failed, and more. Supports nested loops with recursion guards.

5,236 LOC · 29 JSON defs · 17 step types · go:embed · {{template}}

🧩

Node Registry

internal/nodes/

73 files implementing 50+ node types. NodeTypeRegistry maps type string → factory function, enabling runtime plugin registration. Each node loads its JSON Schema from workflow/schemas/. Control nodes: IF, Switch, Merge, Wait, Loop. Service nodes: HTTP, DB, Email, Slack, GitHub, Linear, Stripe, and 20+ more.

~3,200 LOC · 73 files · 78 schemas · plugin registry · JSON Schema
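A type-string-to-factory registry of the kind described above might look like the following. This is a sketch under assumptions: the `Node` interface, `Item`, `Factory`, and the `noOp` node are hypothetical shapes chosen for illustration, not the project's actual definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// Item is one unit of data flowing between nodes (hypothetical shape).
type Item map[string]any

// Node is an assumed minimal interface that every node type implements.
type Node interface {
	Execute(in []Item) ([]Item, error)
}

// Factory builds a fresh node instance for one execution.
type Factory func() Node

// Registry maps a node type string to its factory.
type Registry struct {
	mu        sync.RWMutex
	factories map[string]Factory
}

func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]Factory)}
}

// Register adds a node type at runtime and rejects duplicate type names.
func (r *Registry) Register(typ string, f Factory) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.factories[typ]; exists {
		return fmt.Errorf("node type %q already registered", typ)
	}
	r.factories[typ] = f
	return nil
}

// Get instantiates a node of the given type.
func (r *Registry) Get(typ string) (Node, error) {
	r.mu.RLock()
	f, ok := r.factories[typ]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown node type %q", typ)
	}
	return f(), nil
}

// noOp passes its input through unchanged.
type noOp struct{}

func (noOp) Execute(in []Item) ([]Item, error) { return in, nil }

func main() {
	reg := NewRegistry()
	if err := reg.Register("noop", func() Node { return noOp{} }); err != nil {
		panic(err)
	}
	n, err := reg.Get("noop")
	if err != nil {
		panic(err)
	}
	out, _ := n.Execute([]Item{{"username": "alice"}})
	fmt.Println(len(out), "item(s) passed through")
}
```

Because registration is just a map write behind a lock, new node types can be added while the process is running, which is what makes plugin-style extension possible.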

🌐

Browser Layer

internal/browser/

Rod (Chrome DevTools Protocol via WebSocket). Stealth evasion via go-rod/stealth plugin. Humanized input: per-keystroke delays 20–150ms, 5% typo rate with auto-correct. 12+ Chromium flags for anti-detection. Page pool — one page per platform+session, reused to preserve login state.

~1,500 LOC · Rod/CDP · stealth · humanized · page pool

🤖

Platform Bots

internal/bot/

Platform-specific DOM navigators and data extractors for Instagram, LinkedIn, X/Twitter, TikTok, Telegram, and Email. Tier-1 in the 3-tier fallback: Go code (reliable, version-locked). Handles K/M number conversion (12.5K→12500), deduplication, and batch SQLite saves.

~2,800 LOC · 6 platforms · K/M parse · dedup · batch saves
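The K/M conversion mentioned above (12.5K→12500) is simple to get subtly wrong because of floating-point rounding and thousands separators. A minimal sketch, assuming a hypothetical helper named `parseCount` that accepts only the K and M suffixes named in the text:

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseCount converts display counts such as "12.5K", "1.2M" or "1,234"
// into integers. Only the K and M suffixes are handled.
func parseCount(s string) (int64, error) {
	s = strings.ToUpper(strings.TrimSpace(strings.ReplaceAll(s, ",", "")))
	if s == "" {
		return 0, errors.New("empty count")
	}
	mult := 1.0
	switch s[len(s)-1] {
	case 'K':
		mult = 1e3
	case 'M':
		mult = 1e6
	}
	if mult != 1 {
		s = s[:len(s)-1] // drop the suffix before parsing the number
	}
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("parse count %q: %w", s, err)
	}
	// Round rather than truncate so values like 1.2M do not lose a unit.
	return int64(math.Round(f * mult)), nil
}

func main() {
	for _, in := range []string{"12.5K", "1.2M", "1,234"} {
		n, err := parseCount(in)
		fmt.Println(in, n, err)
	}
}
```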

🗄

Storage

internal/storage/

SQLite via modernc.org/sqlite (pure Go, no CGO). 40+ tables covering workflows, executions, actions, people, lists, templates, threads, credentials, and platform sessions. Execution history pruning via cron (default: keep last 500 per workflow). JSON export for large-scale people data.

~2,200 LOC · 40+ tables · pure Go · no CGO · JSON export

🔑

Auth & Connections

internal/connections/

OAuth2 manager for 30+ cloud services (Google, GitHub, Linear, Stripe, Salesforce, HubSpot, etc.). Platform session cookies persisted in SQLite, auto-cleanup on expiry. Manual browser login flow captures cookies. Credential vault for API keys and tokens.

~1,600 LOC · OAuth2 · 30+ services · session cookies · vault

🧠

AI Integration

internal/ai/

Multi-provider LLM support: Anthropic (Claude Opus/Sonnet/Haiku), OpenAI (GPT-4/3.5), Google Gemini, AWS Bedrock. Prompt caching for Anthropic. Per-workflow chat history for multi-turn conversations. Token usage and cost tracking per provider per execution.

~1,800 LOC · Anthropic · OpenAI · Gemini · Bedrock · prompt cache

Execution

Workflow Execution Flow

From trigger to result — 8 steps through the DAG executor, expression engine, and node registry.

Trigger Event

User fires monotask run <workflow-id>, clicks Run in Wails UI, a cron schedule fires (robfig/cron), or a webhook HTTP POST hits :9321/webhook/{id}. WorkflowEngine receives the trigger event and creates a WorkflowExecution record with QUEUED status.

DAG Construction

An ExecutionQueue worker goroutine pops the request. WorkflowStore.ListNodes() and ListConnections() load the workflow graph. DAG.BuildDAG() constructs adjacency lists. Kahn's algorithm validates topological order and detects cycles; when a cycle is found at save time, the IDs of the nodes involved are returned.

DAG.BuildDAG() → Kahn's topological sort → execution stack init
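Kahn's algorithm, as used in the step above, repeatedly removes nodes with no remaining incoming edges; whatever is left at the end sits on or downstream of a cycle. The sketch below assumes a plain string-keyed adjacency list; `topoSort` and the node names are hypothetical, not the engine's API.

```go
package main

import "fmt"

// topoSort runs Kahn's algorithm over an adjacency list. It returns the nodes
// in topological order, plus any nodes that could not be ordered because they
// sit on (or downstream of) a cycle.
func topoSort(nodes []string, edges map[string][]string) (order, stuck []string) {
	indeg := make(map[string]int, len(nodes))
	for _, n := range nodes {
		indeg[n] = 0
	}
	for _, targets := range edges {
		for _, t := range targets {
			indeg[t]++
		}
	}
	var ready []string
	for _, n := range nodes { // iterate the slice to keep output deterministic
		if indeg[n] == 0 {
			ready = append(ready, n)
		}
	}
	for len(ready) > 0 {
		n := ready[0]
		ready = ready[1:]
		order = append(order, n)
		for _, t := range edges[n] {
			indeg[t]--
			if indeg[t] == 0 {
				ready = append(ready, t)
			}
		}
	}
	for _, n := range nodes {
		if indeg[n] > 0 {
			stuck = append(stuck, n)
		}
	}
	return order, stuck
}

func main() {
	nodes := []string{"trigger", "search", "filter", "save"}
	edges := map[string][]string{
		"trigger": {"search"},
		"search":  {"filter"},
		"filter":  {"save"},
	}
	order, stuck := topoSort(nodes, edges)
	fmt.Println(order, stuck)
}
```

A non-empty `stuck` slice is the signal to reject the workflow at save time and report the offending node IDs.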

BFS Execution Loop

The main loop pops (node, inputItems) from the BFS stack. Disabled nodes are skipped. ExpressionEngine.ResolveConfig() evaluates {{$json.*}}, {{$node["Name"].*}}, {{env "..."}} in all string config fields. The node is then dispatched to registry.Get(nodeType).Execute().

// Expression examples:
{{$json.username}}          → current item field
{{$node["Search"].json[0]}} → named node output
{{env "OPENAI_KEY"}}        → environment variable

Node Execution

Each node type handles its logic. Browser action nodes wrap ActionExecutor (loads JSON defs, runs 17 step types). Control nodes implement IF/Switch/Merge/Loop/Wait. Transform nodes (Set, Code) use expression engine. Service nodes (HTTP, Slack, GitHub) make external API calls. AI nodes invoke LLM providers.

3-Tier DOM Fallback

For browser action nodes needing DOM elements: Tier 1 = call_bot_method (Go code, reliable). Tier 2 = XPath alternatives (human-written, brittle but fast). Tier 3 = AI-generated CSS selector via config manager (adaptive but slow). Each tier can be skipped with onError: skip.

call_bot_method → XPath → AI selector (config manager)

Output Routing

Node emits outputs on named handles: main, error, true/false (IF), loop_item/done (Loop), or custom handles. Each connected downstream node is pushed onto the BFS stack. MERGE nodes accumulate inputs from multiple branches via sync.Mutex-guarded state map, releasing when all expected inputs arrive.

outputs["main"] → push connected nodes onto BFS stack
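The loop and routing described above can be sketched as a work list processed on one goroutine. This is an illustration under assumptions: `Item`, `Outputs`, `Edge`, `NodeFunc`, and `run` are hypothetical names, the work list is consumed first-in-first-out, and the graph is assumed to have already passed cycle detection.

```go
package main

import "fmt"

type Item map[string]any

// Outputs maps an output handle name ("main", "true", "error", ...) to items.
type Outputs map[string][]Item

type Node interface {
	Execute(in []Item) (Outputs, error)
}

// NodeFunc adapts a plain function to the Node interface.
type NodeFunc func(in []Item) (Outputs, error)

func (f NodeFunc) Execute(in []Item) (Outputs, error) { return f(in) }

// Edge connects one output handle of a node to a downstream node.
type Edge struct{ From, Handle, To string }

type work struct {
	node  string
	items []Item
}

// run executes nodes from a start node on a single goroutine, scheduling only
// the downstream nodes whose handle actually received output.
func run(start string, nodes map[string]Node, edges []Edge) ([]string, error) {
	pending := []work{{node: start}}
	var visited []string
	for len(pending) > 0 {
		w := pending[0]
		pending = pending[1:]
		n, ok := nodes[w.node]
		if !ok {
			return visited, fmt.Errorf("unknown node %q", w.node)
		}
		out, err := n.Execute(w.items)
		if err != nil {
			return visited, fmt.Errorf("node %s: %w", w.node, err)
		}
		visited = append(visited, w.node)
		for _, e := range edges {
			if e.From != w.node {
				continue
			}
			if items, emitted := out[e.Handle]; emitted {
				pending = append(pending, work{node: e.To, items: items})
			}
		}
	}
	return visited, nil
}

func main() {
	pass := NodeFunc(func(in []Item) (Outputs, error) { return Outputs{"main": in}, nil })
	onlyTrue := NodeFunc(func(in []Item) (Outputs, error) { return Outputs{"true": in}, nil })
	nodes := map[string]Node{"start": pass, "if": onlyTrue, "yes": pass, "no": pass}
	edges := []Edge{{"start", "main", "if"}, {"if", "true", "yes"}, {"if", "false", "no"}}
	fmt.Println(run("start", nodes, edges))
}
```

Because the IF node emits only on its true handle, the node wired to the false handle is never scheduled; routing by handle name is what turns a static graph into conditional control flow.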

Error Handling

Per-node on_error policy: stop (mark execution FAILED, return immediately), continue (treat error as success, pass input through), error_branch (emit on error handle with structured NodeError). Execution history saved to workflow_execution_nodes table for debugging.

Execution Complete

Stack empties → COMPLETED. Stop-error node triggers → FAILED. WorkflowExecution record updated with final status, duration, error message. Result returned to caller (CLI prints summary, Wails UI refreshes, webhook HTTP response returned). History pruned to last 500 per workflow via background cron.

Node Registry

73 Node Types

Every node registers with NodeTypeRegistry → factory function mapping. Each loads its JSON Schema from workflow/schemas/. Plugins can register new types at runtime without recompiling.

Control & Transform

IF · Switch · Merge · Wait · Loop · Set · Code · NoOp

Browser & Bots

Instagram · LinkedIn · X/Twitter · TikTok · Telegram · Email · Chrome

Service Integrations

HTTP Request · Slack · GitHub · Linear · Stripe · Salesforce · HubSpot

AI Nodes

Anthropic Chat · OpenAI Prompt · Gemini · Bedrock · AI Extract

Data & Storage

SQLite Query · JSON Parse · CSV Export · Excel Write · HTML Extract

3-Tier DOM Fallback (Browser Nodes)

Tier 1

call_bot_method: Go code in the bot layer; reliable, version-locked, fastest

~100% reliable

Tier 2

XPath alternatives: human-written selectors; faster than AI but brittle if the platform changes its UI

~85% reliable

Tier 3

AI-generated CSS: the config manager asks an LLM for a selector; adaptive but slow (~500ms extra)

~70% reliable

Expression Engine

Go text/template + FuncMap

All string config fields in every node are resolved through ExpressionEngine before execution. No JavaScript sandbox — pure Go templates with a controlled FuncMap.

{{$json.username}}

Field from the current item (output of previous node)

{{$node["Search"].json[0].title}}

Output of a specific named upstream node

{{$workflow.id}}

Workflow metadata (id, name, created_at)

{{$execution.id}}

Current execution runtime info

{{env "OPENAI_KEY"}}

OS environment variable (secure, not stored in workflow)

{{len $json.items}}

Array length — built-in template function

{{now}}

Current timestamp as Go time.Time

{{index $json.tags 0}}

Array index access via Go template built-in

Key Architectural Decision

Go text/template instead of Lua or JavaScript — no eval sandboxing required. Expressions are limited by the FuncMap (no arbitrary Go access). Simple enough for non-technical users, composable for power users. Go's template engine is battle-tested with zero external runtime overhead.

Performance

Execution Benchmarks

Running each workflow's BFS loop on a single goroutine avoids data races between branches of the same run. In practice the bottleneck is network-bound work (browser, AI, APIs) rather than the engine itself.

Operation	Speed	Notes
Node dispatch overhead	<1ms	Registry lookup + expression resolve
Browser action (click)	50–500ms	Includes humanization delays
DOM extraction (goquery)	<10ms	Per element set
AI LLM call (Anthropic)	500ms–5s	Network latency dominates
SQLite batch write	<20ms	Per 100 people rows
Workflow DAG build	<5ms	Kahn's sort on 100 nodes
Webhook trigger response	<50ms	Enqueue + HTTP 202 response
Cron schedule resolution	<1ms	robfig/cron v3 in-process

Why BFS, Not Goroutines?

Browser actions take seconds to minutes, so spawning a goroutine per branch adds overhead without benefit. Parallelism comes from the worker pool instead: multiple workflows run concurrently, not multiple branches within one.

Pure Go SQLite

modernc.org/sqlite is a CGO-free port of SQLite. Zero C compilation, single static binary, no system SQLite dependency. Slightly slower than CGO builds (~10%) but enables cross-compilation and Docker-free deployment.

Single Binary

go:embed packages all 29 action JSONs and 78 workflow schemas. No external config files needed. Wails embeds the React UI. The entire system — CLI, Wails app, action engine — ships as one ~50MB executable.
