Architecture
How Nitpick's pieces fit together — components, data flow, MCP surface, extension points.
Design principles
- Single-process agent. All orchestration runs in one Node.js process.
- Bundled skills. prd-e2e-orchestrator, qa-team (20 skills), and multi-role-e2e ship in skills/.
- Pluggable providers. Anthropic, OpenAI, Claude Code all implement the same Provider interface.
- Dual MCP surface. The KB is reachable two ways: in-process kb_* function calls for Anthropic/OpenAI loops; stdio MCP server for Claude Code, Claude Desktop, Cursor, Cline, and any MCP client.
- LLM-backed routing, deterministic execution. Scope inference and exploration use the LLM. Crawler, smoker, and KB are pure TypeScript.
- Persistent knowledge. Every run writes to knowledge-base/. Every next run benefits.
- HITL at decision points, not work items. Scope confirmation (1 decision), page understanding (1 per page). Skippable in non-interactive CI mode.
Component map
src/
├── cli/
│ ├── index.ts CLI entry (commander) — init, run, crawl, mcp, skill, serve, prep
│ └── init.ts `nitpick init` wizard
├── server/
│ └── index.ts HTTP API (Express) — async jobs, polling, HITL input
├── core/
│ ├── orchestrator.ts Senior QA phases, routing, honesty gate, error handling
│ ├── scope-decider.ts Iterative-vs-full branching + HITL UI
│ ├── scope-inferrer.ts LLM-backed "what changed → which pages"
│ └── workflow.ts Phase state machine + persistence
├── agent/
│ ├── runtime.ts Agent task runner — provider + skill + tools + KB context
│ ├── skill-loader.ts Reads SKILL.md + flow.md
│ └── tools.ts bash, read_file, write_file, list_files, ask_user, kb_* (8 tools)
├── mcp/
│ ├── kb-server.ts MCP stdio server (5 read + 3 write tools via McpServer)
│ ├── kb-runtime-tools.ts In-process kb_* tool definitions + executeKbTool() dispatcher
│ └── trust.ts TrustTier enum: read-write | read-only | deny
├── providers/
│ ├── provider.ts Interface + registry
│ ├── anthropic-provider.ts @anthropic-ai/sdk + prompt caching
│ ├── openai-provider.ts openai SDK + Skills (Responses API) + parallel tool calls
│ ├── claude-code-provider.ts subprocess wrapper + MCP config injection
│ ├── retry.ts Exponential backoff for 429/5xx/network
│ └── index.ts Factory
├── agents/
│ ├── crawler.ts SPA-aware page discovery
│ ├── crawler-llm-assist.ts LLM-driven crawler (SPA escape hatch)
│ ├── page-tester.ts Wraps delegatePageTest
│ ├── flow-tester.ts Wraps delegateFlowTest
│ ├── smoker.ts Fast regression check
│ └── reporter.ts Aggregates into unified HTML/MD/JSON
├── claude/
│ └── delegate.ts Builds skill tasks, calls runAgentTask, headless directive
├── kb/
│ ├── manager.ts KnowledgeBase class
│ └── diff-engine.ts Model diffing
└── utils/
├── config.ts YAML loader + env resolution
├── types.ts Shared Zod schemas + TS types (McpConfigSchema, CrawlConfigSchema)
├── logger.ts pino
├── fs-utils.ts small helpers
└── prompts.ts readline-based HITL prompts
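The map lists providers/retry.ts as exponential backoff for 429/5xx/network errors. A minimal sketch of that technique follows. It is not Nitpick's actual code: the names HttpLikeError, isRetryable, backoffDelay, and withRetry, the default delays, and the rule that an error without a status counts as a network failure are all assumptions made for illustration.

```typescript
// Hypothetical error shape: an HTTP status when the server answered,
// undefined when the request never got a response (network failure).
interface HttpLikeError extends Error {
  status?: number;
}

// Assumed classification: retry rate limits (429), server errors (5xx),
// and status-less errors. Other 4xx responses fail immediately.
function isRetryable(err: HttpLikeError): boolean {
  if (err.status === undefined) return true;
  return err.status === 429 || (err.status >= 500 && err.status < 600);
}

// Delay doubles per attempt (base, 2*base, 4*base, ...) and is capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs fn up to maxAttempts times, sleeping between retryable failures.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      const err = e as HttpLikeError;
      const lastAttempt = attempt >= maxAttempts - 1;
      if (lastAttempt || !isRetryable(err)) throw err;
      await sleep(backoffDelay(attempt));
    }
  }
}
```

The sleep function is injectable so that a test can replace the real timer with a no-op.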
MCP architecture
Nitpick exposes its knowledge base over two parallel surfaces — same KnowledgeBase class underneath both.
  External MCP clients               Internal agent loops
(Claude Code, Cursor, Cline)     (Anthropic, OpenAI providers)
      stdio JSON-RPC               in-process function calls
             │                                  │
        kb-server.ts                   kb-runtime-tools.ts
             │                                  │
             └────────── KnowledgeBase ─────────┘
                         kb/manager.ts
                       ./knowledge-base/
External surface (nitpick mcp serve): McpServer registers 8 tools on a StdioServerTransport. Trust tier is read from config; tools that require write trust are only registered when trust: read-write.
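The trust-tier gating on the external surface can be pictured as a filter applied before registration. The sketch below is illustrative only: the tool names are placeholders (the real kb_* names are not listed here), the ToolSpec shape and toolsToRegister are hypothetical, and treating the deny tier as "register nothing" is an assumption.

```typescript
// Trust tiers as listed for src/mcp/trust.ts.
type TrustTier = "read-write" | "read-only" | "deny";

// Hypothetical description of a KB tool; only the write flag matters here.
interface ToolSpec {
  name: string;
  requiresWrite: boolean;
}

// Placeholder names: 5 read tools + 3 write tools, matching the stated counts.
const ALL_TOOLS: ToolSpec[] = [
  ...[1, 2, 3, 4, 5].map((i) => ({ name: `kb_read_${i}`, requiresWrite: false })),
  ...[1, 2, 3].map((i) => ({ name: `kb_write_${i}`, requiresWrite: true })),
];

// Returns the tools that would be registered on the stdio server for a tier.
function toolsToRegister(trust: TrustTier, tools: ToolSpec[] = ALL_TOOLS): ToolSpec[] {
  if (trust === "deny") return []; // assumption: deny exposes no tools at all
  return tools.filter((t) => !t.requiresWrite || trust === "read-write");
}
```

Filtering at registration time means a read-only client never sees the write tools in its tool list, rather than seeing them and being rejected on call.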
Internal surface (agent loops): defaultTools({ with_kb: true }) adds the same 8 tools to the agent’s tool list as standard function call definitions. executeTool() routes kb_* calls to executeKbTool(). Used by Anthropic and OpenAI providers (not Claude Code, which gets stdio MCP).
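A rough sketch of that in-process routing is shown below. The function names executeTool, executeKbTool, and isKbTool come from this document, but their signatures, the ToolInput and ToolResult types, the prefix-based predicate, and the executeBuiltinTool helper are assumptions; the bodies are stand-ins that only show which branch handled the call.

```typescript
type ToolInput = Record<string, unknown>;

interface ToolResult {
  ok: boolean;
  output: string;
}

// Assumption: KB tools are recognized by their kb_ name prefix.
function isKbTool(name: string): boolean {
  return name.startsWith("kb_");
}

// Stand-in for the KB dispatcher; a real one would call into KnowledgeBase.
function executeKbTool(name: string, input: ToolInput): ToolResult {
  return { ok: true, output: `kb:${name}(${Object.keys(input).length} args)` };
}

// Hypothetical stand-in for the non-KB tools (bash, read_file, write_file, ...).
function executeBuiltinTool(name: string, input: ToolInput): ToolResult {
  return { ok: true, output: `builtin:${name}(${Object.keys(input).length} args)` };
}

// Single entry point used by the tool-use loop: kb_* calls go to the KB
// dispatcher, everything else to the built-in tools.
function executeTool(name: string, input: ToolInput): ToolResult {
  return isKbTool(name) ? executeKbTool(name, input) : executeBuiltinTool(name, input);
}
```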
Claude Code provider: writes a temp mcp.json to os.tmpdir() with the KB server entry pointing to an absolute kb_path (relative paths would resolve to the run subdir, not the project root). Passes --mcp-config to the claude -p subprocess.
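The config injection might look roughly like the sketch below. It assumes the mcpServers/command/args layout used by MCP client config files; the server name nitpick-kb, the --kb-path flag, and the helper names are hypothetical and not taken from Nitpick's source.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfigFile {
  mcpServers: Record<string, McpServerEntry>;
}

// Resolve kbPath against the project root so the entry stays valid even when
// the subprocess runs with a run subdirectory as its working directory.
function buildMcpConfig(projectRoot: string, kbPath: string): McpConfigFile {
  const absoluteKbPath = path.resolve(projectRoot, kbPath);
  return {
    mcpServers: {
      // "nitpick-kb" is a hypothetical server name.
      "nitpick-kb": {
        command: "nitpick",
        // "--kb-path" is a hypothetical flag used only for illustration.
        args: ["mcp", "serve", "--kb-path", absoluteKbPath],
      },
    },
  };
}

// Write the config into a fresh directory under os.tmpdir() and return its path.
function writeTempMcpConfig(config: McpConfigFile): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "nitpick-mcp-"));
  const file = path.join(dir, "mcp.json");
  fs.writeFileSync(file, JSON.stringify(config, null, 2));
  return file;
}

// Arguments for the subprocess: claude -p <prompt> --mcp-config <file>
function claudeArgs(prompt: string, configFile: string): string[] {
  return ["-p", prompt, "--mcp-config", configFile];
}
```

Resolving the path before writing the file is the important step: the config is read by a process whose working directory differs from the one that wrote it.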
Data flow: a typical run
- User sends a JobRequest via CLI/HTTP/skill
- Senior QA checks KB for page graph: if missing, runs Crawler (deterministic or LLM-assist)
- Senior QA decides scope: LLM-backed inference for iterative; direct for others
- User confirms scope (HITL checkpoint — skipped in non-interactive mode)
- For each page: runs Junior QA via runAgentTask → Provider.runTask → tool-use loop. All kb_* calls go directly to the KB (in-process) or via MCP (Claude Code).
- Phase 3.5 kickoff: Junior QA presents page model summary; user approves or edits. In non-interactive mode, agent self-writes approved.json and proceeds.
- For each flow: runs Flow Lead via same pathway
- Smoker runs on critical-not-in-scope pages
- Reporter aggregates into unified HTML/MD/JSON
- KB is updated: new models archived, flaky registry updated, bugs deduped, kb_* write tools record findings
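The ordering and the two conditional branches in the steps above (crawl only when the page graph is missing, HITL only in interactive mode) can be summarized in a small planning function. This is a sketch under stated assumptions: RunContext, planRun, and the step labels are hypothetical, and the real orchestrator is a persisted phase state machine rather than a flat list.

```typescript
// Hypothetical inputs that decide which steps a run includes.
interface RunContext {
  hasPageGraph: boolean; // does the KB already hold a page graph?
  interactive: boolean;  // false in non-interactive (CI) mode
  pages: string[];       // pages in the confirmed scope
  flows: string[];       // flows in the confirmed scope
}

// Returns the ordered step labels for one run.
function planRun(ctx: RunContext): string[] {
  const steps: string[] = [];
  if (!ctx.hasPageGraph) steps.push("crawl");
  steps.push("decide-scope");
  if (ctx.interactive) steps.push("confirm-scope"); // HITL checkpoint
  for (const page of ctx.pages) {
    steps.push(`page-test:${page}`);
    // Phase 3.5: the user approves, or the agent self-writes approved.json.
    steps.push(ctx.interactive ? `approve-model:${page}` : `self-approve:${page}`);
  }
  for (const flow of ctx.flows) steps.push(`flow-test:${flow}`);
  steps.push("smoke", "report", "update-kb");
  return steps;
}
```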
State files
Every run persists to runs/<run_id>/:
runs/<run_id>/
├── job-state.json Workflow state, errors
├── scope.json Confirmed pages + flows
├── pages/<page_id>/
│ ├── agent-trace.jsonl Every tool call
│ ├── agent-transcript.md Readable transcript
│ ├── derived-ui-model/
│ │ ├── evidence.json Honesty gate input
│ │ └── model.json Derived UI Model
│ ├── approved.json Phase 3.5 approval (HITL or headless)
│ ├── tests/
│ └── reports/
├── flows/<flow_name>/
│ └── (same structure)
├── smoke/smoke-results.json
└── reports/
├── unified-report.html
├── summary.md
└── data.json
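For tooling that reads these files, the layout above can be captured in a small path helper. The sketch below only mirrors the tree as documented; the runPaths function and its property names are hypothetical, and paths are built relative to the project root.

```typescript
// Builds the documented file locations for one run.
function runPaths(runId: string) {
  const root = `runs/${runId}`;
  return {
    root,
    jobState: `${root}/job-state.json`,
    scope: `${root}/scope.json`,
    smokeResults: `${root}/smoke/smoke-results.json`,
    unifiedReport: `${root}/reports/unified-report.html`,
    summary: `${root}/reports/summary.md`,
    reportData: `${root}/reports/data.json`,
    // Per-page artifacts; flows/<flow_name>/ follows the same structure.
    page: (pageId: string) => {
      const dir = `${root}/pages/${pageId}`;
      return {
        dir,
        trace: `${dir}/agent-trace.jsonl`,
        transcript: `${dir}/agent-transcript.md`,
        evidence: `${dir}/derived-ui-model/evidence.json`,
        model: `${dir}/derived-ui-model/model.json`,
        approved: `${dir}/approved.json`,
      };
    },
  };
}
```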
Extension points
New LLM provider
- Implement Provider in src/providers/<name>-provider.ts
- Register in src/providers/index.ts
- Add to the llm.provider union in src/utils/types.ts
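As an illustration of these three steps, here is a minimal sketch collapsed into one file. The real Provider interface is not reproduced in this document, so the shapes below (AgentTask, TaskResult, the name field, the registry map, and the union's string values) are assumptions; only Provider and runTask are names taken from the text. EchoProvider is a toy provider that calls no LLM.

```typescript
// Assumed task and result shapes, for illustration only.
interface AgentTask {
  prompt: string;
}

interface TaskResult {
  text: string;
}

// Step 1: the interface every provider implements (assumed shape).
interface Provider {
  readonly name: string;
  runTask(task: AgentTask): Promise<TaskResult>;
}

// Step 3: the closed set of provider names accepted in config (assumed values).
type ProviderName = "anthropic" | "openai" | "claude-code" | "echo";

// A toy provider that returns the prompt back instead of calling a model.
class EchoProvider implements Provider {
  readonly name = "echo";
  async runTask(task: AgentTask): Promise<TaskResult> {
    return { text: `echo: ${task.prompt}` };
  }
}

// Step 2: a registry keyed by provider name, with a factory lookup.
const registry = new Map<ProviderName, () => Provider>();

function registerProvider(name: ProviderName, factory: () => Provider): void {
  registry.set(name, factory);
}

function createProvider(name: ProviderName): Provider {
  const factory = registry.get(name);
  if (!factory) throw new Error(`Unknown provider: ${name}`);
  return factory();
}

registerProvider("echo", () => new EchoProvider());
```

Keeping the name union closed means a typo in configuration can be rejected at validation time instead of surfacing as a failed factory lookup mid-run.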
New MCP tool
- Define in src/mcp/kb-runtime-tools.ts (in-process) and src/mcp/kb-server.ts (stdio)
- Add to isKbTool() predicate
- Add the tool schema to defaultTools() in src/agent/tools.ts
New agent
- Create src/agents/<agent>.ts with run<Agent>(opts)
- Call from orchestrator.ts
- Add results to UnifiedReport in types.ts
Non-goals
- Not reimplementing Playwright
- Not owning retry classification (the skill’s Phase 5 handles that)
- Not hosting LLMs
- Not parsing source ASTs (the Git Inspector, planned for a future release, will use diffs)