Architecture

How Nitpick's pieces fit together — components, data flow, MCP surface, extension points.

Design principles

  1. Single-process agent. All orchestration runs in one Node.js process.
  2. Bundled skills. prd-e2e-orchestrator, qa-team (20 skills), and multi-role-e2e ship in skills/.
  3. Pluggable providers. Anthropic, OpenAI, Claude Code all implement the same Provider interface.
  4. Dual MCP surface. The KB is reachable two ways: in-process kb_* function calls for Anthropic/OpenAI loops; stdio MCP server for Claude Code, Claude Desktop, Cursor, Cline, and any MCP client.
  5. LLM-backed routing, deterministic execution. Scope inference and exploration use the LLM. Crawler, smoker, and KB are pure TypeScript.
  6. Persistent knowledge. Every run writes to knowledge-base/. Every next run benefits.
  7. HITL at decision points, not work items. Scope confirmation (1 decision), page understanding (1 per page). Skippable in non-interactive CI mode.

Component map

src/
├── cli/
│   ├── index.ts          CLI entry (commander) — init, run, crawl, mcp, skill, serve, prep
│   └── init.ts           `nitpick init` wizard
├── server/
│   └── index.ts          HTTP API (Express) — async jobs, polling, HITL input
├── core/
│   ├── orchestrator.ts   Senior QA phases, routing, honesty gate, error handling
│   ├── scope-decider.ts  Iterative-vs-full branching + HITL UI
│   ├── scope-inferrer.ts LLM-backed "what changed → which pages"
│   └── workflow.ts       Phase state machine + persistence
├── agent/
│   ├── runtime.ts        Agent task runner — provider + skill + tools + KB context
│   ├── skill-loader.ts   Reads SKILL.md + flow.md
│   └── tools.ts          bash, read_file, write_file, list_files, ask_user, plus the 8 kb_* tools
├── mcp/
│   ├── kb-server.ts      MCP stdio server (5 read + 3 write tools via McpServer)
│   ├── kb-runtime-tools.ts  In-process kb_* tool definitions + executeKbTool() dispatcher
│   └── trust.ts          TrustTier enum: read-write | read-only | deny
├── providers/
│   ├── provider.ts           Interface + registry
│   ├── anthropic-provider.ts @anthropic-ai/sdk + prompt caching
│   ├── openai-provider.ts    openai SDK + Skills (Responses API) + parallel tool calls
│   ├── claude-code-provider.ts subprocess wrapper + MCP config injection
│   ├── retry.ts              Exponential backoff for 429/5xx/network
│   └── index.ts              Factory
├── agents/
│   ├── crawler.ts            SPA-aware page discovery
│   ├── crawler-llm-assist.ts LLM-driven crawler (SPA escape hatch)
│   ├── page-tester.ts        Wraps delegatePageTest
│   ├── flow-tester.ts        Wraps delegateFlowTest
│   ├── smoker.ts             Fast regression check
│   └── reporter.ts           Aggregates into unified HTML/MD/JSON
├── claude/
│   └── delegate.ts       Builds skill tasks, calls runAgentTask, headless directive
├── kb/
│   ├── manager.ts        KnowledgeBase class
│   └── diff-engine.ts    Model diffing
└── utils/
    ├── config.ts         YAML loader + env resolution
    ├── types.ts          Shared Zod schemas + TS types (McpConfigSchema, CrawlConfigSchema)
    ├── logger.ts         pino
    ├── fs-utils.ts       small helpers
    └── prompts.ts        readline-based HITL prompts
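
The retry behavior noted for providers/retry.ts can be sketched like this — a sketch only, assuming a base delay, cap, and function names that are not the real implementation:

```typescript
// Retry on 429, 5xx, and network errors (status null), with capped exponential
// backoff. All names and defaults here are illustrative assumptions.
function isRetryable(status: number | null): boolean {
  return status === null || status === 429 || (status >= 500 && status < 600);
}

function backoffMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  // attempt 0 → baseMs, doubling each retry, never above capMs
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```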

MCP architecture

Nitpick exposes its knowledge base over two parallel surfaces — same KnowledgeBase class underneath both.

External MCP clients                  Internal agent loops
(Claude Code, Cursor, Cline)          (Anthropic, OpenAI providers)

    stdio JSON-RPC                    in-process function calls
         │                                     │
    kb-server.ts                    kb-runtime-tools.ts
         │                                     │
         └──────────── KnowledgeBase ──────────┘
                        kb/manager.ts
                       ./knowledge-base/

External surface (nitpick mcp serve): McpServer registers 8 tools on a StdioServerTransport. Trust tier is read from config; tools that require write trust are only registered when trust: read-write.
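
A minimal sketch of that trust gating, assuming each tool carries a write flag (tool names below are placeholders, not the actual registry):

```typescript
// Illustrative trust gating: deny registers nothing, read-only filters out
// write tools, read-write registers everything.
type TrustTier = "read-write" | "read-only" | "deny";

interface KbTool { name: string; write: boolean }

const KB_TOOLS: KbTool[] = [
  { name: "kb_read_example", write: false },  // stands in for the 5 read tools
  { name: "kb_write_example", write: true },  // stands in for the 3 write tools
];

function toolsForTier(tier: TrustTier, tools: KbTool[] = KB_TOOLS): KbTool[] {
  if (tier === "deny") return [];
  if (tier === "read-only") return tools.filter((t) => !t.write);
  return tools; // read-write: register everything
}
```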

Internal surface (agent loops): defaultTools({ with_kb: true }) adds the same 8 tools to the agent’s tool list as standard function call definitions. executeTool() routes kb_* calls to executeKbTool(). Used by Anthropic and OpenAI providers (not Claude Code, which gets stdio MCP).
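
The dispatch rule reduces to a name-prefix check; a sketch with assumed signatures (the real executeTool()/executeKbTool() will differ):

```typescript
// Illustrative routing: kb_* names go to the KB dispatcher, everything else
// to the standard tool dispatcher.
type ToolResult = { output: string };
type Dispatch = (name: string, args: unknown) => ToolResult;

const isKbTool = (name: string): boolean => name.startsWith("kb_");

function routeToolCall(name: string, args: unknown, kb: Dispatch, std: Dispatch): ToolResult {
  return isKbTool(name) ? kb(name, args) : std(name, args);
}
```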

Claude Code provider: writes a temp mcp.json to os.tmpdir() with the KB server entry pointing to an absolute kb_path (relative paths would resolve to the run subdir, not the project root). Passes --mcp-config to the claude -p subprocess.
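
A sketch of that config-injection step, assuming a plausible server entry shape — the key names and env variable are illustrative, not Nitpick's actual schema:

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, resolve } from "node:path";

// Write a throwaway mcp.json under os.tmpdir() whose KB server entry carries an
// absolute kb path, so the subprocess can't mis-resolve a relative one.
function writeTempMcpConfig(kbPath: string): { file: string; kbAbsPath: string } {
  const kbAbsPath = resolve(kbPath); // absolute, per the note above
  const config = {
    mcpServers: {
      "nitpick-kb": {
        command: "nitpick",
        args: ["mcp", "serve"],
        env: { NITPICK_KB_PATH: kbAbsPath }, // hypothetical env var name
      },
    },
  };
  const file = join(mkdtempSync(join(tmpdir(), "nitpick-mcp-")), "mcp.json");
  writeFileSync(file, JSON.stringify(config, null, 2));
  return { file, kbAbsPath };
}
```

The returned file path is what would be handed to the subprocess via --mcp-config.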

Data flow: a typical run

  1. User sends a JobRequest via CLI/HTTP/skill
  2. Senior QA checks KB for page graph: if missing, runs Crawler (deterministic or LLM-assist)
  3. Senior QA decides scope: LLM-backed inference for iterative; direct for others
  4. User confirms scope (HITL checkpoint — skipped in non-interactive mode)
  5. For each page: runs Junior QA via runAgentTask → Provider.runTask → tool-use loop. All kb_* calls go directly to the KB (in-process) or via MCP (Claude Code).
  6. Phase 3.5 kickoff: Junior QA presents a page model summary; the user approves or edits. In non-interactive mode, the agent writes approved.json itself and proceeds.
  7. For each flow: runs Flow Lead via same pathway
  8. Smoker runs on critical pages that are not in scope
  9. Reporter aggregates into unified HTML/MD/JSON
  10. KB is updated: new models archived, flaky registry updated, bugs deduped, kb_* write tools record findings
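
The progression above can be sketched as a tiny state machine, roughly what workflow.ts persists — the phase labels here are illustrative, not the real identifiers:

```typescript
// Illustrative linear phase machine for a run; each completed phase advances
// to the next, and the last phase has no successor.
const PHASES = ["crawl", "scope", "page-tests", "flow-tests", "smoke", "report", "kb-update"] as const;
type Phase = (typeof PHASES)[number];

function nextPhase(current: Phase): Phase | null {
  const i = PHASES.indexOf(current);
  return i >= 0 && i < PHASES.length - 1 ? PHASES[i + 1] : null;
}
```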

State files

Every run persists to runs/<run_id>/:

runs/<run_id>/
├── job-state.json           Workflow state, errors
├── scope.json               Confirmed pages + flows
├── pages/<page_id>/
│   ├── agent-trace.jsonl    Every tool call
│   ├── agent-transcript.md  Readable transcript
│   ├── derived-ui-model/
│   │   ├── evidence.json    Honesty gate input
│   │   └── model.json       Derived UI Model
│   ├── approved.json        Phase 3.5 approval (HITL or headless)
│   ├── tests/
│   └── reports/
├── flows/<flow_name>/
│   └── (same structure)
├── smoke/smoke-results.json
└── reports/
    ├── unified-report.html
    ├── summary.md
    └── data.json

Extension points

New LLM provider

  1. Implement Provider in src/providers/<name>-provider.ts
  2. Register in src/providers/index.ts
  3. Add to the llm.provider union in src/utils/types.ts
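
A sketch of steps 1 and 2, assuming a minimal Provider shape — the real interface in src/providers/provider.ts will differ in detail:

```typescript
// Illustrative provider contract and registry; names are assumptions.
interface Provider {
  readonly name: string;
  runTask(prompt: string): Promise<string>;
}

const registry = new Map<string, () => Provider>();

function registerProvider(name: string, factory: () => Provider): void {
  registry.set(name, factory);
}

function createProvider(name: string): Provider {
  const factory = registry.get(name);
  if (!factory) throw new Error(`unknown provider: ${name}`);
  return factory();
}
```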

New MCP tool

  1. Define in src/mcp/kb-runtime-tools.ts (in-process) and src/mcp/kb-server.ts (stdio)
  2. Add to isKbTool() predicate
  3. Add the tool schema to defaultTools() in src/agent/tools.ts
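
A sketch of what steps 1 and 3 amount to, using a hypothetical kb_search_notes tool (the definition shape and defaultTools() behavior are assumptions):

```typescript
// Illustrative new-tool wiring. Step 2 needs no code here: because the name
// starts with kb_, the isKbTool() predicate routes it to the KB dispatcher.
interface ToolDef { name: string; description: string }

// Step 1: the new tool's definition, registered on both surfaces.
const kbSearchNotes: ToolDef = {
  name: "kb_search_notes", // hypothetical tool
  description: "Search archived run notes",
};

// Step 3: append it to the default tool list handed to agent loops.
function withNewKbTool(base: ToolDef[]): ToolDef[] {
  return [...base, kbSearchNotes];
}
```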

New agent

  1. Create src/agents/<agent>.ts with run<Agent>(opts)
  2. Call from orchestrator.ts
  3. Add results to UnifiedReport in types.ts
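
Illustrative only: the run<Agent>(opts) convention with assumed option and result shapes.

```typescript
// Hypothetical agent module sketch. A real agent would drive browsers/tools
// here; this one only reports what it would cover.
interface AgentResult { passed: boolean; summary: string }

async function runLinkChecker(opts: { pages: string[] }): Promise<AgentResult> {
  return { passed: true, summary: `covered ${opts.pages.length} pages` };
}
```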

Non-goals

  • Not reimplementing Playwright
  • Not owning retry classification (the skill’s Phase 5 handles that)
  • Not hosting LLMs
  • Not parsing source ASTs (the Git Inspector, planned for a future release, will use diffs)