Roadmap

What's live today, what's coming next, and what's on the horizon.

Live now (v2.1)

Core runtime

SPA-aware crawler with ARIA inventory, menu expansion, and click-and-observe
LLM-assisted crawler fallback (--strategy agent) for apps the deterministic crawler under-discovers
Senior QA orchestrator with LLM-backed scope routing
Junior QA subagents running in parallel, one per page or flow
Human-in-the-loop scope confirmation and Phase 3.5 kickoff approval
Non-interactive CI mode (--non-interactive) so the agent self-approves instead of blocking on stdin
Multi-role flow testing
Resume interrupted runs (--resume <id>)
Async HTTP jobs with on-disk state polling

MCP knowledge base

nitpick mcp serve exposes the KB as an MCP stdio server
Works with Claude Code, Claude Desktop, Cursor, Cline, and any MCP-aware client
8 tools: 5 read-only and 3 write (gated by trust tier in config)
All three providers consume the KB natively: in-process function calls for Anthropic and OpenAI, and MCP config injection for the Claude Code subprocess

QA team skills

20 skills covering every QA role: scope, explore, flow, triage, release-gate, kickoff, prioritize, write-tests, run, resume, classify-failure, investigate-flake, smoke, canary, coverage-audit, weekly-report, roadmap, file-bug, fixtures, careful
Each skill has numbered phases, precondition gates, a KB context table, failure modes, and a worked example

Provider integrations

Anthropic with prompt caching (roughly 70% cost reduction on warm runs)
OpenAI with Skills via Responses API and parallel tool calls
Claude Code subprocess with MCP config injection
Unified retry: jittered exponential backoff, up to 5 attempts
HTML, Markdown, and JSON reports
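
The unified retry policy listed above can be illustrated with a minimal sketch. It is not Nitpick's implementation: the function name, the base and maximum delays, and the "full jitter" variant are assumptions chosen for illustration. Only the 5-attempt limit and the jittered exponential backoff come from the list above.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, rng=random.random):
    """Call `call()` until it succeeds or `max_attempts` is exhausted.

    Hypothetical helper: base_delay and max_delay are illustrative defaults.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential ceiling: base, 2*base, 4*base, ... capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a uniform random fraction of the ceiling.
            sleep(rng() * ceiling)
```

Injecting `sleep` and `rng` keeps the helper testable without real waits. A production version would likely retry only on transient errors (rate limits, timeouts) rather than on every exception.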

Coming next (v2.2 through v2.4)

v2.2 — Auto-register

nitpick mcp register detects which AI hosts you have installed (Claude Code, Claude Desktop, Cursor, Codex) and writes the nitpick-kb MCP entry into each config. No manual JSON editing. For Claude Code it also symlinks the qa-team skills into ~/.claude/skills/. Safe to re-run.
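
As a rough sketch of what "safe to re-run" could mean for one host config, the function below merges an entry into a JSON file and rewrites it only when something changed. The `mcpServers` key, the entry shape, and the `register` function are assumptions for illustration. The real command may use different config paths and fields for each host.

```python
import json
from pathlib import Path

ENTRY_NAME = "nitpick-kb"
# Assumed entry shape: launch the stdio server via `nitpick mcp serve`.
ENTRY = {"command": "nitpick", "args": ["mcp", "serve"]}


def register(config_path):
    """Add the nitpick-kb entry to a host config; return True if the file changed."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    if servers.get(ENTRY_NAME) == ENTRY:
        return False  # already registered: leave the file untouched
    servers[ENTRY_NAME] = dict(ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

Existing server entries are preserved because the function edits the parsed config in place instead of overwriting the file with a template.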

v2.3 — Team mode

nitpick init --team writes a .claude/ directory into your application repo with the MCP server entry and a skills symlink. Anyone on the team who opens the repo in Claude Code gets the KB and skills without doing any setup.

v2.4 — Host packs

host-packs/codex/ adds an AGENTS.md dispatcher and Codex-format skills routing to --provider openai. host-packs/cursor/ adds a .cursor/mcp.json and Cursor-adapted skill variants. Together, these extend MCP-native reach to Codex CLI and Cursor.

On the horizon

GitHub App — trigger runs on every pull request, map the diff to affected pages using the page graph.

Slack integration — post results to a channel, trigger runs via a slash command.

Bug filer — Linear and GitHub Issues integration with deduplication on page, test name, and error signature.
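
One way deduplication on page, test name, and error signature might work is to hash the three fields into a stable key. This is a sketch under assumptions: the normalization rules (lowercasing, masking digits, collapsing whitespace) and both function names are hypothetical, not the planned design.

```python
import hashlib
import re


def error_signature(message):
    """Normalize an error message so volatile details do not defeat dedup."""
    text = re.sub(r"\d+", "<n>", message.lower())  # mask timings, line numbers
    return re.sub(r"\s+", " ", text).strip()       # collapse whitespace


def dedup_key(page, test_name, error_message):
    """Stable key for 'the same bug': page + test name + normalized error."""
    parts = [page, test_name, error_signature(error_message)]
    joined = "\x1f".join(parts)  # unit separator avoids ambiguous joins
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
```

With this scheme, two failures of the same test on the same page that differ only in a timeout value or line number map to the same key, so a filer could update the existing issue instead of opening a new one.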

Budget caps — llm.max_dollars_per_run aborts before overspending, with per-page token and dollar breakdowns in the HTML report.
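
A budget cap of this kind could be enforced by checking each call's cost against the remaining budget before recording it. The sketch below is illustrative only: the class names and the per-million-token prices are hypothetical, and only the idea of a per-run dollar cap with a per-page token and dollar breakdown comes from the item above.

```python
from collections import defaultdict


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push the run past its dollar cap."""


class BudgetTracker:
    def __init__(self, max_dollars, input_price_per_mtok, output_price_per_mtok):
        self.max_dollars = max_dollars
        self.in_price = input_price_per_mtok    # dollars per 1M input tokens
        self.out_price = output_price_per_mtok  # dollars per 1M output tokens
        self.spent = 0.0
        self.by_page = defaultdict(lambda: {"tokens": 0, "dollars": 0.0})

    def charge(self, page, input_tokens, output_tokens):
        """Record a call's cost, or raise before the cap is exceeded."""
        cost = (input_tokens * self.in_price
                + output_tokens * self.out_price) / 1_000_000
        if self.spent + cost > self.max_dollars:
            raise BudgetExceeded(
                f"{page}: ${self.spent + cost:.2f} would exceed ${self.max_dollars:.2f}")
        self.spent += cost
        self.by_page[page]["tokens"] += input_tokens + output_tokens
        self.by_page[page]["dollars"] += cost
        return cost
```

To abort before the money is actually spent, `charge` would need to run on an estimate ahead of the provider call. Running it on reported usage afterwards can only stop the next call.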

Head-of-QA agent — weekly health summaries, trend detection, and coverage gap reporting.

Visual regression — screenshot baseline diffing per page.

Hosted version — managed runtime for teams that prefer not to self-host.

What won’t change

You bring your own API key. Nitpick does not host LLMs.
Generated tests are standard Playwright. No proprietary runtime.
The Derived UI Model is the authoring surface. No DSL.
All features work across providers. No lock-in.

Feedback

Open a GitHub issue with the feature-request label at github.com/vaibhav-kindflow/nitpick.