Roadmap
What's live today, what's coming next, and what's on the horizon.
Live now (v2.1)
Core runtime
- SPA-aware crawler with ARIA inventory, menu expansion, and click-and-observe
- LLM-assist crawler fallback (--strategy agent) for apps the deterministic crawler under-discovers
- Senior QA orchestrator with LLM-backed scope routing
- Junior QA subagents running in parallel, one per page or flow
- Human-in-the-loop scope confirmation and Phase 3.5 kickoff approval
- Non-interactive CI mode (--non-interactive) so the agent self-approves instead of blocking on stdin
- Multi-role flow testing
- Resume interrupted runs (--resume <id>)
- Async HTTP jobs with on-disk state polling
MCP knowledge base
- nitpick mcp serve exposes the KB as an MCP stdio server
- Works with Claude Code, Claude Desktop, Cursor, Cline, and any MCP-aware client
- 8 tools: five read-only, three write (gated by trust tier in config)
- All three providers consume the KB natively: in-process function calls for Anthropic and OpenAI, and MCP config injection for the Claude Code subprocess
QA team skills
- 20 skills covering every QA role: scope, explore, flow, triage, release-gate, kickoff, prioritize, write-tests, run, resume, classify-failure, investigate-flake, smoke, canary, coverage-audit, weekly-report, roadmap, file-bug, fixtures, careful
- Each skill has numbered phases, precondition gates, a KB context table, failure modes, and a worked example
Provider integrations
- Anthropic with prompt caching (roughly 70% cost reduction on warm runs)
- OpenAI with Skills via Responses API and parallel tool calls
- Claude Code subprocess with MCP config injection
- Unified retry: exponential backoff, jittered, 5 attempts
- HTML, Markdown, and JSON reports
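The unified retry policy listed above (exponential backoff, jittered, 5 attempts) can be illustrated with a minimal sketch. This is not Nitpick's implementation: the function name, the base and maximum delays, and the choice of "full jitter" are assumptions made for illustration. Only the attempt count comes from the list above.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep, rng=random.random):
    """Run call() until it succeeds or max_attempts calls have failed.

    base_delay, max_delay, and the jitter scheme are illustrative
    assumptions, not values taken from Nitpick.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            # A real client would retry only transient errors
            # (timeouts, rate limits); this sketch retries everything.
            if attempt == max_attempts:
                raise
            # Exponential ceiling: base, 2x, 4x, ... capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: wait a random fraction of the ceiling.
            sleep(ceiling * rng())
```

Passing `sleep` and `rng` as parameters keeps the sketch testable without real waiting. With the defaults, a call that keeps failing is attempted five times, with at most four waits in between.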
Coming next (v2.2 through v2.4)
v2.2 — Auto-register
nitpick mcp register detects which AI hosts you have installed (Claude Code, Claude Desktop, Cursor, Codex) and writes the nitpick-kb MCP entry into each config. No manual JSON editing. For Claude Code it also symlinks the qa-team skills into ~/.claude/skills/. Safe to re-run.
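As a rough illustration of what "safe to re-run" could mean here, the sketch below merges one server entry into a JSON config idempotently. It assumes the `mcpServers` layout with `command` and `args` that JSON-based hosts such as Claude Desktop and Cursor use. The function name and the exact entry Nitpick would write are hypothetical, and hosts with other config formats would need their own writer.

```python
import json
from pathlib import Path


def register_mcp_entry(config_path, name, entry):
    """Merge one MCP server entry into a JSON config.

    Returns True if the file was written, False if the entry was
    already present and identical (so re-running is a no-op).
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    if servers.get(name) == entry:
        return False  # already registered, leave the file untouched
    servers[name] = entry  # other servers in the file are preserved
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return True


# Hypothetical entry: assumes the server is started with `nitpick mcp serve`.
NITPICK_KB = {"command": "nitpick", "args": ["mcp", "serve"]}
```

Calling `register_mcp_entry(path, "nitpick-kb", NITPICK_KB)` twice writes the file the first time and does nothing the second.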
v2.3 — Team mode
nitpick init --team writes a .claude/ directory into your application repo with the MCP server entry and a skills symlink. Anyone on the team who opens the repo in Claude Code gets the KB and skills without doing any setup.
v2.4 — Host packs
host-packs/codex/ adds an AGENTS.md dispatcher and Codex-format skills routing to --provider openai. host-packs/cursor/ adds a .cursor/mcp.json and Cursor-adapted skill variants. Extends MCP-native reach to Codex CLI and Cursor.
On the horizon
GitHub App — trigger runs on every pull request, map the diff to affected pages using the page graph.
Slack integration — post results to a channel, trigger runs via a slash command.
Bug filer — Linear and GitHub Issues integration with deduplication on page, test name, and error signature.
Budget caps — llm.max_dollars_per_run aborts before overspending, with per-page token and dollar breakdowns in the HTML report.
Head-of-QA agent — weekly health summaries, trend detection, and coverage gap reporting.
Visual regression — screenshot baseline diffing per page.
Hosted version — managed runtime for teams that prefer not to self-host.
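The Bug filer item above deduplicates on page, test name, and error signature. One way such a key could be built is sketched below. The normalization rules (first line only, numbers and hex values masked) and both function names are assumptions for illustration, not the planned design.

```python
import hashlib
import re


def error_signature(message):
    """Reduce an error message to a stable form.

    Assumed rules: keep the first line, mask hex values and numbers,
    collapse whitespace, lowercase.
    """
    stripped = message.strip()
    text = stripped.splitlines()[0] if stripped else ""
    text = re.sub(r"0x[0-9a-fA-F]+", "<hex>", text)  # mask before plain digits
    text = re.sub(r"\d+", "<n>", text)
    return re.sub(r"\s+", " ", text).lower()


def dedup_key(page, test_name, error_message):
    """Hash the three dedup fields into a short, stable key."""
    parts = [page, test_name, error_signature(error_message)]
    raw = "\x1f".join(parts)  # unit separator avoids ambiguous joins
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

Under these rules, "Timeout 30000ms exceeded" and "Timeout 5000ms exceeded" on the same page and test produce the same key, so a second failure would match the existing issue instead of opening a new one.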
What won’t change
- You bring your own API key. Nitpick does not host LLMs.
- Generated tests are standard Playwright. No proprietary runtime.
- The Derived UI Model is the authoring surface. No DSL.
- All features work across providers. No lock-in.
Feedback
Open a GitHub issue with the feature-request label at github.com/vaibhav-kindflow/nitpick.