Skills reference
Every qa-team skill — what to type, what it needs, what it does.
Each skill is invoked by typing a command in Claude Code, Cursor, or Codex.
Workflows at a glance
Fresh URL — first time:
nitpick init → nitpick crawl → /qa-explore <url> → /qa-kickoff → /qa-write-tests → /qa-run
Daily — after the app is known:
/qa-scope <what changed> → /qa-run → /qa-triage → /qa-smoke → /qa-release-gate
/qa-scope
Plan which pages to test. Does not run any tests.
What to type:
/qa-scope I refactored the task creation form to use a new API endpoint
/qa-scope the billing page got a new payment provider
/qa-scope ← (no description — will ask you)
Needs:
- nitpick.yaml in the current directory
- knowledge-base/page-graph.json (run nitpick crawl first)
- A description of what changed (from your message, a PR description, or git context)
Output: A ranked table of affected pages with confidence scores and reasons. No tests run. Confirm the scope — /qa-run will pick it up.
/qa-run
Execute tests. The main workhorse.
What to type:
/qa-run ← uses scope from /qa-scope
/qa-run /tasks/new ← targeted: one specific page
/qa-run /tasks/new /billing ← targeted: multiple pages
/qa-run full regression ← all pages in the KB
/qa-run smoke ← critical pages only, no LLM
Needs:
- nitpick.yaml in the current directory
- API key set (ANTHROPIC_API_KEY or OPENAI_API_KEY) or claude CLI installed
- knowledge-base/page-graph.json (run nitpick crawl first)
- Auth state at knowledge-base/auth/ (created by nitpick crawl)
- For iterative: a change description (from /qa-scope or typed inline)
- For targeted: at least one URL path
What runs behind the scenes (per page): Pre-flight → Phase 0 (parse URL, ask terminal guards) → Phase 2 (Playwright exploration) → Phase 3 (build UI model) → Phase 3.5 (your approval) → Phase 4 (generate tests) → Phase 5 (run + classify failures) → Phase 6 (write report + KB)
On second and later runs for the same page: Phase 2 starts from the existing model, Phase 3.5 auto-skips already-answered questions. Runs meaningfully faster.
Output: Unified pass/fail report. Bugs filed to KB. Chains to /qa-triage if failures exist.
/qa-explore
Build or refresh the UI model for one page. No tests generated or run.
What to type:
/qa-explore /dashboard
/qa-explore /tasks/new
Needs:
- nitpick.yaml in the current directory
- Auth state for the role that can see this page
- The page must be reachable from base_url
Use when:
- Testing a page for the first time before committing to a full run
- The UI was rewritten and the existing model is stale
- You want to inspect what nitpick knows before generating tests
Output: Derived UI Model written to knowledge-base/pages/<page_id>/. Surfaces diff vs. prior model if one exists.
/qa-kickoff
Review what nitpick learned about a page and give approval before tests are written.
What to type:
/qa-kickoff /tasks/new
/qa-kickoff ← lists pages with unapproved models, lets you pick
Needs:
- A page model already built (run /qa-explore or /qa-run to Phase 3 first)
- You in the conversation — this is the human-in-the-loop gate
What it shows:
- Every field, button, and validation rule it found
- Conditional relationships it detected
- Questions it’s unsure about (ambiguous labels, dynamic content)
- What it plans to test
You can add:
- Business rules it couldn’t discover (“email fields reject + signs”)
- Terminal guards (“never click the Delete Account button”)
- Focus areas (“prioritise the payment flow”)
- Corrections to anything it got wrong
Your answers are persisted — next run auto-skips questions whose evidence hasn’t changed.
Output: Approval written to phase-outputs/page-understanding/<page_id>.approved.json. Required before /qa-write-tests.
/qa-write-tests
Generate Playwright tests from an approved model. Does not run them.
What to type:
/qa-write-tests /tasks/new
/qa-write-tests /billing
Needs:
- Approved model at knowledge-base/pages/<page_id>/model.latest.json
- Approval flag (approved: true) at phase-outputs/page-understanding/<page_id>.approved.json
- Run /qa-explore then /qa-kickoff first if you haven’t already
Use when:
- You want to inspect the spec before running it
- You updated the model and need to regenerate tests
- CI: generate specs offline, commit them, run separately
Output: tests/<page_id>.spec.ts written. Coverage breakdown shown. Does not execute.
/qa-smoke
Instant health check. No LLM. Zero cost.
What to type:
/qa-smoke
Needs:
- scope.critical_pages populated in nitpick.yaml
- Auth state for each role that accesses critical pages
- Tests for critical pages already generated (run /qa-run on them first)
What it checks: Each critical page renders (heading visible, no auth redirect, no 5xx). Runs headless Chromium in parallel.
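The per-page verdict can be pictured as three sequential conditions. The sketch below is illustrative only — the field names and login-path heuristic are assumptions, not nitpick's actual implementation, and the real check runs inside headless Chromium:

```python
from dataclasses import dataclass

@dataclass
class PageProbe:
    """Raw observations from one headless page load (hypothetical shape)."""
    status: int            # final HTTP status
    final_url: str         # URL after any redirects
    heading_visible: bool  # was a page heading rendered?

def smoke_verdict(probe: PageProbe, login_path: str = "/login") -> str:
    """Apply the three smoke conditions: no 5xx, no auth redirect, heading visible."""
    if probe.status >= 500:
        return "fail: server error"
    if login_path in probe.final_url:
        return "fail: auth redirect"
    if not probe.heading_visible:
        return "fail: heading not visible"
    return "pass"
```

Because each page reduces to one cheap probe, the checks can fan out in parallel — which is what keeps the whole run in the 5–15 second range.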
Use when: Before every merge. After every deploy. Any time you want a 5–15 second pulse check.
Output: Pass / fail per page. No report written. Chains to /qa-release-gate if all green.
/qa-triage
Walk the failures from the last run and get an action list.
What to type:
/qa-triage ← latest run
/qa-triage 2026-05-11T08-32 ← specific run ID
Needs:
- A completed run at runs/<run_id>/reports/data.json
- KB for deduplication lookups
What it does: Classifies each failure as timing flake, selector drift, data collision, or real app bug. Deduplicates against known open bugs. Scores severity. Tells you what to fix vs what to ignore.
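The four-way classification might look roughly like the following sketch. The input signals (error text, selector resolution, retry outcome, data-constraint hits) are plausible evidence sources, but their names and the precedence order here are assumptions, not the actual triage logic:

```python
def classify_failure(error: str, selector_found: bool, retry_passed: bool,
                     unique_constraint_hit: bool) -> str:
    """Bucket one failure into the four triage classes."""
    if retry_passed and ("timeout" in error.lower() or "waiting for" in error.lower()):
        return "timing"            # flaked: the same test passes on retry
    if not selector_found:
        return "selector_drift"    # locator no longer matches the DOM
    if unique_constraint_hit:
        return "data_collision"    # test data clashed with existing records
    return "app_bug"               # no benign explanation fits
```

Only the last bucket warrants a /qa-file-bug; the first three point at the tests or the test data, not the app.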
Output: Prioritised action list. Known bugs marked. New bugs get a /qa-file-bug recommendation.
/qa-classify-failure
Deep-dive into one specific test failure.
What to type:
/qa-classify-failure /tasks/new:should submit valid task
/qa-classify-failure ← picks the latest unclassified failure
Needs:
- Run dir with test artifacts
- The test name or page ID to drill into
- KB for flake history
Output: Four-class verdict (timing / selector_drift / data_collision / app_bug) with confidence score and recommended fix path.
/qa-investigate-flake
Find the root cause of a test that keeps failing intermittently.
What to type:
/qa-investigate-flake /tasks/new
/qa-investigate-flake /tasks/new:should submit valid task
Needs:
- Flake history in the KB (builds up over multiple runs)
- Run history in runs/
What it analyses: Failure rate over recent runs, time-of-day clustering, provider correlation, model-version correlation.
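The failure-rate part of that analysis reduces to simple counting over recent outcomes. A minimal sketch, assuming a hypothetical list-of-strings run history (the real analysis also correlates time of day, provider, and model version):

```python
from collections import Counter

def flake_stats(outcomes: list[str], window: int = 10) -> dict:
    """Summarise a test's recent 'pass'/'fail' outcomes over the last `window` runs."""
    recent = outcomes[-window:]
    counts = Counter(recent)
    fail_rate = counts["fail"] / len(recent) if recent else 0.0
    # Intermittent = both outcomes observed; a test that always fails is
    # a broken test or a real bug, not a flake.
    intermittent = counts["pass"] > 0 and counts["fail"] > 0
    return {"fail_rate": round(fail_rate, 2), "intermittent": intermittent}
```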
Output: Root cause hypothesis. Recommended fix (timing threshold, selector update, test isolation, or flag as known flake).
/qa-release-gate
Go / no-go before shipping.
What to type:
/qa-release-gate
Needs:
- scope.critical_pages in nitpick.yaml
- Auth state for critical pages
- KB with bug and flake state
What it checks: Runs smoke on critical pages, reads open critical bugs from KB, checks recent flake rate. Returns a verdict.
Output:
- PASS — safe to ship
- PASS WITH WARNINGS — minor issues, your call
- BLOCK — open critical bugs or smoke failures, do not ship
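The verdict logic combines the three checks with hard blockers first. A sketch under stated assumptions — the 10% flake threshold and input names are illustrative, not the actual gate:

```python
def release_gate(smoke_failures: int, open_critical_bugs: int,
                 recent_flake_rate: float) -> str:
    """Go/no-go: hard blockers first, then soft warnings, then pass."""
    if smoke_failures > 0 or open_critical_bugs > 0:
        return "BLOCK"
    if recent_flake_rate > 0.1:  # warning threshold is illustrative
        return "PASS WITH WARNINGS"
    return "PASS"
```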
/qa-canary
Post-deploy regression check against the last passing baseline.
What to type:
/qa-canary
Needs:
- A clean baseline run before the deploy (the most recent passing smoke run)
- Auth state for critical pages
- Deploy accessible at base_url
Use when: Just after a deploy lands. Catches regressions the pre-deploy tests didn’t — environment differences, config changes, migration side-effects.
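The core of the comparison is a diff of per-page results against the baseline: anything that passed before the deploy and fails after is a regression. A minimal sketch, assuming a hypothetical page-to-result mapping:

```python
def canary_regressions(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Pages that passed in the baseline run but fail now — the canary's regressions."""
    return sorted(
        page for page, result in current.items()
        if result == "fail" and baseline.get(page) == "pass"
    )
```

Pages that were already failing before the deploy are deliberately excluded — they are pre-existing bugs, not deploy regressions.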
Output: Comparison against baseline. Regressions filed as bugs. Pass / fail verdict.
/qa-flow
Test a multi-actor flow end to end.
What to type:
/qa-flow hiring
/qa-flow ← lists configured flows, lets you pick
Needs:
- A flow defined in nitpick.yaml under flows:
- Auth state for every role involved in the flow
- Page models for all pages in the flow (run /qa-run on them first)
What it tests: Each stage in sequence, with cross-actor assertions — e.g. admin creates a record, candidate sees it, admin approves it. Handoff bugs (stage passes but the next actor can’t proceed) are flagged as the highest priority failure type.
Output: Per-stage pass/fail. Cross-actor assertion results. Handoff bugs filed to KB.
/qa-file-bug
Persist a bug to the knowledge base.
What to type:
/qa-file-bug ← interactive, asks for details
/qa-file-bug /tasks/new ← pre-fills the page
Needs:
- KB write access (mcp.kb.trust: read-write in nitpick.yaml)
- Bug details: title, description, severity, steps to reproduce, expected vs actual
Use when: /qa-classify-failure returned an app_bug verdict. Or you found a bug manually and want it tracked.
Output: Bug written to KB with stable deduplication hash. Duplicate check run first — won’t create a duplicate if the same bug is already open.
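A stable deduplication hash of this kind is typically a digest over normalised identifying fields, so the same bug hashes identically across runs regardless of whitespace or casing. The field choice and normalisation below are assumptions for illustration, not nitpick's actual scheme:

```python
import hashlib

def bug_dedup_hash(page_id: str, title: str, selector: str = "") -> str:
    """Stable hash from normalised bug fields — same bug, same hash, across runs."""
    normalised = "|".join(part.strip().lower() for part in (page_id, title, selector))
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:16]
```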
/qa-fixtures
Manage login auth state files.
What to type:
/qa-fixtures
Needs:
- Roles configured in nitpick.yaml with username_env and password_env
- Credential env vars set in .env
Use when:
- A run mass-failed and every page redirected to login (auth expired)
- You added a new role to nitpick.yaml
- You changed credentials
What it does: Lists existing auth state files, validates each one with a probe, refreshes expired ones by re-running the login flow.
/qa-coverage-audit
See what’s tested, what’s stale, and what’s missing.
What to type:
/qa-coverage-audit
Needs:
- knowledge-base/page-graph.json
- KB with run history
Output: Every page bucketed as: healthy, stale (not tested recently), failing, untested, or model-less. Per-role coverage breakdown. Remedial actions ranked by severity.
/qa-prioritize
Rank a list of pages by risk when you can’t test everything.
What to type:
/qa-prioritize ← ranks all pages in scope
/qa-prioritize /tasks/new /billing /dashboard
/qa-prioritize --top 5 ← top 5 only
Needs:
- KB with run history (flake rate, bug history, staleness)
- nitpick.yaml with critical_pages
Scoring factors: Critical flag, open bugs, flake rate, time since last test, recent code changes.
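One plausible shape for that scoring is a weighted sum over the five factors. The weights and staleness cap below are purely illustrative assumptions — the point is that each factor contributes independently, so a critical, recently changed page with open bugs floats to the top:

```python
def risk_score(critical: bool, open_bugs: int, flake_rate: float,
               days_since_test: int, recently_changed: bool) -> float:
    """Weighted sum over the five scoring factors; weights are illustrative."""
    score = 0.0
    score += 3.0 if critical else 0.0
    score += 1.5 * open_bugs
    score += 2.0 * flake_rate                  # flake_rate in 0.0-1.0
    score += min(days_since_test, 30) / 10.0   # staleness, capped at 30 days
    score += 2.5 if recently_changed else 0.0
    return round(score, 2)
```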
Output: Ordered list with score and reason per page. Recommended chain commands.
/qa-resume
Continue a run that was interrupted.
What to type:
/qa-resume ← resumes latest interrupted run
/qa-resume 2026-05-11T08-32 ← specific run ID
Needs:
- An interrupted run in runs/ with job-state.json
- The same project dir and auth state
What it skips: Pages already marked passed. Zero LLM cost for re-skipped pages.
/qa-weekly-report
7-day trend digest.
What to type:
/qa-weekly-report
/qa-weekly-report --days 14 ← two-week window
Needs:
- At least a few runs in runs/ within the window
- KB with bug history
Output: Pass rate trend, flake rate trend, bug flow (opened vs closed), cost trend, pages ranked by volatility. Saved to disk for sharing.
/qa-careful
Safety wrapper for testing in sensitive environments.
What to type:
/qa-careful /dashboard ← smoke this page with extra guards
Needs:
- nitpick.yaml with project.environment_tag set, or a production URL
What it changes: Expands the terminal-action denylist, downgrades KB to read-only, asks explicit confirmation before any page that matches a production URL pattern. Scope is limited to smoke / iterative / targeted — never full.
/qa-roadmap
Strategic 30/60/90-day QA plan.
What to type:
/qa-roadmap
Needs:
- KB with at least a few weeks of run history
- scope.critical_pages and ideally flows[] configured
Output: Config gaps, coverage gaps, and a 30/60/90-day action plan bucketed by horizon. Saved to disk.