Human-in-the-loop

Where and why Nitpick pauses for your input, and how to run non-interactively.

Nitpick has two mandatory human-in-the-loop checkpoints. They exist because certain decisions are cheap for you and expensive for an agent to guess.

Checkpoint 1: Scope confirmation

When you run:

nitpick run --scope iterative --change "I modified the task form"

Senior QA proposes which pages it thinks are affected and asks you to confirm:

─── Change impact proposal ──────────────────────
Based on your description, I think the change affects:
 1. tasks-new (high) "Direct target"
 2. tasks-edit (medium) "Same form component likely reused"

Is this correct?
 > 1. Yes, proceed with this scope
 2. Add more pages
 3. Remove some pages
 4. Replace with a specific list
 5. Switch to full testing

Why: the agent’s inference is good but not perfect. You know the change intimately, so you can correct the proposal in seconds; the agent might not catch its own omission even after hours of testing.

This checkpoint is skipped for --scope full, --scope smoke, and --scope targeted, because those scopes are unambiguous.
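The five answers in the prompt above can be read as simple operations on the proposed page list. The following is only an illustrative sketch of that idea, not Nitpick's actual code: the names ProposedPage and apply_scope_choice, the choice keywords, and the page id "tasks-list" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedPage:
    # Hypothetical record for one line of the change impact proposal.
    page_id: str
    confidence: str  # e.g. "high" or "medium"
    reason: str


def apply_scope_choice(proposed, choice, pages=()):
    """Return (scope, page_ids) after applying the user's answer.

    `choice` is a hypothetical keyword for the five menu options:
    "yes", "add", "remove", "replace", or "full".
    """
    current = [p.page_id for p in proposed]
    if choice == "yes":
        return ("iterative", current)
    if choice == "add":
        # Append only pages that are not already in the proposal.
        return ("iterative", current + [p for p in pages if p not in current])
    if choice == "remove":
        return ("iterative", [p for p in current if p not in pages])
    if choice == "replace":
        return ("iterative", list(pages))
    if choice == "full":
        # Switching to full testing makes the page list irrelevant.
        return ("full", [])
    raise ValueError(f"unknown choice: {choice!r}")


proposal = [
    ProposedPage("tasks-new", "high", "Direct target"),
    ProposedPage("tasks-edit", "medium", "Same form component likely reused"),
]

# "tasks-list" is a made-up page id standing in for a page the agent missed.
print(apply_scope_choice(proposal, "add", ["tasks-list"]))
# ('iterative', ['tasks-new', 'tasks-edit', 'tasks-list'])
```

The point of the sketch is that every correction is a cheap list edit for you, which is why the checkpoint sits before any testing starts.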

Checkpoint 2: Page Understanding approval (Phase 3.5)

Before generating tests for a page, Junior QA presents what it learned. You approve, add context (“email must reject + signs; backend limit”), or reject.

Your context gets merged into the Derived UI Model as user_business_rules[], user_test_data_hints[], and user_focus_areas[].
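As a rough illustration of that merge, the sketch below treats the Derived UI Model as a plain dictionary and appends user context to the three fields named above. The dictionary shape, the "page" key, and the function merge_user_context are assumptions for this example; the real model format may differ.

```python
def merge_user_context(model, business_rules=(), test_data_hints=(), focus_areas=()):
    """Return a copy of the model with user context appended, skipping duplicates."""
    merged = dict(model)
    additions = {
        "user_business_rules": business_rules,
        "user_test_data_hints": test_data_hints,
        "user_focus_areas": focus_areas,
    }
    for key, new_items in additions.items():
        # Copy the existing list so the caller's model is not mutated.
        existing = list(model.get(key, []))
        for item in new_items:
            if item not in existing:
                existing.append(item)
        merged[key] = existing
    return merged


# Hypothetical model for one page; only the user_* field names come from the text.
model = {"page": "tasks-new", "user_business_rules": []}
updated = merge_user_context(
    model,
    business_rules=["email must reject + signs"],
    focus_areas=["email validation"],
)
print(updated["user_business_rules"])  # ['email must reject + signs']
```

Appending rather than overwriting reflects the idea that context you add in one review should remain available to later runs.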

Non-interactive mode

For CI and scheduled runs:

nitpick run --scope full --non-interactive

What changes:

  • Scope decision uses the agent’s inference without prompting
  • Phase 3.5 auto-approves (treats the TL;DR as approved)
  • Errors fail loudly instead of asking “do you want to retry?”

Trade-off: non-interactive mode trusts the agent more. If you’re running against a new app or major refactor, stay interactive for the first run.
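The three behavior changes listed above follow one pattern: wherever interactive mode would ask, non-interactive mode takes the agent's default or raises. A minimal sketch of that pattern, assuming hypothetical helpers resolve_checkpoint and run_step that are not part of Nitpick's documented interface:

```python
def resolve_checkpoint(agent_default, interactive, ask=None):
    """Return the user's answer when interactive, else the agent's default."""
    if interactive and ask is not None:
        return ask(agent_default)
    # Non-interactive: trust the agent's inference without prompting.
    return agent_default


def run_step(step, interactive, confirm_retry=None):
    """Run a step; offer a single retry only when a human is present."""
    try:
        return step()
    except Exception as exc:
        if interactive and confirm_retry is not None and confirm_retry(exc):
            return step()  # one retry, approved by the user
        raise  # non-interactive: fail loudly so CI marks the run as failed


scope = resolve_checkpoint(["tasks-new", "tasks-edit"], interactive=False)
print(scope)  # ['tasks-new', 'tasks-edit']
```

Re-raising instead of swallowing the error is what lets a CI job or scheduler see the failure, rather than hanging on a prompt nobody will answer.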

Terminal guards: the upfront prompt

Before any exploration, Nitpick asks:

“Before I start exploring, are there any actions on this page I should NEVER click? Examples: ‘Delete account’, ‘Submit final application’, ‘Cancel subscription’.”

You answer once. The list is stored, reused across runs, and enforced in every phase.
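One way to picture “stored, reused across runs, and enforced” is a small persisted list that every phase consults before clicking. The sketch below is an assumption-heavy illustration: the file name never_click.json, the JSON format, and the case-insensitive exact matching are all invented for this example and say nothing about how Nitpick actually stores or matches guards.

```python
import json
import tempfile
from pathlib import Path


def normalize(label):
    """Collapse whitespace and ignore case so small label variations still match."""
    return " ".join(label.split()).casefold()


def save_guards(path, labels):
    """Persist the never-click list so later runs can reuse it."""
    unique = sorted({normalize(label) for label in labels})
    path.write_text(json.dumps(unique), encoding="utf-8")


def load_guards(path):
    """Load the stored list, or an empty set if the user has not answered yet."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def is_blocked(label, guards):
    """Check an action label before any phase is allowed to click it."""
    return normalize(label) in guards


with tempfile.TemporaryDirectory() as tmp:
    guard_file = Path(tmp) / "never_click.json"  # hypothetical storage location
    save_guards(guard_file, ["Delete account", "Cancel subscription"])
    guards = load_guards(guard_file)

print(is_blocked("  delete   ACCOUNT ", guards))  # True
print(is_blocked("Save draft", guards))  # False
```

Exact matching after normalization is the simplest choice for a sketch; a real guard would likely need to be more conservative, since a missed match here is the expensive kind of error.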

Why HITL at all?

Two reasons:

  1. Catastrophic consequence asymmetry. If the agent misses a destructive action, damage is real. If you confirm in 3 seconds, damage is zero.
  2. Domain knowledge transfer. The agent cannot read your Notion, your Slack, or your head. Phase 3.5 is the moment you inject business rules into the model, and those rules persist.

You can make Nitpick fully autonomous. You probably shouldn’t for critical production apps.