Running in CI

Non-interactive mode, GitHub Actions, scheduled runs.

Nitpick in CI is headless. No prompts, no stdin. Use --non-interactive for all scheduled or PR-triggered runs.

Non-interactive flags

nitpick run --scope full --non-interactive

What changes:

Scope decision uses inference directly (no confirmation prompt)
Phase 3.5 auto-approves every page
Errors fail the run instead of asking "retry?"

Trade-off: this skips the human-in-the-loop (HITL) safety nets. Run interactively against new apps first to catch misconfigurations.

GitHub Actions (nightly full regression)

# .github/workflows/qa-nightly.yml
name: QA Nightly

on:
  schedule:
    - cron: "0 2 * * *"
  workflow_dispatch:

jobs:
  qa:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: vaibhav-kindflow/nitpick
          path: nitpick

      - uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install Nitpick
        run: |
          cd nitpick
          npm ci
          npx playwright install --with-deps chromium
          npm link

      - name: Configure
        run: cp .github/nitpick/nitpick.yaml nitpick/nitpick.yaml
        env:
          TEST_ADMIN_USERNAME: ${{ secrets.TEST_ADMIN_USERNAME }}
          TEST_ADMIN_PASSWORD: ${{ secrets.TEST_ADMIN_PASSWORD }}

      - name: Cache KB
        uses: actions/cache@v4
        with:
          path: nitpick/knowledge-base
          key: nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}

      - name: Run full QA
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          cd nitpick
          nitpick run --scope full --non-interactive

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: qa-report
          path: nitpick/runs/*/reports/
          retention-days: 30

Caching the knowledge base

The KB is what makes second runs fast. Cache it between CI runs:

- uses: actions/cache@v4
  with:
    path: nitpick/knowledge-base
    key: nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}
    restore-keys: |
      nitpick-kb-

First run: crawls from scratch and populates the KB. Subsequent runs: restore the cache, skip the crawl, and benefit from the known-flaky registry and bug deduplication.
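One caveat with the snippet above: actions/cache does not overwrite an existing entry when the primary key hits exactly, so with a key derived only from nitpick.yaml, KB updates made after the first run are not saved back. A minimal sketch of one common workaround, assuming you want every run to persist its updated KB, is to make the primary key unique per run and rely on restore-keys for the lookup:

```yaml
- name: Cache KB
  uses: actions/cache@v4
  with:
    path: nitpick/knowledge-base
    # Unique per run, so the post-job step always saves the updated KB.
    key: nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}-${{ github.run_id }}
    # Prefix match: prefer the newest KB built with the same config,
    # then fall back to any earlier KB.
    restore-keys: |
      nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}-
      nitpick-kb-
```

This stores one cache entry per run. GitHub evicts old entries on its own, but a large KB may still be worth watching against the repository's cache quota.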

Scheduled scope strategy

Frequency	Scope	Use case
On every PR	`iterative --change <PR title>`	Fast feedback per change
Every hour	`smoke`	Cheap, catches pipeline breaks
Every night	`full`	Regression, uses cached KB
Weekly	`full` + KB reset	Validates crawler coverage, catches silent drift
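The PR row of the table can be sketched as a separate workflow. This is an illustration under assumptions: the file name is hypothetical, the flag combination `--scope iterative --change` is inferred from the table rather than confirmed, and the Configure and Cache KB steps from the nightly workflow are omitted for brevity.

```yaml
# .github/workflows/qa-pr.yml (hypothetical file name)
name: QA PR

on:
  pull_request:

jobs:
  qa:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: vaibhav-kindflow/nitpick
          path: nitpick

      - uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install Nitpick
        run: |
          cd nitpick
          npm ci
          npx playwright install --with-deps chromium
          npm link

      - name: Run iterative QA
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          # Pass the title through an environment variable instead of
          # interpolating it into the script, so a crafted PR title
          # cannot inject shell commands.
          PR_TITLE: ${{ github.event.pull_request.title }}
        run: |
          cd nitpick
          # Assumed flag combination, inferred from the table above.
          nitpick run --scope iterative --change "$PR_TITLE" --non-interactive
```

Secrets are not passed to workflows triggered by pull requests from forks, so this sketch only works for branches within the same repository.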

Cost control in CI

Runaway runs are the #1 CI cost risk. Hard-cap:

llm:
  max_iterations: 50        # Safer than the default of 100 for CI
  max_dollars_per_run: 100  # Planned for a future release: hard abort when exceeded
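Since the dollar cap is not yet enforced, a job-level timeout on the GitHub Actions side can serve as a blunt backstop. The fragment below is a sketch; the 60-minute limit and the concurrency group name are illustrative values, not Nitpick recommendations.

```yaml
jobs:
  qa:
    runs-on: ubuntu-latest
    # GitHub cancels the job after this many minutes (default is 360).
    timeout-minutes: 60
    # Allow only one run per workflow and ref at a time; a newer run
    # cancels the one still in progress.
    concurrency:
      group: nitpick-${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true
```

For nightly regressions, setting cancel-in-progress to false may be preferable, so that a manual dispatch does not discard a run that is already under way.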

Alert early: push token usage to your metrics system.