Running in CI
Non-interactive mode, GitHub Actions, scheduled runs.
Nitpick in CI is headless. No prompts, no stdin. Use --non-interactive for all scheduled or PR-triggered runs.
Non-interactive flags
nitpick run --scope full --non-interactive
What changes:
- Scope decision uses inference directly (no confirmation prompt)
- Phase 3.5 auto-approves every page
- Errors fail the run instead of asking”retry?”
Trade-off: skips the HITL safety nets. Run interactively against new apps first to catch misconfigurations.
GitHub Actions (nightly full regression)
# .github/workflows/qa-nightly.yml
name: QA Nightly
on:
schedule:
- cron:"0 2 * * *"
workflow_dispatch:
jobs:
qa:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
repository: vaibhav-kindflow/nitpick
path: nitpick
- uses: actions/setup-node@v4
with:
node-version:"20"
- name: Install Nitpick
run: |
cd nitpick
npm ci
npx playwright install --with-deps chromium
npm link
- name: Configure
run: cp .github/nitpick/nitpick.yaml nitpick/nitpick.yaml
env:
TEST_ADMIN_USERNAME: ${{ secrets.TEST_ADMIN_USERNAME }}
TEST_ADMIN_PASSWORD: ${{ secrets.TEST_ADMIN_PASSWORD }}
- name: Cache KB
uses: actions/cache@v4
with:
path: nitpick/knowledge-base
key: nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}
- name: Run full QA
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
run: |
cd nitpick
nitpick run --scope full --non-interactive
- name: Upload report
if: always()
uses: actions/upload-artifact@v4
with:
name: qa-report
path: nitpick/runs/*/reports/
retention-days: 30
Caching the knowledge base
The KB is what makes second runs fast. Cache it between CI runs:
- uses: actions/cache@v4
with:
path: nitpick/knowledge-base
key: nitpick-kb-${{ hashFiles('**/nitpick.yaml') }}
restore-keys: |
nitpick-kb-
First run: crawls from scratch, populates KB. Subsequent runs: restore cache, skip crawl, benefit from known flaky registry + bug dedupe.
Scheduled scope strategy
| Frequency | Scope | Use case |
|---|---|---|
| On every PR | iterative --change <PR title> | Fast feedback per change |
| Every hour | smoke | Cheap, catches pipeline breaks |
| Every night | full | Regression, uses cached KB |
| Weekly | full + KB reset | Validates crawler coverage, catches silent drift |
Cost control in CI
Runaway runs are the #1 CI cost risk. Hard-cap:
llm:
max_iterations: 50 # Safer than default 100 for CI
max_dollars_per_run: 100 # future release hard abort when exceeded
Alert early: push token usage to your metrics system.