CI¶
In this guide: run your LLM-touching tests on every push without an API key and without hitting the network.
The core idea¶
Once a cassette exists in your repo, CI should refuse to make real API calls. pytest --reel-mode replay enforces that: a missing entry becomes a loud test failure instead of a silent (and billable) upstream call.
You commit cassettes alongside the tests that produced them. CI never needs an API key.
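To make the replay-only behaviour concrete, here is a minimal sketch of a lookup that fails on a miss instead of falling through to the network. It is not the tool's actual implementation: the cassette layout (one JSON object per line with "request" and "response" keys) and the names CassetteMiss, request_key, load_cassette, and replay are assumptions made for illustration.

```python
import json


class CassetteMiss(LookupError):
    """Raised when replay mode sees a request with no recorded entry."""


def request_key(request: dict) -> str:
    # Canonical JSON, so key order in the request does not affect matching.
    return json.dumps(request, sort_keys=True, separators=(",", ":"))


def load_cassette(lines):
    """Build a request -> response index from JSONL lines (assumed layout)."""
    index = {}
    for line in lines:
        if line.strip():  # skip blank lines
            entry = json.loads(line)
            index[request_key(entry["request"])] = entry["response"]
    return index


def replay(index: dict, request: dict) -> dict:
    """Serve a recorded response, or fail loudly; never call upstream."""
    key = request_key(request)
    try:
        return index[key]
    except KeyError:
        raise CassetteMiss(f"no recorded entry for request: {key}") from None
```

The point of the design is that the miss path raises rather than forwarding the request, so an unrecorded call shows up as a failed test rather than as a charge on your API bill.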
GitHub Actions¶
# .github/workflows/test.yml
name: Test
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install uv
        uses: astral-sh/setup-uv@v3
        with:
          enable-cache: true
      - name: Set up Python
        run: uv python install 3.12
      - name: Install dependencies
        run: uv sync --frozen
      - name: Run tests (replay only, no network)
        run: uv run pytest --reel-mode replay
No OPENAI_API_KEY, no ANTHROPIC_API_KEY. The proxy serves every recorded call from disk.
GitLab CI¶
# .gitlab-ci.yml
test:
  image: python:3.12
  before_script:
    - pip install uv
    - uv sync --frozen
  script:
    - uv run pytest --reel-mode replay
CircleCI¶
# .circleci/config.yml
version: 2.1
jobs:
  test:
    docker:
      - image: cimg/python:3.12
    steps:
      - checkout
      - run: pip install uv && uv sync --frozen
      - run: uv run pytest --reel-mode replay
workflows:
  test:
    jobs:
      - test
Cassettes in git¶
Commit cassettes the same way you commit fixtures:
tests/
├── cassettes/
│   ├── test_chat/
│   │   ├── test_summarize.jsonl
│   │   └── test_translate.jsonl
│   └── test_tools/
│       └── test_function_calling.jsonl
└── test_chat.py
The repo-local pre-commit hook (hooks/pre-commit-cassette-check.py) refuses any *.jsonl containing a detectable secret pattern. Enable it once:
See Redaction for what gets scrubbed and how.
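As a rough illustration of what "a detectable secret pattern" can mean, the sketch below scans cassette text with a couple of regular expressions. The patterns and the find_secrets name are hypothetical; the real hook script may check different or additional patterns.

```python
import re

# Illustrative patterns only; a real hook would likely cover more providers.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),            # key-like tokens with an "sk-" prefix
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]{20,}"),  # bearer tokens in recorded headers
]


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) for every suspicious match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in SECRET_PATTERNS:
            for match in pattern.finditer(line):
                hits.append((lineno, match.group(0)))
    return hits
```

A hook built on this idea would run the scan over each staged *.jsonl file and exit non-zero when the list of hits is not empty, which is what blocks the commit.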
When a cassette miss is correct behaviour¶
You changed a prompt. The captured request no longer matches what your code now sends. The replay misses and the test fails — exactly right. Re-record:
Then commit the updated cassette. CI replays the new bytes.
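When a miss is unexpected, it helps to see which part of the request changed before re-recording. The helper below is a hypothetical debugging aid, not part of the tool: it assumes the recorded and current requests are both available as JSON-style dicts and reports the dotted paths whose values differ.

```python
def diff_requests(recorded: dict, current: dict, prefix: str = "") -> list[str]:
    """List dotted paths whose values differ between two request bodies.

    Nested dicts are compared key by key; lists and scalars are compared
    as whole values. A missing key and an explicit None are treated alike.
    """
    changed = []
    for key in sorted(set(recorded) | set(current)):
        path = f"{prefix}{key}"
        old, new = recorded.get(key), current.get(key)
        if isinstance(old, dict) and isinstance(new, dict):
            changed.extend(diff_requests(old, new, prefix=f"{path}."))
        elif old != new:
            changed.append(path)
    return changed
```

If the only path reported is the prompt you just edited, the miss is the expected one and re-recording is the right fix; any other path points to an unintended change in what your code sends.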
Tightening further¶
- Run make check in CI: lint + types + tests, all replay-only.
- Scan cassettes for secrets as a separate CI step using the same hook script (hooks/pre-commit-cassette-check.py).
- Pin upstream URLs with --upstream if you front the API with a custom gateway.