Testing Pipeline

Optimize test execution in CI/CD.

Test Strategy

Test Pyramid in CI

        ┌─────────┐
        │  E2E    │  Slow, run on deploy
        │  Tests  │
       ─┴─────────┴─
      ┌─────────────┐
      │ Integration │  Medium, run on PR
      │   Tests     │
     ─┴─────────────┴─
    ┌─────────────────┐
    │   Unit Tests    │  Fast, run always
    └─────────────────┘

When to Run What

| Test Type | Trigger | Time Budget |
|-------------------|------------------|-------------|
| Unit tests | Every push | < 2 min |
| Integration tests | PR, main | < 10 min |
| E2E tests | Before deploy | < 20 min |
| Load tests | Scheduled/manual | 30+ min |

Parallel Test Execution

Split by Directory

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4

      - name: Run tests
        run: |
          # Calculate test files for this shard
          all_tests=$(find tests -name "test_*.py" | sort)
          total=$(echo "$all_tests" | wc -l)
          per_shard=$(( (total + 3) / 4 ))  # ceiling division: no file is dropped

          # Get the line range of files for this shard
          start=$(( (${{ matrix.shard }} - 1) * per_shard + 1 ))
          tests=$(echo "$all_tests" | sed -n "${start},$((start + per_shard - 1))p")

          # Guard the empty case: pytest with no arguments would run everything
          if [ -n "$tests" ]; then
            pytest $tests
          else
            echo "No tests for shard ${{ matrix.shard }}"
          fi
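
The shell arithmetic above amounts to taking a contiguous slice of the sorted file list. A minimal Python sketch of the same idea (hypothetical `shard_files` helper; ceiling division keeps trailing shards from getting an empty list, which would otherwise make pytest run everything):

```python
def shard_files(files, shard, total_shards):
    # Ceiling division: every file lands in exactly one shard, and
    # trailing shards simply get fewer (possibly zero) files
    per_shard = -(-len(files) // total_shards)
    start = (shard - 1) * per_shard
    return files[start:start + per_shard]

files = sorted(f"tests/test_{i:02d}.py" for i in range(10))
shards = [shard_files(files, s, 4) for s in (1, 2, 3, 4)]
```

With 10 files and 4 shards, shards 1-3 get 3 files each and shard 4 gets the remaining one.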

pytest-split Plugin

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        group: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4

      - name: Run tests
        run: |
          pip install pytest-split
          pytest --splits 4 --group ${{ matrix.group }}

Test Duration Balancing

# First, collect test durations
- name: Run tests with timing
  run: pytest --store-durations

# Upload durations file
- uses: actions/upload-artifact@v4
  with:
    name: test-durations
    path: .test_durations

# Later runs use durations for balancing
- uses: actions/download-artifact@v4
  with:
    name: test-durations

- run: pytest --splits 4 --group ${{ matrix.group }} --durations-path .test_durations
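
Duration-based grouping is essentially a bin-packing problem. A sketch of the greedy longest-processing-time idea (hypothetical `balance_by_duration` helper; not necessarily pytest-split's exact algorithm):

```python
import heapq

def balance_by_duration(durations, groups=4):
    # Greedy longest-processing-time: assign each test, slowest first,
    # to whichever group currently has the least total runtime
    heap = [(0.0, g, []) for g in range(groups)]
    heapq.heapify(heap)
    for test, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, g, tests = heapq.heappop(heap)
        tests.append(test)
        heapq.heappush(heap, (load + seconds, g, tests))
    # Return groups in stable (index) order
    return [tests for _, _, tests in sorted(heap, key=lambda entry: entry[1])]

durations = {"test_slow": 30.0, "test_a": 5.0, "test_b": 5.0,
             "test_c": 4.0, "test_d": 4.0, "test_e": 2.0}
groups = balance_by_duration(durations, groups=2)
```

Here one 30-second test ends up alone in a group while the five quick tests (20 seconds total) share the other, rather than a naive count-based split putting three slow-ish tests together.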

Flaky Test Management

Retry Flaky Tests

- name: Run tests with retry
  uses: nick-fields/retry@v2
  with:
    timeout_minutes: 10
    max_attempts: 3
    retry_on: error
    command: pytest tests/integration
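
The retry action above is just a bounded re-invocation loop. The same pattern in Python, useful for retrying a flaky setup step from inside a suite (hypothetical `retry` helper; `flaky_setup` simulates a transient failure):

```python
import time

def retry(fn, attempts=3, delay=1.0):
    # Call fn up to `attempts` times; re-raise the final failure so a
    # genuinely broken operation still fails the build
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)

calls = {"count": 0}

def flaky_setup():
    # Simulated operation that fails twice, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient network error")
    return "ok"

result = retry(flaky_setup, attempts=3, delay=0.0)
```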

Quarantine Flaky Tests

# conftest.py
import pytest

# Register the marker so quarantined tests can be deselected
def pytest_configure(config):
    config.addinivalue_line(
        "markers", 'flaky: mark test as flaky (deselect with -m "not flaky")'
    )

# In a test module -- reruns require the pytest-rerunfailures plugin
@pytest.mark.flaky(reruns=3)
def test_external_api():
    """Test that sometimes fails due to network."""
    pass

jobs:
  stable-tests:
    runs-on: ubuntu-latest
    steps:
      - run: pytest -m "not flaky"

  flaky-tests:
    runs-on: ubuntu-latest
    continue-on-error: true  # Don't fail build
    steps:
      - run: pytest -m flaky --reruns 3

Track Flaky Tests

# pytest plugin (e.g. in conftest.py) to track flakiness
import json
from datetime import datetime, timezone

def pytest_runtest_makereport(item, call):
    # Log only failures from the test body, not setup/teardown errors
    if call.when == "call" and call.excinfo is not None:
        flaky_log = {
            "test": item.nodeid,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "error": str(call.excinfo.value),
        }
        with open("flaky_tests.jsonl", "a") as f:
            f.write(json.dumps(flaky_log) + "\n")
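
Once failures accumulate in flaky_tests.jsonl, a few lines can rank the worst offenders for triage (hypothetical `flakiest_tests` helper):

```python
import json
from collections import Counter

def flakiest_tests(path="flaky_tests.jsonl", top=5):
    # Count recorded failures per test node id, most frequent first
    counts = Counter()
    with open(path) as f:
        for line in f:
            counts[json.loads(line)["test"]] += 1
    return counts.most_common(top)
```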

Test Services

Database Per Job

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Run tests
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres
        run: pytest

Redis and Other Services

services:
  redis:
    image: redis:7-alpine
    ports:
      - 6379:6379
    options: >-
      --health-cmd "redis-cli ping"
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5

  minio:
    image: minio/minio
    ports:
      - 9000:9000
    env:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
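
Even with container health checks, a test process can start before a service accepts connections. A small readiness probe that polls until a TCP connect succeeds (hypothetical `wait_for_port` helper):

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0):
    # Poll until a TCP connection succeeds or the deadline passes
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False
```

Calling `wait_for_port("localhost", 5432)` before the first query avoids spurious connection-refused failures on slow runners.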

Coverage Reporting

Collect Coverage

- name: Run tests with coverage
  run: |
    pytest --cov=src --cov-report=xml --cov-report=html

- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
  with:
    files: coverage.xml
    fail_ci_if_error: true

- name: Upload coverage artifact
  uses: actions/upload-artifact@v4
  with:
    name: coverage-report
    path: htmlcov/

Coverage Gate

- name: Check coverage threshold
  run: |
    coverage report --fail-under=80
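
The same gate can also be enforced by reading the Cobertura-style XML that pytest-cov emits: the root element carries overall line coverage as a 0..1 `line-rate` attribute (hypothetical `coverage_percent` helper):

```python
import xml.etree.ElementTree as ET

def coverage_percent(xml_path):
    # Cobertura-format reports store overall line coverage as a
    # fractional "line-rate" attribute on the root <coverage> element
    root = ET.parse(xml_path).getroot()
    return float(root.get("line-rate")) * 100.0
```

`coverage report --fail-under` remains the simpler choice when the coverage CLI is available; parsing the XML is mainly useful in a separate job that only has the report artifact.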

Merge Coverage from Parallel Jobs

jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2, 3]
    steps:
      - run: pytest --cov=src
        env:
          # `coverage combine` merges raw .coverage data files, not XML
          # reports, so give each shard its own data file
          COVERAGE_FILE: .coverage.${{ matrix.shard }}

      - uses: actions/upload-artifact@v4
        with:
          name: coverage-${{ matrix.shard }}
          path: .coverage.${{ matrix.shard }}
          include-hidden-files: true

  coverage:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: coverage-*
          merge-multiple: true

      - name: Merge coverage
        run: |
          pip install coverage
          coverage combine .coverage.*
          coverage xml -o coverage.xml

      - uses: codecov/codecov-action@v4

Test Result Reporting

JUnit XML Reports

- name: Run tests
  run: pytest --junitxml=test-results.xml

- name: Publish test results
  uses: mikepenz/action-junit-report@v4
  if: always()
  with:
    report_paths: test-results.xml
    fail_on_failure: true

Test Summary

- name: Test Summary
  if: always()
  uses: test-summary/action@v2
  with:
    paths: test-results.xml

Frontend Testing

Vitest

frontend-test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4

    - uses: oven-sh/setup-bun@v2

    - uses: actions/setup-node@v4
      with:
        node-version: '20'

    - run: bun install
    - run: bun run test:ci

    - name: Upload coverage
      uses: codecov/codecov-action@v4
      with:
        files: coverage/coverage-final.json

Playwright E2E

e2e-test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4

    - uses: oven-sh/setup-bun@v2

    - uses: actions/setup-node@v4
      with:
        node-version: '20'

    - name: Install Playwright
      run: |
        bun install
        bunx playwright install --with-deps

    - name: Run E2E tests
      run: bunx playwright test

    - name: Upload report
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: playwright-report
        path: playwright-report/

Playwright Sharding

e2e-test:
  runs-on: ubuntu-latest
  strategy:
    matrix:
      shard: [1, 2, 3, 4]
  steps:
    - name: Run Playwright
      run: bunx playwright test --shard=${{ matrix.shard }}/4

Test Optimization

Only Run Affected Tests

- name: Get changed files
  id: changed
  uses: tj-actions/changed-files@v42
  with:
    files: |
      src/**/*.py
      tests/**/*.py

- name: Run affected tests
  if: steps.changed.outputs.any_changed == 'true'
  run: |
    # The output is a space-separated list; turn it into one pattern
    # per line so grep matches each changed file against collected ids
    changed_files="${{ steps.changed.outputs.all_changed_files }}"
    tests=$(pytest --collect-only -q | grep -Ff <(tr ' ' '\n' <<<"$changed_files") || true)
    if [ -n "$tests" ]; then
      echo "$tests" | xargs pytest
    else
      echo "No affected tests to run"
    fi
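
A convention-based mapping from source files to test files is a simpler, more predictable alternative to grepping collected node ids (hypothetical `related_tests` helper; real change-based selection is better served by dependency-tracking tools such as pytest-testmon):

```python
from pathlib import Path

def related_tests(changed_paths, tests_dir="tests"):
    # Naive name convention: src/pkg/parser.py -> tests/**/test_parser.py
    matches = set()
    for path in changed_paths:
        stem = Path(path).stem
        name = stem if stem.startswith("test_") else f"test_{stem}"
        matches.update(str(p) for p in Path(tests_dir).rglob(f"{name}.py"))
    return sorted(matches)
```

This finds nothing for files that break the naming convention, so a safe fallback is to run the full suite whenever `related_tests` returns an empty list.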

Skip Tests on Documentation Changes

on:
  push:
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - '.github/ISSUE_TEMPLATE/**'

Cache Test Database

- name: Cache test database
  uses: actions/cache@v4
  with:
    path: .test-db
    key: test-db-${{ hashFiles('**/migrations/**') }}

- name: Setup test database
  run: |
    mkdir -p .test-db
    if [ ! -f .test-db/initialized ]; then
      alembic upgrade head
      python scripts/seed_test_data.py
      touch .test-db/initialized
    fi

Complete Testing Workflow

name: Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: pip
      - run: pip install -e ".[test]"
      - run: pytest tests/unit --cov=src -n auto

  integration-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - run: pip install -e ".[test]"
      - run: pytest tests/integration
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/postgres

  e2e-tests:
    needs: [unit-tests, integration-tests]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v2
      - run: bun install
      - run: bunx playwright install --with-deps
      - run: bunx playwright test

Best Practices Summary

| Practice | Benefit |
|--------------------|------------------------|
| Parallelize tests | Faster feedback |
| Use test services | Consistent environment |
| Track flaky tests | Improve reliability |
| Cache dependencies | Reduce build time |
| Report coverage | Maintain quality |
| Run affected tests | Save resources |
| Shard E2E tests | Speed up slow tests |