
Generation Pipeline

This document describes how generation requests flow through the Sartiq platform, from user configuration through AI processing to final delivery.


Overview

The generation pipeline transforms creative inputs into professional imagery:

```mermaid
flowchart TB
    subgraph Inputs ["Inputs"]
        direction LR
        P[Product] ~~~ S[Subject]
        ST[Style] ~~~ G[Guidelines]
    end

    subgraph Pipeline ["Pipeline"]
        direction LR
        C[Configure] --> Q[Queue] --> O[Orchestrate]
        O --> AI[Generate] --> PP[Post-Process]
    end

    subgraph Output ["Output"]
        direction LR
        I[Image] ~~~ M[Metadata]
    end

    Inputs --> Pipeline --> Output
```

Key concepts:

| Concept | Description |
| --- | --- |
| Generation | A task that produces one or more images |
| Prediction | A single image output from a generation |
| Strategy | The type of task (generate, edit, refine, video, etc.) |
| Task | A unit of work sent to the Compute Server |

Generation Types

Sartiq supports multiple generation types, each optimized for different use cases:

```mermaid
flowchart TB
    subgraph Standard ["Standard Generations"]
        BASE[Base Generation]
        AGENTIC[Agentic Generation]
    end

    subgraph Specialized ["Specialized Tasks"]
        EDIT[Editing]
        REFINE[Refinement]
        VIDEO[Video]
    end

    subgraph Enhancement ["Enhancement Tasks"]
        BG[Background Fix]
        FACE[Face Fixer]
        ADJ[Image Adjuster]
    end
```

| Type | Purpose | When to Use |
| --- | --- | --- |
| Base | Standard image generation | Most on-model imagery |
| Agentic | Generation with quality evaluation and retries | High-stakes outputs |
| Editing | Garment placement and compositing | Product swaps, try-on |
| Refine | Upscaling and detail enhancement | Final output preparation |
| Video | Generate video from images | Motion content |
| Background Fix | Fix or replace backgrounds | Post-processing |
| Face Fixer | Enhance facial details | Portrait refinement |
| Image Adjuster | General image adjustments | Color/exposure fixes |

For detailed information on each type, see Generation Types.


High-Level Flow

```mermaid
sequenceDiagram
    participant U as User
    participant W as Webapp
    participant B as Backend
    participant R as Redis
    participant C as Compute
    participant AI

    rect rgb(240, 248, 255)
        Note over U,B: 1 · Configure
        U->>W: Set generation options
        W->>B: POST /generations
        B->>B: Create Generation + Predictions
        B-->>W: Generation ID
    end

    rect rgb(240, 255, 240)
        Note over B,AI: 2 · Process
        B->>B: Select Strategy
        B->>C: Submit Task(s)
        C-->>B: Task ID(s)
        C->>AI: Execute generation
        AI-->>C: Result images
        C->>C: Convert to WebP + embed ICC (Display P3)
        C->>C: Save to compute/{type}s/{task_id}/
        C->>R: Emit task.completed
    end

    rect rgb(255, 250, 240)
        Note over U,C: 3 · Complete
        R-->>B: Event received
        B->>C: GET /tasks/{id}/result
        B->>B: Copy from compute/ to images/generations/
        B->>B: Update Predictions
        B-->>W: WebSocket update
        W-->>U: Display results
    end
```

Architecture

Strategy Pattern

The backend uses a Strategy Pattern to handle different generation types. Each strategy knows how to create the appropriate task for the Compute Server.

```mermaid
flowchart TB
    subgraph Routes ["API Routes"]
        R1[POST /generations]
        R2[POST /generations/edit]
        R3[POST /generations/refine]
        R4[POST /generations/video]
    end

    subgraph Workflow ["Workflow Service"]
        WF[GenerationWorkflowService]
    end

    subgraph Strategies ["Strategies"]
        S1[BaseGenerationStrategy]
        S2[EditingStrategy]
        S3[RefineStrategy]
        S4[VideoStrategy]
    end

    subgraph Compute ["Compute Server"]
        T1[GENERATION Task]
        T2[EDITING Task]
        T3[REFINE Task]
        T4[VIDEO Task]
    end

    R1 --> WF
    R2 --> WF
    R3 --> WF
    R4 --> WF

    WF --> S1 --> T1
    WF --> S2 --> T2
    WF --> S3 --> T3
    WF --> S4 --> T4
```
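The dispatch above can be sketched in Python. Class and function names here (`GenerationStrategy`, `build_task`, `select_strategy`) and the dict-based records are illustrative assumptions, not the actual Sartiq implementation:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Task:
    """A unit of work destined for the Compute Server."""
    task_type: str
    payload: dict


class GenerationStrategy(ABC):
    task_type: str

    @abstractmethod
    def build_task(self, generation: dict) -> Task:
        """Translate a Generation record into a Compute Server task."""


class BaseGenerationStrategy(GenerationStrategy):
    task_type = "GENERATION"

    def build_task(self, generation: dict) -> Task:
        return Task(self.task_type, {"prompt": generation["generated_prompt"]})


class RefineStrategy(GenerationStrategy):
    task_type = "REFINE"

    def build_task(self, generation: dict) -> Task:
        return Task(self.task_type, {"source": generation["result_image_url"]})


# The workflow service picks a strategy by the generation_strategy field.
STRATEGIES = {
    "base": BaseGenerationStrategy(),
    "refine": RefineStrategy(),
}


def select_strategy(generation_strategy: str) -> GenerationStrategy:
    return STRATEGIES[generation_strategy]
```

The value of the pattern is that routes and the workflow service never branch on the generation type; adding a new type means adding one strategy class and one registry entry.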

For detailed strategy documentation, see Generation Strategies.

Compute Server Tasks

The Compute Server supports multiple task types:

| Task Type | Description |
| --- | --- |
| GENERATION | Standard AI image generation |
| EDITING | Garment placement and compositing |
| REFINE | Upscaling and enhancement |
| VIDEO_GENERATION | Video from image |
| BACKGROUND_FIX | Background manipulation |
| FACE_ENHANCER | Facial detail enhancement |
| IMAGE_ADJUSTER | Color and exposure adjustments |
| DETAIL_ENHANCER | General detail enhancement |

For task details, see Compute Server Tasks.


Data Model

Generation Record

When a user starts a generation, the Backend creates:

```mermaid
erDiagram
    Generation ||--o{ Prediction : contains
    Generation {
        uuid id
        string status
        string generation_type
        string generation_strategy
        int batch_size
        string generated_prompt
    }
    Prediction {
        uuid id
        uuid generation_id
        string status
        string task_id
        string result_image_url
        int seed
    }
```

| Field | Description |
| --- | --- |
| generation_type | BASE or AGENTIC (orchestration mode) |
| generation_strategy | Which strategy/task type to use |
| batch_size | Number of predictions to create |
| generated_prompt | The prompt sent to AI |

Status Flow

```mermaid
stateDiagram-v2
    [*] --> PENDING: Created
    PENDING --> PROCESSING: Tasks submitted
    PROCESSING --> COMPLETED: All predictions done
    PROCESSING --> FAILED: Error occurred
    COMPLETED --> [*]
    FAILED --> [*]
```

Generation Status:

- PENDING — Created, waiting to start
- PROCESSING — Tasks running on Compute Server
- COMPLETED — All predictions finished
- FAILED — Generation failed

Prediction Status:

- PENDING → GENERATING → COMPLETED (or ERROR)
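The state diagram above can be encoded as a small transition guard. This is a sketch of the legal moves, not the actual status handling code:

```python
from enum import Enum


class GenerationStatus(str, Enum):
    PENDING = "PENDING"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Legal transitions from the state diagram; terminal states allow none.
TRANSITIONS = {
    GenerationStatus.PENDING: {GenerationStatus.PROCESSING},
    GenerationStatus.PROCESSING: {GenerationStatus.COMPLETED, GenerationStatus.FAILED},
    GenerationStatus.COMPLETED: set(),
    GenerationStatus.FAILED: set(),
}


def advance(current: GenerationStatus, new: GenerationStatus) -> GenerationStatus:
    """Reject any move the state diagram does not allow."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```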


Standard Generation

The most common flow for creating on-model imagery.

API Request

```http
POST /api/v1/generations
Content-Type: application/json

{
  "batch_size": 4,
  "subject_id": "uuid",
  "product_id": "uuid",
  "style_id": "uuid",
  "shot_type_string": "full_body",
  "auto_generate_prompt": true,
  "width": 1024,
  "height": 1024
}
```

Processing Steps

```mermaid
flowchart TB
    A[Receive Request] --> B[Validate Inputs]
    B --> C[Create Generation Record]
    C --> D[Create N Predictions]
    D --> E[Select Strategy]
    E --> F[Build Prompt]
    F --> G[Create Task per Prediction]
    G --> H[Submit to Compute Server]
    H --> I[Store Task IDs]
    I --> J[Emit WebSocket Event]
```

  1. Validate — Check subject, product, style belong to organization
  2. Create Records — Generation + N Predictions in database
  3. Build Prompt — Generate or use provided prompt
  4. Submit Tasks — One task per prediction to Compute Server
  5. Track — Store task_id on each prediction for result matching
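The five steps can be condensed into a hypothetical sketch. `start_generation`, the `prompt` key, and the in-memory dict records are illustrative stand-ins for the real workflow service and database models:

```python
import uuid


def start_generation(request: dict, submit_task) -> dict:
    """Condensed sketch of the five steps above. `submit_task` stands in
    for the Compute Server client and returns a task_id."""
    # 1. Validate — the real service also checks organization ownership.
    for key in ("subject_id", "product_id", "style_id"):
        if key not in request:
            raise ValueError(f"missing {key}")

    # 2. Create Records — Generation + N Predictions.
    generation = {"id": str(uuid.uuid4()), "status": "PENDING", "predictions": []}

    # 3. Build Prompt — generate one, or use a caller-provided prompt.
    prompt = request.get("prompt") or f"{request['shot_type_string']} shot"
    generation["generated_prompt"] = prompt

    # 4 & 5. Submit one task per prediction and store task_id for matching.
    for _ in range(request.get("batch_size", 1)):
        task_id = submit_task({"prompt": prompt})
        generation["predictions"].append(
            {"id": str(uuid.uuid4()), "status": "PENDING", "task_id": task_id}
        )

    generation["status"] = "PROCESSING"
    return generation
```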

Agentic Generation

Enhanced generation with quality evaluation and automatic retries.

```mermaid
flowchart TB
    subgraph Standard ["Standard Flow"]
        A[Create Generation]
        B[Submit Task]
        C[Get Result]
    end

    subgraph Agentic ["Agentic Additions"]
        D[Evaluate Quality]
        E{Acceptable?}
        F[Retry with Adjustments]
    end

    A --> B --> C --> D --> E
    E -->|Yes| G[Complete]
    E -->|No| F --> B
```

Additional features:

- max_attempts — Maximum retry count
- min_confidence_score — Quality threshold
- evaluation_provider — Which evaluator to use
- Creates OrchestratedShot records for orchestration metadata
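The evaluate-and-retry loop might look like this sketch. `agentic_generate`, `run_task`, and `evaluate` are hypothetical names, and the real adjustment logic is richer than the seed bump shown here:

```python
def agentic_generate(run_task, evaluate, max_attempts=3, min_confidence_score=0.8):
    """Run, score, and retry until the score clears the threshold or
    attempts are exhausted. `run_task` produces an image; `evaluate`
    returns a confidence score in [0, 1]."""
    adjustments = {}
    for attempt in range(1, max_attempts + 1):
        image = run_task(adjustments)
        score = evaluate(image)
        if score >= min_confidence_score:
            return image, score, attempt
        # Below threshold: adjust inputs (e.g. new seed, tweaked prompt)
        # and try again.
        adjustments = {"retry_of": attempt, "seed": attempt + 1}
    # Exhausted retries: return the last (best-effort) result.
    return image, score, max_attempts
```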


Event Communication

Redis Event Stream

The Compute Server emits events to Redis, which the Backend subscribes to:

```mermaid
flowchart LR
    subgraph Compute ["Compute Server"]
        Task[Task Execution]
    end

    subgraph Redis ["Redis"]
        Stream[(Event Stream)]
    end

    subgraph Backend ["Backend"]
        Listener[Event Listener]
        Handler[Update Predictions]
    end

    Task -->|emit| Stream
    Stream -->|subscribe| Listener
    Listener --> Handler
```

Event Types

| Event | Trigger | Backend Action |
| --- | --- | --- |
| task.started | Task begins | Update prediction status |
| task.progress | Progress update | Broadcast via WebSocket |
| task.completed | Task done | Fetch result, update prediction |
| task.failed | Task error | Mark prediction as ERROR |
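A minimal dispatcher mirroring the table above. In production the Backend consumes these events from a Redis stream; the dict-based prediction records and return value here are illustrative only:

```python
def handle_event(event: dict, predictions: dict) -> str:
    """Apply one task event to the matching prediction record and
    return the resulting prediction status."""
    pred = predictions[event["task_id"]]
    etype = event["type"]
    if etype == "task.started":
        pred["status"] = "GENERATING"
    elif etype == "task.progress":
        # No database change; progress is broadcast over WebSocket.
        pass
    elif etype == "task.completed":
        # The real handler fetches the result from the Compute Server
        # before marking the prediction done.
        pred["status"] = "COMPLETED"
    elif etype == "task.failed":
        pred["status"] = "ERROR"
    return pred["status"]
```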

Result Retrieval

When task.completed is received:

```http
GET /api/v1/tasks/{task_id}/result
```

Response includes:

- result_image_url — CDN URL to generated image
- seed — Random seed used
- metadata — Model info, timing, etc.

MediaResource Integration

Result images from the Compute Server are ingested into the MediaResource system:

Prediction / Refine results (relocate=True):

1. Backend receives the compute result path (e.g., compute/generations/{task_id}/result_0.webp)
2. Calls media_service.ingest_from_storage_key(prediction, "result_image", path, relocate=True)
3. File is copied from compute/ to canonical media/{resource_id}/file.{ext}
4. The intermediate compute/ path is queued for deferred deletion after commit
5. Old result_image_url field is also written for backward compatibility

Generation snapshots (relocate=False):

- Snapshot URLs (e.g., product_snapshot_url) reference shared files that may be used by multiple entities
- ingest_from_storage_key(generation, slot, path, relocate=False) keeps the file at its source path
- No deferred deletion — the file remains accessible to other references
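The relocate semantics can be illustrated over an in-memory dict standing in for object storage. This sketch mirrors, but is not, the real media_service method (in particular, deletion here is immediate rather than deferred to commit):

```python
def ingest_from_storage_key(storage: dict, resource_id: str, path: str, relocate: bool) -> str:
    """Copy a file to its canonical media path. With relocate=True the
    source is removed; with relocate=False it stays for other references."""
    ext = path.rsplit(".", 1)[-1]
    canonical = f"media/{resource_id}/file.{ext}"
    storage[canonical] = storage[path]
    if relocate:
        # The real system defers this deletion until after commit.
        del storage[path]
    return canonical
```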

See MediaResource Lifecycle for the full data model and ingestion details.


Error Handling

| Error | Source | Handling |
| --- | --- | --- |
| Validation error | Backend | Return 400, no generation created |
| Provider timeout | Compute | Retry with backoff |
| Provider error | Compute | Mark prediction as ERROR |
| All predictions failed | Backend | Mark generation as FAILED |
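The Compute-side "retry with backoff" row, as a sketch (function names and the timeout-only retry policy are assumptions):

```python
import time


def call_provider_with_backoff(call, max_retries=3, base_delay=0.5):
    """Retry a provider call on timeout with exponential backoff.
    After the final attempt the error propagates, and the caller
    marks the prediction as ERROR."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TimeoutError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```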

Failed Generation

```mermaid
sequenceDiagram
    participant C as Compute
    participant R as Redis
    participant B as Backend
    participant W as Webapp

    C->>R: Emit task.failed
    R-->>B: Event received
    B->>B: Mark Prediction ERROR
    B->>B: Check if all failed
    alt All predictions failed
        B->>B: Mark Generation FAILED
    end
    B-->>W: WebSocket notification
```

Performance

Typical Timing

| Stage | Duration |
| --- | --- |
| Request validation | ~50ms |
| Record creation | ~100ms |
| Task submission | ~200ms |
| Queue wait | ~500ms - 5s |
| AI generation | ~8-15s |
| Result storage | ~500ms |
| Total | ~10-20s |

Optimization

| Strategy | Impact |
| --- | --- |
| Batch predictions | Parallel task execution |
| Prompt caching | Faster prompt generation |
| CDN pre-warming | Faster image delivery |
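The "Batch predictions" row can be illustrated with asyncio; `submit_all` and `submit_one` are hypothetical names, not the actual client API:

```python
import asyncio


async def submit_all(prompts, submit_one):
    """Submit a batch of prediction tasks concurrently rather than one
    at a time. `submit_one` stands in for an async Compute Server call
    returning a task_id; gather preserves input order."""
    return await asyncio.gather(*(submit_one(p) for p in prompts))
```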

Detailed Documentation