Generation Pipeline¶
This document describes how generation requests flow through the Sartiq platform, from user configuration through AI processing to final delivery.
Overview¶
The generation pipeline transforms creative inputs into professional imagery:
```mermaid
flowchart TB
    subgraph Inputs ["Inputs"]
        direction LR
        P[Product] ~~~ S[Subject]
        ST[Style] ~~~ G[Guidelines]
    end
    subgraph Pipeline ["Pipeline"]
        direction LR
        C[Configure] --> Q[Queue] --> O[Orchestrate]
        O --> AI[Generate] --> PP[Post-Process]
    end
    subgraph Output ["Output"]
        direction LR
        I[Image] ~~~ M[Metadata]
    end
    Inputs --> Pipeline --> Output
```
Key concepts:
| Concept | Description |
|---|---|
| Generation | A task that produces one or more images |
| Prediction | A single image output from a generation |
| Strategy | The type of task (generate, edit, refine, video, etc.) |
| Task | A unit of work sent to the Compute Server |
Generation Types¶
Sartiq supports multiple generation types, each optimized for different use cases:
```mermaid
flowchart TB
    subgraph Standard ["Standard Generations"]
        BASE[Base Generation]
        AGENTIC[Agentic Generation]
    end
    subgraph Specialized ["Specialized Tasks"]
        EDIT[Editing]
        REFINE[Refinement]
        VIDEO[Video]
    end
    subgraph Enhancement ["Enhancement Tasks"]
        BG[Background Fix]
        FACE[Face Fixer]
        ADJ[Image Adjuster]
    end
```
| Type | Purpose | When to Use |
|---|---|---|
| Base | Standard image generation | Most on-model imagery |
| Agentic | Generation with quality evaluation and retries | High-stakes outputs |
| Editing | Garment placement and compositing | Product swaps, try-on |
| Refine | Upscaling and detail enhancement | Final output preparation |
| Video | Generate video from images | Motion content |
| Background Fix | Fix or replace backgrounds | Post-processing |
| Face Fixer | Enhance facial details | Portrait refinement |
| Image Adjuster | General image adjustments | Color/exposure fixes |
For detailed information on each type, see Generation Types.
High-Level Flow¶
```mermaid
sequenceDiagram
    participant U as User
    participant W as Webapp
    participant B as Backend
    participant R as Redis
    participant C as Compute
    participant AI

    rect rgb(240, 248, 255)
        Note over U,B: 1 · Configure
        U->>W: Set generation options
        W->>B: POST /generations
        B->>B: Create Generation + Predictions
        B-->>W: Generation ID
    end
    rect rgb(240, 255, 240)
        Note over B,AI: 2 · Process
        B->>B: Select Strategy
        B->>C: Submit Task(s)
        C-->>B: Task ID(s)
        C->>AI: Execute generation
        AI-->>C: Result images
        C->>C: Convert to WebP + embed ICC (Display P3)
        C->>C: Save to compute/{type}s/{task_id}/
        C->>R: Emit task.completed
    end
    rect rgb(255, 250, 240)
        Note over U,C: 3 · Complete
        R-->>B: Event received
        B->>C: GET /tasks/{id}/result
        B->>B: Copy from compute/ to images/generations/
        B->>B: Update Predictions
        B-->>W: WebSocket update
        W-->>U: Display results
    end
```
Architecture¶
Strategy Pattern¶
The backend uses a Strategy Pattern to handle different generation types. Each strategy knows how to create the appropriate task for the Compute Server.
```mermaid
flowchart TB
    subgraph Routes ["API Routes"]
        R1[POST /generations]
        R2[POST /generations/edit]
        R3[POST /generations/refine]
        R4[POST /generations/video]
    end
    subgraph Workflow ["Workflow Service"]
        WF[GenerationWorkflowService]
    end
    subgraph Strategies ["Strategies"]
        S1[BaseGenerationStrategy]
        S2[EditingStrategy]
        S3[RefineStrategy]
        S4[VideoStrategy]
    end
    subgraph Compute ["Compute Server"]
        T1[GENERATION Task]
        T2[EDITING Task]
        T3[REFINE Task]
        T4[VIDEO Task]
    end
    R1 --> WF
    R2 --> WF
    R3 --> WF
    R4 --> WF
    WF --> S1 --> T1
    WF --> S2 --> T2
    WF --> S3 --> T3
    WF --> S4 --> T4
```
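A minimal Python sketch of this dispatch. The class names follow the diagram above, but the method names, registry, and task payload shape are illustrative assumptions rather than the actual implementation:

```python
from abc import ABC, abstractmethod


class GenerationStrategy(ABC):
    """Each strategy maps a generation request to one Compute Server task type."""

    task_type: str  # e.g. "GENERATION", "EDITING"

    @abstractmethod
    def build_task(self, prediction_id: str, params: dict) -> dict:
        """Build the task payload submitted to the Compute Server."""


class BaseGenerationStrategy(GenerationStrategy):
    task_type = "GENERATION"

    def build_task(self, prediction_id: str, params: dict) -> dict:
        return {"type": self.task_type, "prediction_id": prediction_id, "params": params}


class EditingStrategy(GenerationStrategy):
    task_type = "EDITING"

    def build_task(self, prediction_id: str, params: dict) -> dict:
        return {"type": self.task_type, "prediction_id": prediction_id, "params": params}


# The workflow service selects a strategy by name, then delegates task creation.
STRATEGIES = {
    "base": BaseGenerationStrategy(),
    "editing": EditingStrategy(),
}


def build_task_for(strategy_name: str, prediction_id: str, params: dict) -> dict:
    return STRATEGIES[strategy_name].build_task(prediction_id, params)
```

The value of the pattern is that the workflow service never branches on generation type; adding a new type means adding one strategy class and one registry entry.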
For detailed strategy documentation, see Generation Strategies.
Compute Server Tasks¶
The Compute Server supports multiple task types:
| Task Type | Description |
|---|---|
| `GENERATION` | Standard AI image generation |
| `EDITING` | Garment placement and compositing |
| `REFINE` | Upscaling and enhancement |
| `VIDEO_GENERATION` | Video from image |
| `BACKGROUND_FIX` | Background manipulation |
| `FACE_ENHANCER` | Facial detail enhancement |
| `IMAGE_ADJUSTER` | Color and exposure adjustments |
| `DETAIL_ENHANCER` | General detail enhancement |
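On the Backend side these task types could be represented as a simple string enum; a sketch, assuming the wire values match the names in the table:

```python
from enum import Enum


class TaskType(str, Enum):
    """Task types accepted by the Compute Server (values assumed to match the wire format)."""

    GENERATION = "GENERATION"
    EDITING = "EDITING"
    REFINE = "REFINE"
    VIDEO_GENERATION = "VIDEO_GENERATION"
    BACKGROUND_FIX = "BACKGROUND_FIX"
    FACE_ENHANCER = "FACE_ENHANCER"
    IMAGE_ADJUSTER = "IMAGE_ADJUSTER"
    DETAIL_ENHANCER = "DETAIL_ENHANCER"
```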
For task details, see Compute Server Tasks.
Data Model¶
Generation Record¶
When a user starts a generation, the Backend creates:
```mermaid
erDiagram
    Generation ||--o{ Prediction : contains
    Generation {
        uuid id
        string status
        string generation_type
        string generation_strategy
        int batch_size
        string generated_prompt
    }
    Prediction {
        uuid id
        uuid generation_id
        string status
        string task_id
        string result_image_url
        int seed
    }
```
| Field | Description |
|---|---|
| `generation_type` | `BASE` or `AGENTIC` (orchestration mode) |
| `generation_strategy` | Which strategy/task type to use |
| `batch_size` | Number of predictions to create |
| `generated_prompt` | The prompt sent to AI |
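Following the ER diagram, the two records can be sketched as dataclasses; the `create_generation` helper is hypothetical and only illustrates the one-to-many relationship:

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Prediction:
    """A single image output from a generation."""

    id: str
    generation_id: str
    status: str = "PENDING"
    task_id: Optional[str] = None          # set once submitted to the Compute Server
    result_image_url: Optional[str] = None
    seed: Optional[int] = None


@dataclass
class Generation:
    """A task that produces one or more images."""

    id: str
    generation_type: str        # BASE or AGENTIC
    generation_strategy: str    # which strategy/task type to use
    batch_size: int
    status: str = "PENDING"
    generated_prompt: Optional[str] = None
    predictions: list = field(default_factory=list)


def create_generation(generation_type: str, strategy: str, batch_size: int) -> Generation:
    """Create a Generation with batch_size empty Predictions (illustrative helper)."""
    gen = Generation(
        id=str(uuid.uuid4()),
        generation_type=generation_type,
        generation_strategy=strategy,
        batch_size=batch_size,
    )
    gen.predictions = [
        Prediction(id=str(uuid.uuid4()), generation_id=gen.id)
        for _ in range(batch_size)
    ]
    return gen
```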
Status Flow¶
```mermaid
stateDiagram-v2
    [*] --> PENDING: Created
    PENDING --> PROCESSING: Tasks submitted
    PROCESSING --> COMPLETED: All predictions done
    PROCESSING --> FAILED: Error occurred
    COMPLETED --> [*]
    FAILED --> [*]
```
Generation Status:

- `PENDING` — Created, waiting to start
- `PROCESSING` — Tasks running on Compute Server
- `COMPLETED` — All predictions finished
- `FAILED` — Generation failed

Prediction Status:

- `PENDING` → `GENERATING` → `COMPLETED` (or `ERROR`)
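The allowed generation-status transitions can be captured as a small lookup table; a sketch of the state diagram above, not the actual backend code:

```python
# Allowed Generation status transitions, per the state diagram.
TRANSITIONS = {
    "PENDING": {"PROCESSING"},
    "PROCESSING": {"COMPLETED", "FAILED"},
    "COMPLETED": set(),   # terminal
    "FAILED": set(),      # terminal
}


def can_transition(current: str, target: str) -> bool:
    """True if the status change is legal under the state diagram."""
    return target in TRANSITIONS.get(current, set())
```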
Standard Generation¶
The most common flow for creating on-model imagery.
API Request¶
```http
POST /api/v1/generations
Content-Type: application/json

{
  "batch_size": 4,
  "subject_id": "uuid",
  "product_id": "uuid",
  "style_id": "uuid",
  "shot_type_string": "full_body",
  "auto_generate_prompt": true,
  "width": 1024,
  "height": 1024
}
```
Processing Steps¶
```mermaid
flowchart TB
    A[Receive Request] --> B[Validate Inputs]
    B --> C[Create Generation Record]
    C --> D[Create N Predictions]
    D --> E[Select Strategy]
    E --> F[Build Prompt]
    F --> G[Create Task per Prediction]
    G --> H[Submit to Compute Server]
    H --> I[Store Task IDs]
    I --> J[Emit WebSocket Event]
```
1. Validate — Check that the subject, product, and style belong to the organization
2. Create Records — Generation + N Predictions in the database
3. Build Prompt — Generate a prompt or use the one provided
4. Submit Tasks — One task per prediction to the Compute Server
5. Track — Store `task_id` on each prediction for result matching
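The submit-and-track steps (one task per prediction, with `task_id` stored for later matching) can be sketched as below; `FakeComputeClient` is a hypothetical stand-in for the real Compute Server client:

```python
import uuid


class FakeComputeClient:
    """Hypothetical stand-in for the Compute Server API."""

    def submit(self, task: dict) -> str:
        # The real server would queue the task and return its ID.
        return str(uuid.uuid4())


def submit_tasks(predictions: list, prompt: str, compute: FakeComputeClient) -> list:
    """One task per prediction; store task_id on each for result matching."""
    for prediction in predictions:
        task = {"type": "GENERATION", "prompt": prompt}
        prediction["task_id"] = compute.submit(task)
        prediction["status"] = "GENERATING"
    return predictions
```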
Agentic Generation¶
Enhanced generation with quality evaluation and automatic retries.
```mermaid
flowchart TB
    subgraph Standard ["Standard Flow"]
        A[Create Generation]
        B[Submit Task]
        C[Get Result]
    end
    subgraph Agentic ["Agentic Additions"]
        D[Evaluate Quality]
        E{Acceptable?}
        F[Retry with Adjustments]
    end
    A --> B --> C --> D --> E
    E -->|Yes| G[Complete]
    E -->|No| F --> B
```
Additional features:
- `max_attempts` — Maximum retry count
- `min_confidence_score` — Quality threshold
- `evaluation_provider` — Which evaluator to use
- Creates `OrchestratedShot` records for orchestration metadata
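The evaluate-and-retry loop can be sketched as follows, with `generate` and `evaluate` as hypothetical callables standing in for task submission and the evaluation provider:

```python
def agentic_generate(generate, evaluate, max_attempts: int = 3,
                     min_confidence_score: float = 0.8):
    """Generate, evaluate, and retry until the quality threshold is met
    or max_attempts is exhausted (illustrative loop, not the real service)."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        image = generate(feedback)       # feedback steers the retry adjustments
        score = evaluate(image)
        if score >= min_confidence_score:
            return image, score, attempt
        feedback = f"score {score:.2f} below threshold; adjust and retry"
    return image, score, max_attempts    # best effort after exhausting retries
```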
Event Communication¶
Redis Event Stream¶
The Compute Server emits events to Redis, which the Backend subscribes to:
```mermaid
flowchart LR
    subgraph Compute ["Compute Server"]
        Task[Task Execution]
    end
    subgraph Redis ["Redis"]
        Stream[(Event Stream)]
    end
    subgraph Backend ["Backend"]
        Listener[Event Listener]
        Handler[Update Predictions]
    end
    Task -->|emit| Stream
    Stream -->|subscribe| Listener
    Listener --> Handler
```
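The listener's handler side can be sketched as a dispatcher keyed by `task_id`, with the subscription step shown only as a comment (the stream name and event shape are assumptions):

```python
def handle_event(event: dict, predictions_by_task: dict) -> None:
    """Route a compute event to the matching prediction record (sketch)."""
    prediction = predictions_by_task.get(event["task_id"])
    if prediction is None:
        return  # event for a task this backend is not tracking
    if event["type"] == "task.started":
        prediction["status"] = "GENERATING"
    elif event["type"] == "task.completed":
        prediction["status"] = "COMPLETED"
    elif event["type"] == "task.failed":
        prediction["status"] = "ERROR"


# A real listener would block on the Redis stream, e.g. with redis-py:
# r = redis.Redis()
# for _stream, messages in r.xread({"compute:events": "$"}, block=5000):
#     ...
```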
Event Types¶
| Event | Trigger | Backend Action |
|---|---|---|
| `task.started` | Task begins | Update prediction status |
| `task.progress` | Progress update | Broadcast via WebSocket |
| `task.completed` | Task done | Fetch result, update prediction |
| `task.failed` | Task error | Mark prediction as `ERROR` |
Result Retrieval¶
When a `task.completed` event is received, the Backend fetches the result from the Compute Server via `GET /tasks/{id}/result`. The response includes:

- `result_image_url` — CDN URL to the generated image
- `seed` — Random seed used
- `metadata` — Model info, timing, etc.
MediaResource Integration¶
Result images from the Compute Server are ingested into the MediaResource system:
Prediction / Refine results (`relocate=True`):

1. The Backend receives the compute result path (e.g., `compute/generations/{task_id}/result_0.webp`)
2. It calls `media_service.ingest_from_storage_key(prediction, "result_image", path, relocate=True)`
3. The file is copied from `compute/` to the canonical `media/{resource_id}/file.{ext}`
4. The intermediate `compute/` path is queued for deferred deletion after commit
5. The legacy `result_image_url` field is also written for backward compatibility

Generation snapshots (`relocate=False`):

- Snapshot URLs (e.g., `product_snapshot_url`) reference shared files that may be used by multiple entities
- `ingest_from_storage_key(generation, slot, path, relocate=False)` keeps the file at its source path
- No deferred deletion — the file remains accessible to other references
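The relocate decision reduces to a small branch. The function below is an illustrative sketch of that branch, not the actual `ingest_from_storage_key` signature:

```python
import posixpath


def plan_ingest(resource_id: str, source_key: str, relocate: bool):
    """Return (final storage key, paths queued for deferred deletion)."""
    if relocate:
        ext = posixpath.splitext(source_key)[1]
        canonical = f"media/{resource_id}/file{ext}"
        # Copy source_key -> canonical, then delete source_key after commit.
        return canonical, [source_key]
    # relocate=False: the file stays at its source path; nothing is deleted.
    return source_key, []
```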
See MediaResource Lifecycle for the full data model and ingestion details.
Error Handling¶
| Error | Source | Handling |
|---|---|---|
| Validation error | Backend | Return 400, no generation created |
| Provider timeout | Compute | Retry with backoff |
| Provider error | Compute | Mark prediction as ERROR |
| All predictions failed | Backend | Mark generation as FAILED |
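A generic retry-with-backoff wrapper of the kind described for provider timeouts; the function name, parameters, and delays are illustrative:

```python
import time


def with_backoff(call, max_retries: int = 3, base_delay: float = 0.5):
    """Retry `call` on TimeoutError with exponential backoff, then re-raise."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TimeoutError:
            if attempt == max_retries:
                raise  # give up; caller marks the prediction as ERROR
            time.sleep(base_delay * (2 ** attempt))
```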
Failed Generation¶
```mermaid
sequenceDiagram
    participant C as Compute
    participant R as Redis
    participant B as Backend
    participant W as Webapp

    C->>R: Emit task.failed
    R-->>B: Event received
    B->>B: Mark Prediction ERROR
    B->>B: Check if all failed
    alt All predictions failed
        B->>B: Mark Generation FAILED
    end
    B-->>W: WebSocket notification
```
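The "check if all failed" step from the diagram above can be sketched as follows (dict-based records for brevity):

```python
def on_task_failed(prediction: dict, generation: dict) -> None:
    """Mark the prediction as ERROR; fail the generation once every prediction has errored."""
    prediction["status"] = "ERROR"
    if all(p["status"] == "ERROR" for p in generation["predictions"]):
        generation["status"] = "FAILED"
```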
Performance¶
Typical Timing¶
| Stage | Duration |
|---|---|
| Request validation | ~50ms |
| Record creation | ~100ms |
| Task submission | ~200ms |
| Queue wait | ~500ms - 5s |
| AI generation | ~8-15s |
| Result storage | ~500ms |
| Total | ~10-20s |
Optimization¶
| Strategy | Impact |
|---|---|
| Batch predictions | Parallel task execution |
| Prompt caching | Faster prompt generation |
| CDN pre-warming | Faster image delivery |
Detailed Documentation¶
- Generation Types — Detailed breakdown of each type
- Generation Strategies — Strategy pattern implementation
- Compute Server Tasks — Task types and parameters
Related Documentation¶
- Compute Server — AI orchestration details
- Backend Architecture — API implementation
- Delivery & Export — What happens after generation