Product Ingestion Flow¶
This document describes how products flow from user uploads through validation, processing, and cataloging in the Sartiq platform.
Overview¶
Product ingestion uses a two-phase upload process:
- File Upload Phase — Files are uploaded directly to R2 via presigned URLs
- Product Creation Phase — Files are relocated from temporary to permanent storage and the product record is created
flowchart TB
subgraph Phase1 ["1 · File Upload"]
A[Select Files] --> B[Request Presigned URL]
B --> C[PUT File Directly to R2]
end
subgraph Phase2 ["2 · Product Creation"]
D[Submit Product Data] --> E[Relocate Files from temp/]
E --> F[Process Images]
F --> G[Generate Caption]
G --> H[Save to Database]
end
subgraph Phase3 ["3 · Content Delivery"]
I[(R2 Storage)] --> J[CDN]
J --> K[Fast Global Access]
end
Phase1 --> Phase2
Phase2 --> Phase3
Sequence Diagram¶
sequenceDiagram
participant U as User
participant W as Webapp
participant B as Backend
participant R2 as Cloudflare R2
participant C as CDN
rect rgb(240, 248, 255)
Note over U,R2: 1 · File Upload
U->>W: Select images
W->>B: POST /api/v1/uploads/presigned-url
B-->>W: {upload_url, file_url, expires_in_seconds}
W->>R2: PUT file directly to presigned URL
R2-->>W: 200 OK
Note over R2: File stored in temp/{file_id}_{ts}_{name}
end
rect rgb(240, 255, 240)
Note over U,R2: 2 · Product Creation
W->>B: POST /api/v1/products (with temp file URLs)
B->>R2: Relocate: temp/ → media/{resource_id}/file.{ext}
B->>B: Process images (format, ICC profile)
B->>B: Generate caption (via Compute Server)
B-->>W: Product with CDN URLs
end
rect rgb(255, 250, 240)
Note over U,C: 3 · Content Delivery
R2-->>C: Serve via CDN
W->>C: Request images
C-->>W: Cached content
W-->>U: Display product
end
Phase 1: File Upload¶
The platform uses presigned URLs for secure, direct-to-R2 uploads. Files are uploaded to temporary storage, then relocated during product creation.
Step 1.1: Request Presigned URL¶
The webapp requests a presigned upload URL from the backend:
POST /api/v1/uploads/presigned-url
Content-Type: application/json
{
"filename": "product-001.jpg",
"content_type": "image/jpeg",
"size": 2048576
}
Response:
| Field | Type | Description |
|---|---|---|
| `upload_url` | string | Presigned R2 URL for direct PUT upload |
| `file_url` | string | CDN URL where the file will be accessible after upload |
| `upload_method` | string | `PUT` |
| `expires_in_seconds` | integer | Token validity (dynamic based on file size, default 15 min) |
| `max_file_size` | integer | Maximum file size in bytes (52428800 = 50MB) |
Step 1.2: Direct Upload to R2¶
The webapp uploads the file directly to the presigned R2 URL using a PUT request:
- The file lands in R2 at `temp/{file_id}_{timestamp}_{safe_filename}`
- The Backend is not in the upload data path: the file goes directly from the browser to R2
- Upload progress is tracked client-side via XHR upload events
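Steps 1.1 and 1.2 can be sketched as a single client helper. This is a minimal sketch, not the webapp's actual code: `post_json` and `put_bytes` are hypothetical injected transport callables, and only the endpoint path and field names come from this document.

```python
from typing import Callable

MAX_FILE_SIZE = 52_428_800  # 50MB limit stated in the response table


def upload_via_presigned_url(
    filename: str,
    content_type: str,
    data: bytes,
    post_json: Callable[[str, dict], dict],       # hypothetical: POSTs JSON, returns parsed JSON
    put_bytes: Callable[[str, bytes, str], int],  # hypothetical: PUTs bytes, returns HTTP status
) -> str:
    """Request a presigned URL, PUT the file to it, and return file_url."""
    if len(data) > MAX_FILE_SIZE:
        raise ValueError("file exceeds the 50MB limit")

    # Step 1.1: ask the backend for a presigned upload URL.
    response = post_json(
        "/api/v1/uploads/presigned-url",
        {"filename": filename, "content_type": content_type, "size": len(data)},
    )

    # Step 1.2: send the bytes straight to R2; the backend is not in this path.
    status = put_bytes(response["upload_url"], data, content_type)
    if status != 200:
        raise RuntimeError(f"upload failed with HTTP {status}")

    # The returned URL is what the product creation request references later.
    return response["file_url"]
```

Injecting the transport keeps the sketch independent of any HTTP library; progress tracking via XHR upload events is not modeled here.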
Upload Configuration¶
| Setting | Value |
|---|---|
| Max file size | 50MB |
| Token expiration | 15 minutes by default (adjusted dynamically based on file size) |
| Concurrent uploads | 3 simultaneous |
| Retry attempts | 3 with exponential backoff (1s → 2s → 4s, max 10s) |
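The retry schedule in the table (1s → 2s → 4s, capped at 10s) can be illustrated with a short sketch. It assumes "3 attempts" means three retries after the initial try and that only connection errors are retried; both are assumptions, and `backoff_delays` and `with_retries` are hypothetical names.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int = 3, base: float = 1.0, cap: float = 10.0) -> Iterator[float]:
    """Yield the wait before each retry: base, 2*base, 4*base, ... capped at `cap`."""
    for attempt in range(retries):
        yield min(base * 2 ** attempt, cap)


def with_retries(
    operation: Callable[[], T],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on ConnectionError until retries run out."""
    delays = backoff_delays(retries)
    while True:
        try:
            return operation()
        except ConnectionError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error
            sleep(delay)
```

The 10s cap only takes effect if the retry count is raised above three, since the third delay is 4s.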
Validation¶
Validation happens at multiple layers:
Frontend Validation¶
Before requesting a presigned URL, the webapp validates:
| Check | Requirement |
|---|---|
| File type | image/jpeg, image/png, image/webp, image/gif, image/avif |
| File size | Max 50MB |
| File count | Depends on upload context (max 10 per batch) |
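As an illustration of the checks in the table above, here is a minimal sketch of a batch validator. The function name and error messages are hypothetical, and the batch limit of 10 is treated as fixed even though the real limit depends on the upload context.

```python
ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/webp", "image/gif", "image/avif"}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 52428800 bytes
MAX_FILES_PER_BATCH = 10          # assumed fixed here; context-dependent in practice


def validate_batch(files: list[tuple[str, str, int]]) -> list[str]:
    """Return error messages for a batch of (name, content_type, size) entries."""
    errors = []
    if len(files) > MAX_FILES_PER_BATCH:
        errors.append(f"too many files: {len(files)} > {MAX_FILES_PER_BATCH}")
    for name, content_type, size in files:
        if content_type not in ALLOWED_IMAGE_TYPES:
            errors.append(f"{name}: unsupported type {content_type}")
        if size > MAX_FILE_SIZE:
            errors.append(f"{name}: exceeds 50MB limit")
    return errors
```

An empty list means the batch can proceed to the presigned URL request.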
Backend Validation¶
The backend validates during presigned URL generation:
| Check | Details |
|---|---|
| Content type | Must be in allowed list (images, video, PDF, ZIP, audio/mpeg) |
| File size | Must be within 50MB limit |
| Filename | Security sanitization (prevent directory traversal) |
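The filename check can be sketched as follows. The exact sanitization rules are not specified in this document, so the character whitelist below is an assumption, and `sanitize_filename` and `build_temp_key` are hypothetical helpers; only the `temp/{file_id}_{timestamp}_{safe_filename}` key pattern comes from Step 1.2.

```python
import re
from pathlib import PurePosixPath


def sanitize_filename(filename: str) -> str:
    """Drop directory components and unsafe characters from a client filename."""
    # Normalise Windows separators, then keep only the final path component.
    name = PurePosixPath(filename.replace("\\", "/")).name
    # Replace anything outside a conservative whitelist (assumed rule).
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Strip leading dots so the result can never be "." or "..".
    name = name.lstrip(".")
    return name or "file"


def build_temp_key(file_id: str, timestamp: int, filename: str) -> str:
    """Build the staging key temp/{file_id}_{timestamp}_{safe_filename}."""
    return f"temp/{file_id}_{timestamp}_{sanitize_filename(filename)}"
```

Because only the last path component survives, a name such as `../../etc/passwd` cannot escape the `temp/` prefix.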
Phase 2: Product Creation¶
Once files are uploaded to temporary storage, the webapp submits the product creation request.
Step 2.1: Create Product¶
POST /api/v1/products?mode=basic
Content-Type: application/json
{
"name": "Summer Dress",
"cover_image": "{R2_PUBLIC_URL}/temp/{file_id}_cover.jpg",
"back_image": "{R2_PUBLIC_URL}/temp/{file_id}_back.jpg",
"reference_images": ["{R2_PUBLIC_URL}/temp/{file_id}_ref1.jpg"],
"sku": "SKU-001",
"product_family": "APPAREL",
"product_type": "DRESS",
"gender": "FEMALE"
}
Step 2.2: Backend Processing¶
The backend performs these operations:
flowchart TB
A[Create Product Record] --> B[Relocate Cover Image]
B --> C[Relocate Back Image]
C --> D[Relocate Reference Images]
D --> E[Process Images]
E --> F[Generate Caption]
F --> G[Build CDN URLs]
G --> H[Update Product Record]
File Relocation (via MediaResource):
- `media_service.ingest_all(product, product_in)` processes all media fields declared on the input schema
- For each file: HEAD the temp upload, compute content hash, check for dedup
- Dedup hit: delete temp file, reuse existing `MediaResource`, create `MediaResourceAttachment`
- Dedup miss: create `MediaResource`, relocate temp file to canonical path `media/{resource_id}/file.{ext}`, create attachment
- During the migration period, old `*_url` fields (e.g., `cover_image_url`) are also written for backward compatibility
- Relative paths are stored in the database; CDN URLs are resolved at serving time
- See MediaResource Lifecycle for the full ingestion flow
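The dedup hit and miss branches can be modeled with an in-memory sketch. This is not the `media_service` implementation: `InMemoryMediaStore` is a hypothetical stand-in for R2 plus the database, SHA-256 over the file bytes is an assumed hashing scheme, and the `res-N` identifiers are invented for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class InMemoryMediaStore:
    """Hypothetical stand-in for R2 storage and the MediaResource tables."""
    objects: dict[str, bytes] = field(default_factory=dict)    # storage key -> bytes
    resources: dict[str, str] = field(default_factory=dict)    # content hash -> resource_id
    attachments: list[tuple[str, str, str]] = field(default_factory=list)

    def ingest(self, product_id: str, field_name: str, temp_key: str, ext: str) -> str:
        """Ingest one temp upload and return the resource_id it is attached to."""
        data = self.objects[temp_key]
        digest = hashlib.sha256(data).hexdigest()  # assumed hash algorithm
        resource_id = self.resources.get(digest)
        if resource_id is not None:
            # Dedup hit: discard the temp file and reuse the existing resource.
            del self.objects[temp_key]
        else:
            # Dedup miss: register a resource and move the file to its canonical path.
            resource_id = f"res-{len(self.resources) + 1}"
            self.resources[digest] = resource_id
            self.objects[f"media/{resource_id}/file.{ext}"] = self.objects.pop(temp_key)
        # Both branches end by attaching the resource to the product field.
        self.attachments.append((product_id, field_name, resource_id))
        return resource_id
```

Ingesting two temp files with identical bytes leaves a single object under `media/` and two attachments pointing at it.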
Image Processing:
| Strategy | Behavior |
|---|---|
| Preserve Original | Keep original format, embed ICC profile if missing |
| Convert to Target | Always convert to WebP at 95% quality (default) |
| Smart Conversion | Convert only if format differs from target |
- AVIF uploads are converted to PNG for pipeline compatibility
- ICC profiles (Display P3) are embedded when missing
- Image dimensions are extracted and stored as metadata
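The three strategies in the table differ only in when an image is re-encoded, which a small decision function can make concrete. The enum and function names are hypothetical. The sketch deliberately leaves out the AVIF-to-PNG step and ICC embedding, because this document does not say how they interact with each strategy.

```python
from enum import Enum


class Strategy(Enum):
    PRESERVE_ORIGINAL = "preserve_original"
    CONVERT_TO_TARGET = "convert_to_target"
    SMART_CONVERSION = "smart_conversion"


def should_reencode(strategy: Strategy, source_format: str, target_format: str = "webp") -> bool:
    """Decide whether an image is re-encoded to the target format."""
    if strategy is Strategy.PRESERVE_ORIGINAL:
        return False  # keep the original format
    if strategy is Strategy.CONVERT_TO_TARGET:
        return True   # always convert, even if the format already matches
    # Smart conversion: skip the re-encode when the format already matches.
    return source_format.lower() != target_format.lower()
```

Under this reading, a WebP upload is re-encoded by Convert to Target but left untouched by Smart Conversion.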
Caption Generation:
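- The caption request is sent to the Compute Server, which returns an AI-generated description
- Generation runs either synchronously (the request waits for the caption) or asynchronously (in the background)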
Product Images¶
| Image Type | Purpose | Required |
|---|---|---|
| Cover Image | Primary front-facing product photo | Yes |
| Back Image | Back view of the product | No |
| Reference Images | Additional angles/details | No |
Storage & CDN¶
Product images are stored in Cloudflare R2 and served globally through Cloudflare CDN.
Storage Architecture¶
flowchart LR
subgraph Writers ["Writers"]
API[Backend]
end
subgraph Cloudflare ["Cloudflare"]
R2[(R2 Storage)]
CDN[CDN]
end
subgraph Readers ["Readers"]
Webapp[Web App]
Compute[Compute Server]
end
API -- write --> R2
R2 --> CDN
Webapp -. read .-> CDN
Compute -. read .-> CDN
Storage Structure¶
{R2_BUCKET_NAME}/
├── temp/ # Presigned upload staging
│ └── {file_id}_{timestamp}_{safe_filename}
│
├── media/{resource_id}/ # Canonical (new uploads)
│ └── file.{ext}
│
└── images/products/{product_id}/ # Legacy (pre-existing data)
└── product_{product_id}_{sku}_{uuid}_{image_type}.{ext}
Image slots: cover_image, back_image, reference_images (list)
URL Resolution¶
Product records store relative file paths. The Backend's file serving endpoint (GET /files/{file_path}) returns a 302 redirect to the CDN URL. Both legacy and canonical paths resolve correctly:
| Environment | Canonical URL | Legacy URL |
|---|---|---|
| Production | https://media.sartiq.com/media/{resource_id}/file.webp | https://media.sartiq.com/images/products/{id}/product_... |
| Development | http://localhost:9002/shootify-media-dev/media/{resource_id}/file.webp | http://localhost:9002/shootify-media-dev/images/products/{id}/product_... |
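Resolving a stored relative path amounts to joining it onto an environment-specific base URL. The sketch below takes the base URLs from the table above; the `CDN_BASE_URLS` mapping and the `resolve_cdn_url` helper are hypothetical and do not describe the actual endpoint code.

```python
CDN_BASE_URLS = {
    "production": "https://media.sartiq.com",
    "development": "http://localhost:9002/shootify-media-dev",
}


def resolve_cdn_url(relative_path: str, environment: str = "production") -> str:
    """Join a stored relative path onto the environment's base URL."""
    base = CDN_BASE_URLS[environment].rstrip("/")
    # Tolerate a leading slash so both "media/..." and "/media/..." resolve alike.
    return f"{base}/{relative_path.lstrip('/')}"
```

Canonical and legacy paths go through the same join, which is why both resolve without special handling.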
Error Handling¶
Upload Errors¶
| Error | Cause | Resolution |
|---|---|---|
| Token expired | Presigned URL timed out | Request new presigned URL |
| File too large | Exceeds 50MB limit | Compress or resize image |
| Invalid type | Unsupported file format | Use JPEG, PNG, WebP, GIF, or AVIF |
| Upload failed | Network interruption | Automatic retry (3 attempts, exponential backoff) |
Product Creation Errors¶
| Error | Cause | Resolution |
|---|---|---|
| Missing cover image | Cover image required | Provide cover image |
| Missing name | Product name required | Provide product name |
| File not found | Temp file expired | Re-upload files |
Bulk Import¶
For large catalogs, brands can use bulk import via CSV.
CSV Format¶
sku,name,category,image_url,metadata
SKU001,Summer Dress,dresses,https://brand.com/images/dress1.jpg,"{""color"":""blue""}"
SKU002,Winter Coat,outerwear,https://brand.com/images/coat1.jpg,"{""color"":""black""}"
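The `metadata` column holds JSON with doubled quotes, following standard CSV escaping. A minimal parsing sketch using Python's standard `csv` and `json` modules shows how the sample rows decode; `parse_import_csv` is a hypothetical helper, not the import job's actual parser.

```python
import csv
import io
import json

SAMPLE = '''sku,name,category,image_url,metadata
SKU001,Summer Dress,dresses,https://brand.com/images/dress1.jpg,"{""color"":""blue""}"
SKU002,Winter Coat,outerwear,https://brand.com/images/coat1.jpg,"{""color"":""black""}"
'''


def parse_import_csv(text: str) -> list[dict]:
    """Parse import rows, decoding the JSON `metadata` column into a dict."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # The csv module has already collapsed "" into ", so this is plain JSON.
        row["metadata"] = json.loads(row["metadata"]) if row["metadata"] else {}
        rows.append(row)
    return rows


products = parse_import_csv(SAMPLE)
```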
Bulk Import Flow¶
sequenceDiagram
participant U as User
participant W as Webapp
participant B as Backend
participant R2 as R2 Storage
participant C as CDN
rect rgb(240, 248, 255)
Note over U,B: 1 · Submit Import
U->>W: Upload CSV
W->>B: Start import job
B-->>W: Job ID
end
rect rgb(240, 255, 240)
Note over B,R2: 2 · Process (Background)
B->>B: Fetch external images
B->>B: Validate & process
B->>R2: Store to R2
end
rect rgb(255, 250, 240)
Note over U,C: 3 · Complete
R2-->>C: Serve via CDN
B-->>W: Job complete
W-->>U: Import results
end
After import, all product images are available through the CDN. As with single uploads, the database records store relative paths that resolve to CDN URLs at serving time.
Monitoring¶
Key Metrics¶
| Metric | Description | Alert Threshold |
|---|---|---|
| `upload_success_rate` | % successful uploads | < 95% |
| `processing_time_p99` | 99th percentile processing time | > 30s |
| `storage_errors` | R2 upload failures | > 0 in 5min |
| `cdn_cache_hit_rate` | % requests served from CDN cache | < 90% |
| `validation_failures` | Invalid files submitted | Informational |
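The alert thresholds above can be written as simple predicates. This is a sketch only: it assumes rates are reported as percentages, processing time in seconds, and `storage_errors` as a count over the last five minutes. `ALERT_RULES` and `breached_alerts` are hypothetical names and not part of any monitoring tool.

```python
# Maps each metric name to a predicate that is True when the alert should fire.
ALERT_RULES = {
    "upload_success_rate": lambda v: v < 95.0,  # percent
    "processing_time_p99": lambda v: v > 30.0,  # seconds
    "storage_errors": lambda v: v > 0,          # count in the last 5 minutes
    "cdn_cache_hit_rate": lambda v: v < 90.0,   # percent
}


def breached_alerts(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics whose alert threshold is breached."""
    return [
        name
        for name, fires in ALERT_RULES.items()
        if name in metrics and fires(metrics[name])
    ]
```

`validation_failures` has no rule because the table marks it as informational.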
Related Documentation¶
- Generation Pipeline - How products are used in generation
- Backend Architecture - API implementation
- Storage Infrastructure - Storage and CDN details