Anthropic Integration¶
Sartiq integrates Anthropic's Claude API for image captioning, vision analysis, and LLM-powered outfit generation.
Overview¶
| Property | Value |
|---|---|
| Provider | Anthropic |
| Primary Models | Claude Sonnet 4, Claude 3.5 Sonnet |
| Features Used | Vision, Structured Output (JSON), Text Generation |
| SDK | anthropic Python SDK v0.40+ |
Use Cases¶
1. Image Captioning¶
Claude's vision capabilities analyze product images to generate detailed technical descriptions for ML training and catalog generation.
flowchart LR
IMG[Product Image] --> PREP[Image Preparation]
PREP --> B64[Base64 Encoding]
B64 --> API[Claude Vision API]
API --> JSON[JSON Response]
JSON --> CAP[Caption Output]
Supported Caption Types:
| Type | Description | Use Case |
|---|---|---|
GARMENT |
Clothing construction, materials, patterns | ML training data |
PRODUCT |
Functional features, hardware, materials | Product catalog |
PERSON |
Physical characteristics, demographics | Subject casting |
2. Outfit Generation¶
Claude analyzes products and generates styling recommendations through the LLM interaction system, returning structured JSON with outfit compositions.
3. Vision Analysis¶
Multi-image analysis for product understanding, style matching, and quality assessment.
Architecture¶
flowchart TB
subgraph Backend["Backend API"]
CS[CaptionsService]
LLM[LLM Interaction Handler]
AIC[AI Core Client]
end
subgraph Providers["Provider Layer"]
AP[AnthropicProvider]
LB[LiteLLM Backend]
end
subgraph External["Anthropic API"]
CLAUDE[Claude API]
end
CS --> AP
LLM --> AIC
AIC --> AP
AP --> LB
LB --> CLAUDE
Models¶
| Alias | Model ID | Use Case |
|---|---|---|
claude |
claude-sonnet-4-20250514 |
Default, captioning |
sonnet |
claude-sonnet-4-20250514 |
General tasks |
claude-3-5-sonnet |
claude-3-5-sonnet-20241022 |
Outfit generation |
haiku |
claude-3-haiku-20240307 |
Lightweight operations |
opus |
claude-opus-4-20250514 |
Advanced analysis |
Key Features¶
Structured JSON Output¶
All integrations request JSON responses with defined schemas:
# Captioning response schema
{
"technical_description": "...",
"short_description": "..."
}
# Outfit response schema
{
"core_product": {...},
"looks": [...]
}
Vision Capabilities¶
- Base64 image encoding with automatic format detection
- Size optimization (max 5MB)
- Support for JPEG, PNG, WebP, GIF
- Multi-image analysis
Error Handling¶
- Retry logic with 3 attempts on failure
- Graceful JSON parsing (handles markdown code blocks)
- Rate limit guidance with model fallback suggestions
Documentation¶
| Section | Description |
|---|---|
| Setup | Configuration and API keys |
| Prompting | System prompts and best practices |
| Examples | Code examples and workflows |