Skip to content

Anthropic Integration

Sartiq integrates Anthropic's Claude API for image captioning, vision analysis, and LLM-powered outfit generation.


Overview

Property Value
Provider Anthropic
Primary Models Claude Sonnet 4, Claude 3.5 Sonnet
Features Used Vision, Structured Output (JSON), Text Generation
SDK anthropic Python SDK v0.40+

Use Cases

1. Image Captioning

Claude's vision capabilities analyze product images to generate detailed technical descriptions for ML training and catalog generation.

flowchart LR
    IMG[Product Image] --> PREP[Image Preparation]
    PREP --> B64[Base64 Encoding]
    B64 --> API[Claude Vision API]
    API --> JSON[JSON Response]
    JSON --> CAP[Caption Output]

Supported Caption Types:

Type Description Use Case
GARMENT Clothing construction, materials, patterns ML training data
PRODUCT Functional features, hardware, materials Product catalog
PERSON Physical characteristics, demographics Subject casting

2. Outfit Generation

Claude analyzes products and generates styling recommendations through the LLM interaction system, returning structured JSON with outfit compositions.

3. Vision Analysis

Multi-image analysis for product understanding, style matching, and quality assessment.


Architecture

flowchart TB
    subgraph Backend["Backend API"]
        CS[CaptionsService]
        LLM[LLM Interaction Handler]
        AIC[AI Core Client]
    end

    subgraph Providers["Provider Layer"]
        AP[AnthropicProvider]
        LB[LiteLLM Backend]
    end

    subgraph External["Anthropic API"]
        CLAUDE[Claude API]
    end

    CS --> AP
    LLM --> AIC
    AIC --> AP
    AP --> LB
    LB --> CLAUDE

Models

Alias Model ID Use Case
claude claude-sonnet-4-20250514 Default, captioning
sonnet claude-sonnet-4-20250514 General tasks
claude-3-5-sonnet claude-3-5-sonnet-20241022 Outfit generation
haiku claude-3-haiku-20240307 Lightweight operations
opus claude-opus-4-20250514 Advanced analysis

Key Features

Structured JSON Output

All integrations request JSON responses with defined schemas:

# Captioning response schema
{
    "technical_description": "...",
    "short_description": "..."
}

# Outfit response schema
{
    "core_product": {...},
    "looks": [...]
}

Vision Capabilities

  • Base64 image encoding with automatic format detection
  • Size optimization (max 5MB)
  • Support for JPEG, PNG, WebP, GIF
  • Multi-image analysis

Error Handling

  • Retry logic with 3 attempts on failure
  • Graceful JSON parsing (handles markdown code blocks)
  • Rate limit guidance with model fallback suggestions

Documentation

Section Description
Setup Configuration and API keys
Prompting System prompts and best practices
Examples Code examples and workflows