Google AI Integration¶

Sartiq integrates Google AI services through two distinct pathways: Gemini API for LLM tasks and Vertex AI for image generation.

Overview¶

Service	Purpose	SDK
Gemini API	LLM text generation, styling, captioning	`google-generativeai`, LiteLLM
Vertex AI	Image generation, editing, vision	`google-genai`

Service Comparison¶

Feature	Gemini API	Vertex AI
Authentication	API Key (Google AI Studio)	Service Account IAM or API Key + Project
Task Types	LLM Interaction only	Generation, Editing, LLM Interaction
Image Generation	Not supported	Native support
Image Editing	Not supported	Native support
Safety Settings	Stricter defaults	Fully configurable (can disable)
Use Case	Styling, captioning, analysis	Fashion image generation

Use Cases¶

Gemini API (LLM)¶

Used in the Backend API for text-based AI tasks:

Outfit Styling - Generate outfit recommendations from product catalogs
Image Captioning - Describe products for catalog generation
Product Filtering - Determine product visibility for shot types

Vertex AI (Image)¶

Used in the Compute Server for image tasks:

Image Generation - Text-to-image with optional references
Image Editing - Virtual try-on, garment replacement
Vision Analysis - Describe and analyze fashion photographs

Architecture¶

flowchart TB
    subgraph Backend["Backend API"]
        SS[StylingService]
        CS[CaptioningService]
        GP[GeminiProvider]
        LL[LiteLLM Backend]
    end

    subgraph Compute["Compute Server"]
        VP[VertexProcessor]
        GC[google-genai Client]
    end

    subgraph Google["Google Cloud"]
        GAPI[Gemini API]
        VAI[Vertex AI]
    end

    SS --> GP
    CS --> GP
    GP --> LL
    LL --> GAPI

    VP --> GC
    GC --> VAI

Models¶

Gemini API Models¶

Alias	Model ID	Use Case
`gemini-2.5-flash`	`gemini-2.5-flash-preview-05-20`	Captioning
`gemini-2.0-flash`	`gemini-2.0-flash-latest`	Styling
`gemini-flash-latest`	`gemini-flash-latest`	LLM visibility
`gemini-1.5-pro`	`gemini-1.5-pro-latest`	Complex analysis

Vertex AI Models¶

Model	Use Case
`gemini-3-pro-image-preview`	Image generation/editing
`gemini-3-pro-preview`	Text-only tasks

Key Differences¶

Safety Settings¶

Gemini API (stricter):

safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]

Vertex AI (permissive for fashion):

safety_settings = [
    SafetySetting(category=HARM_CATEGORY_HARASSMENT, threshold=OFF),
    SafetySetting(category=HARM_CATEGORY_HATE_SPEECH, threshold=OFF),
    SafetySetting(category=HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold=OFF),
    SafetySetting(category=HARM_CATEGORY_DANGEROUS_CONTENT, threshold=OFF),
]

Response Format¶

Gemini API - Structured JSON via response_mime_type:

config = GenerateContentConfig(response_mime_type="application/json")

Vertex AI - Direct image/text output with metadata

Documentation¶

Section	Description
Setup	Configuration for both services
Prompting	Best practices for Gemini
Examples	Code examples