Google AI Integration

Sartiq integrates Google AI services through two distinct pathways: the Gemini API for LLM tasks, and Vertex AI for image generation, editing, and vision analysis.


Overview

| Service | Purpose | SDK |
|---|---|---|
| Gemini API | LLM text generation, styling, captioning | google-generativeai, LiteLLM |
| Vertex AI | Image generation, editing, vision | google-genai |

Service Comparison

| Feature | Gemini API | Vertex AI |
|---|---|---|
| Authentication | API Key (Google AI Studio) | Service Account IAM or API Key + Project |
| Task Types | LLM Interaction only | Generation, Editing, LLM Interaction |
| Image Generation | Not supported | Native support |
| Image Editing | Not supported | Native support |
| Safety Settings | Stricter defaults | Fully configurable (can disable) |
| Use Case | Styling, captioning, analysis | Fashion image generation |
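
The two authentication rows translate into different client arguments. The following is a minimal sketch, assuming both services are reached through the google-genai SDK's `genai.Client` constructor and that credentials are read from environment variables. The `client_kwargs` helper, the choice of environment variables, and the `us-central1` default location are illustrative assumptions, not taken from the Sartiq codebase.

```python
import os
from typing import Any, Mapping


def client_kwargs(service: str, env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Build keyword arguments for google.genai.Client (hypothetical helper)."""
    if service == "gemini":
        # Gemini API: a single API key issued by Google AI Studio.
        return {"api_key": env["GEMINI_API_KEY"]}
    if service == "vertex":
        # Vertex AI: project and location; service-account credentials are
        # expected to come from Application Default Credentials.
        return {
            "vertexai": True,
            "project": env["GOOGLE_CLOUD_PROJECT"],
            # Assumed default region, used only when none is configured.
            "location": env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        }
    raise ValueError(f"unknown service: {service!r}")


# Usage (requires `pip install google-genai`):
#   from google import genai
#   client = genai.Client(**client_kwargs("vertex"))
```

Keeping the argument construction separate from the client call makes the authentication choice easy to unit-test without network access.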

Use Cases

Gemini API (LLM)

Used in the Backend API for text-based AI tasks:

  1. Outfit Styling - Generate outfit recommendations from product catalogs
  2. Image Captioning - Describe products for catalog generation
  3. Product Filtering - Determine product visibility for shot types

Vertex AI (Image)

Used in the Compute Server for image tasks:

  1. Image Generation - Text-to-image with optional references
  2. Image Editing - Virtual try-on, garment replacement
  3. Vision Analysis - Describe and analyze fashion photographs
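
The split between the two services can be summarized as a lookup from task to service. This is only an illustrative sketch: the task labels, the `Service` enum, and `route_task` are hypothetical names chosen to mirror the two lists above, and the real routing is done by the Backend API and Compute Server themselves.

```python
from enum import Enum


class Service(Enum):
    GEMINI_API = "gemini_api"
    VERTEX_AI = "vertex_ai"


# Task names are illustrative labels for the use cases listed above.
TASK_ROUTES = {
    "outfit_styling": Service.GEMINI_API,
    "image_captioning": Service.GEMINI_API,
    "product_filtering": Service.GEMINI_API,
    "image_generation": Service.VERTEX_AI,
    "image_editing": Service.VERTEX_AI,
    "vision_analysis": Service.VERTEX_AI,
}


def route_task(task: str) -> Service:
    """Return the Google service that handles the given task."""
    try:
        return TASK_ROUTES[task]
    except KeyError:
        raise ValueError(f"unsupported task: {task!r}") from None
```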

Architecture

flowchart TB
    subgraph Backend["Backend API"]
        SS[StylingService]
        CS[CaptioningService]
        GP[GeminiProvider]
        LL[LiteLLM Backend]
    end

    subgraph Compute["Compute Server"]
        VP[VertexProcessor]
        GC[google-genai Client]
    end

    subgraph Google["Google Cloud"]
        GAPI[Gemini API]
        VAI[Vertex AI]
    end

    SS --> GP
    CS --> GP
    GP --> LL
    LL --> GAPI

    VP --> GC
    GC --> VAI

Models

Gemini API Models

| Alias | Model ID | Use Case |
|---|---|---|
| gemini-2.5-flash | gemini-2.5-flash-preview-05-20 | Captioning |
| gemini-2.0-flash | gemini-2.0-flash-latest | Styling |
| gemini-flash-latest | gemini-flash-latest | LLM visibility |
| gemini-1.5-pro | gemini-1.5-pro-latest | Complex analysis |
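
The alias table can be read as a plain mapping from alias to model ID. A minimal sketch, assuming aliases are resolved before a request is sent; `MODEL_ALIASES` and `resolve_model` are hypothetical names, and passing unknown names through unchanged is a design assumption so that full model IDs also work.

```python
MODEL_ALIASES = {
    "gemini-2.5-flash": "gemini-2.5-flash-preview-05-20",
    "gemini-2.0-flash": "gemini-2.0-flash-latest",
    "gemini-flash-latest": "gemini-flash-latest",
    "gemini-1.5-pro": "gemini-1.5-pro-latest",
}


def resolve_model(alias: str) -> str:
    """Map an alias to its model ID; unknown names are returned unchanged."""
    return MODEL_ALIASES.get(alias, alias)
```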

Vertex AI Models

| Model | Use Case |
|---|---|
| gemini-3-pro-image-preview | Image generation/editing |
| gemini-3-pro-preview | Text-only tasks |

Key Differences

Safety Settings

Gemini API (stricter):

safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]

Vertex AI (permissive for fashion):

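from google.genai.types import HarmBlockThreshold, HarmCategory, SafetySetting

# Disable all four filters (permissive setup for fashion imagery).
OFF = HarmBlockThreshold.OFF
safety_settings = [
    SafetySetting(category=HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=OFF),
    SafetySetting(category=HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=OFF),
    SafetySetting(category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold=OFF),
    SafetySetting(category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, threshold=OFF),
]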
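
Both configurations cover the same four harm categories and differ only in the threshold, so they can be generated from a single list. The sketch below uses plain dictionaries and leaves out any SDK-specific wrapping; `HARM_CATEGORIES` and `build_safety_settings` are hypothetical names introduced for illustration.

```python
HARM_CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)


def build_safety_settings(threshold: str) -> list[dict[str, str]]:
    """Apply one threshold to every harm category."""
    return [{"category": c, "threshold": threshold} for c in HARM_CATEGORIES]


# Stricter profile for the Gemini API, permissive profile for Vertex AI.
GEMINI_SAFETY = build_safety_settings("BLOCK_MEDIUM_AND_ABOVE")
VERTEX_SAFETY = build_safety_settings("OFF")
```

Generating both profiles from one category list keeps them in sync if a category is added later.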

Response Format

Gemini API - Structured JSON via response_mime_type:

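from google.genai.types import GenerateContentConfig

# Ask the model to return JSON instead of free-form text.
config = GenerateContentConfig(response_mime_type="application/json")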
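
With a JSON MIME type the response text still arrives as a string, so the caller has to parse and check it. A minimal sketch of that step, assuming the response body is available as text; `parse_json_response`, the sample payload, and its `caption`/`tags` keys are hypothetical and do not describe Sartiq's actual schema.

```python
import json


def parse_json_response(text: str, required_keys: tuple[str, ...]) -> dict:
    """Parse a JSON response and verify that the expected keys are present."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Illustrative payload; the field names are assumptions.
sample = '{"caption": "Red wool coat", "tags": ["outerwear", "wool"]}'
result = parse_json_response(sample, ("caption", "tags"))
```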

Vertex AI - Direct image/text output with metadata


Documentation

| Section | Description |
|---|---|
| Setup | Configuration for both services |
| Prompting | Best practices for Gemini |
| Examples | Code examples |