Google AI Integration¶
Sartiq integrates Google AI services through two distinct pathways: the Gemini API for LLM text tasks and Vertex AI for image generation, editing, and vision analysis.
Overview¶
| Service | Purpose | SDK |
|---|---|---|
| Gemini API | LLM text generation, styling, captioning | google-generativeai, LiteLLM |
| Vertex AI | Image generation, editing, vision | google-genai |
Service Comparison¶
| Feature | Gemini API | Vertex AI |
|---|---|---|
| Authentication | API Key (Google AI Studio) | Service Account IAM or API Key + Project |
| Task Types | LLM Interaction only | Generation, Editing, LLM Interaction |
| Image Generation | Not supported | Native support |
| Image Editing | Not supported | Native support |
| Safety Settings | Stricter defaults | Fully configurable (can disable) |
| Use Case | Styling, captioning, analysis | Fashion image generation |
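The authentication row of the table can be read as a simple rule: the Gemini API needs an API key, while Vertex AI accepts either a service account or an API key paired with a project. The following is a minimal sketch of that rule only. The `GoogleCredentials` class, its field names, and `can_authenticate` are hypothetical and are not taken from the Sartiq codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GoogleCredentials:
    """Hypothetical container for whatever credentials are configured."""
    api_key: Optional[str] = None
    project: Optional[str] = None
    service_account_file: Optional[str] = None


def can_authenticate(service: str, creds: GoogleCredentials) -> bool:
    """Return True if the credentials satisfy the table's rule for the service."""
    if service == "gemini_api":
        # Gemini API: an API key from Google AI Studio is enough.
        return bool(creds.api_key)
    if service == "vertex_ai":
        # Vertex AI: a service account, or an API key together with a project.
        has_service_account = bool(creds.service_account_file)
        has_key_and_project = bool(creds.api_key and creds.project)
        return has_service_account or has_key_and_project
    raise ValueError(f"Unknown service: {service}")
```

A check like this could run at startup so that a missing project ID fails early rather than on the first image request.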
Use Cases¶
Gemini API (LLM)¶
Used in the Backend API for text-based AI tasks:
- Outfit Styling - Generate outfit recommendations from product catalogs
- Image Captioning - Describe products for catalog generation
- Product Filtering - Determine product visibility for shot types
Vertex AI (Image)¶
Used in the Compute Server for image tasks:
- Image Generation - Text-to-image with optional references
- Image Editing - Virtual try-on, garment replacement
- Vision Analysis - Describe and analyze fashion photographs
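The split between the two lists above can be summarised as a lookup from use case to pathway. This is an illustrative sketch: the `Pathway` enum, the snake_case use-case keys, and `pathway_for` are assumed names, not identifiers from the Sartiq codebase.

```python
from enum import Enum


class Pathway(Enum):
    GEMINI_API = "gemini_api"  # Backend API, text-based tasks
    VERTEX_AI = "vertex_ai"    # Compute Server, image tasks


# Use cases as listed above, keyed by hypothetical snake_case names.
USE_CASE_PATHWAY = {
    "outfit_styling": Pathway.GEMINI_API,
    "image_captioning": Pathway.GEMINI_API,
    "product_filtering": Pathway.GEMINI_API,
    "image_generation": Pathway.VERTEX_AI,
    "image_editing": Pathway.VERTEX_AI,
    "vision_analysis": Pathway.VERTEX_AI,
}


def pathway_for(use_case: str) -> Pathway:
    """Look up which Google pathway serves a given use case."""
    try:
        return USE_CASE_PATHWAY[use_case]
    except KeyError:
        raise ValueError(f"Unknown use case: {use_case}") from None
```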
Architecture¶
flowchart TB
    subgraph Backend["Backend API"]
        SS[StylingService]
        CS[CaptioningService]
        GP[GeminiProvider]
        LL[LiteLLM Backend]
    end
    subgraph Compute["Compute Server"]
        VP[VertexProcessor]
        GC[google-genai Client]
    end
    subgraph Google["Google Cloud"]
        GAPI[Gemini API]
        VAI[Vertex AI]
    end
    SS --> GP
    CS --> GP
    GP --> LL
    LL --> GAPI
    VP --> GC
    GC --> VAI
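The Backend API side of the diagram is a chain of delegation: a service calls the provider, and the provider calls an LLM backend. The sketch below shows that layering with a stub backend so it runs without network access. The class names follow the diagram, but every method name and signature (`complete`, `generate`, `caption`) and the `EchoBackend` stub are assumptions made for illustration.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Anything that can turn a model name and a prompt into text."""
    def complete(self, model: str, prompt: str) -> str: ...


class GeminiProvider:
    """Sits between the services and the backend (LiteLLM in the diagram)."""
    def __init__(self, backend: LLMBackend, model: str) -> None:
        self._backend = backend
        self._model = model

    def generate(self, prompt: str) -> str:
        return self._backend.complete(self._model, prompt)


class CaptioningService:
    """Builds a prompt and delegates the call to the provider."""
    def __init__(self, provider: GeminiProvider) -> None:
        self._provider = provider

    def caption(self, product_name: str) -> str:
        prompt = f"Describe this product for a catalog: {product_name}"
        return self._provider.generate(prompt)


class EchoBackend:
    """Stub backend for local testing; echoes the request instead of calling Google."""
    def complete(self, model: str, prompt: str) -> str:
        return f"[{model}] {prompt}"


service = CaptioningService(GeminiProvider(EchoBackend(), "gemini-2.5-flash"))
print(service.caption("linen shirt"))
```

Keeping the backend behind a small interface like this is one way to let services be tested without real credentials.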
Models¶
Gemini API Models¶
| Alias | Model ID | Use Case |
|---|---|---|
| gemini-2.5-flash | gemini-2.5-flash-preview-05-20 | Captioning |
| gemini-2.0-flash | gemini-2.0-flash-latest | Styling |
| gemini-flash-latest | gemini-flash-latest | LLM visibility |
| gemini-1.5-pro | gemini-1.5-pro-latest | Complex analysis |
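The alias table can be expressed as a plain mapping. The sketch below resolves an alias to its model ID and builds the `gemini/<model>` name that LiteLLM uses for Google AI Studio models. The constant and function names are hypothetical, and rejecting unknown aliases is a design choice made here, not documented behavior.

```python
# Alias -> model ID, copied from the table above.
GEMINI_MODEL_ALIASES = {
    "gemini-2.5-flash": "gemini-2.5-flash-preview-05-20",
    "gemini-2.0-flash": "gemini-2.0-flash-latest",
    "gemini-flash-latest": "gemini-flash-latest",
    "gemini-1.5-pro": "gemini-1.5-pro-latest",
}


def resolve_model_id(alias: str) -> str:
    """Translate a short alias into the full model ID."""
    try:
        return GEMINI_MODEL_ALIASES[alias]
    except KeyError:
        raise ValueError(f"Unknown Gemini alias: {alias}") from None


def litellm_model_name(alias: str) -> str:
    """Prefix the resolved ID with LiteLLM's 'gemini/' provider route."""
    return f"gemini/{resolve_model_id(alias)}"


print(litellm_model_name("gemini-2.0-flash"))
```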
Vertex AI Models¶
| Model | Use Case |
|---|---|
| gemini-3-pro-image-preview | Image generation/editing |
| gemini-3-pro-preview | Text-only tasks |
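As a rough illustration of how the two Vertex AI models might be selected and called through the google-genai SDK, consider the sketch below. `pick_vertex_model` and `generate_image` are hypothetical helpers, the `project` and `location` values must come from your own configuration, and the SDK is imported lazily so the selection helper runs without it installed.

```python
from typing import Optional

VERTEX_IMAGE_MODEL = "gemini-3-pro-image-preview"  # image generation/editing
VERTEX_TEXT_MODEL = "gemini-3-pro-preview"         # text-only tasks


def pick_vertex_model(produces_image: bool) -> str:
    """Choose the model from the table based on the expected output."""
    return VERTEX_IMAGE_MODEL if produces_image else VERTEX_TEXT_MODEL


def generate_image(prompt: str, project: str, location: str) -> Optional[bytes]:
    """Request an image from Vertex AI; returns raw image bytes or None."""
    # Lazy imports: only needed when a real request is made.
    from google import genai
    from google.genai import types

    client = genai.Client(vertexai=True, project=project, location=location)
    response = client.models.generate_content(
        model=pick_vertex_model(produces_image=True),
        contents=prompt,
        config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
    )
    # The response may mix text and image parts; return the first image found.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            return part.inline_data.data
    return None
```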
Key Differences¶
Safety Settings¶
Gemini API (stricter):
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]
Vertex AI (permissive for fashion):
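from google.genai import types

OFF = types.HarmBlockThreshold.OFF  # turns blocking off for the category
safety_settings = [
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=OFF),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=OFF),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold=OFF),
    types.SafetySetting(category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, threshold=OFF),
]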
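Both profiles cover the same four harm categories and differ only in the threshold, so they can be derived from one helper. This is a sketch using plain dictionaries in the same shape as the Gemini API example. `HARM_CATEGORIES`, `build_safety_settings`, and the two profile constants are hypothetical names, and whether a given SDK accepts dictionaries in this form should be checked against its documentation.

```python
from typing import Dict, List

# The four categories used by both profiles above.
HARM_CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)


def build_safety_settings(threshold: str) -> List[Dict[str, str]]:
    """Apply one threshold to every harm category."""
    return [{"category": c, "threshold": threshold} for c in HARM_CATEGORIES]


GEMINI_SAFETY = build_safety_settings("BLOCK_MEDIUM_AND_ABOVE")  # stricter profile
VERTEX_SAFETY = build_safety_settings("OFF")                     # permissive profile
```

Deriving both lists from one tuple keeps the category set from drifting between the two services.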
Response Format¶
Gemini API - Structured JSON output, requested by setting response_mime_type to application/json in the generation config.
Vertex AI - Direct image/text output with metadata.
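To make the Gemini API side concrete, here is a minimal sketch of requesting JSON output with the google-generativeai SDK and validating the result. The `caption` key, the `parse_caption` and `request_caption` helpers, and the `GOOGLE_API_KEY` environment variable are assumptions for illustration; the actual response schema used by Sartiq is not shown in this document.

```python
import json
import os
from typing import Any, Dict

# Asks the model to return JSON instead of free-form text.
GENERATION_CONFIG = {"response_mime_type": "application/json"}


def parse_caption(raw_text: str) -> Dict[str, Any]:
    """Decode the model's JSON text and check the (hypothetical) 'caption' key."""
    data = json.loads(raw_text)
    if not isinstance(data, dict) or "caption" not in data:
        raise ValueError("Expected a JSON object with a 'caption' key")
    return data


def request_caption(prompt: str, model_id: str) -> Dict[str, Any]:
    """Send the prompt to the Gemini API and return the parsed JSON object."""
    # Lazy import: only needed when a real request is made.
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel(model_id, generation_config=GENERATION_CONFIG)
    response = model.generate_content(prompt)
    return parse_caption(response.text)
```

Validating the decoded object before using it guards against responses that are valid JSON but not the expected shape.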
Documentation¶
| Section | Description |
|---|---|
| Setup | Configuration for both services |
| Prompting | Best practices for Gemini |
| Examples | Code examples |