Rate Limits

The Compute API implements rate limiting to ensure fair resource allocation and system stability.


Rate Limit Tiers

Rate limits are applied per organization based on subscription tier.

Tier          Requests/min   Concurrent Tasks   Daily Tasks
Starter       60             5                  500
Professional  300            25                 5,000
Enterprise    1,000          100                Unlimited

Rate Limit Headers

All responses include rate limit information:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1705320000
X-RateLimit-Retry-After: 30

Header                   Description
X-RateLimit-Limit        Maximum requests per window
X-RateLimit-Remaining    Requests remaining in window
X-RateLimit-Reset        Unix timestamp when window resets
X-RateLimit-Retry-After  Seconds until retry (when limited)
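These headers make it possible to throttle proactively instead of waiting for a 429. A minimal sketch of the decision logic (the header names are from the table above; the function itself is illustrative, not part of the API):

```python
import time

def seconds_until_reset(headers, now=None):
    """Given rate-limit response headers, return how long to sleep before
    the next request: 0 while quota remains, otherwise until the window
    resets (X-RateLimit-Reset is a Unix timestamp)."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```

Call `time.sleep(seconds_until_reset(response.headers))` after each request to pause only when the window is exhausted.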

Rate Limit Response

When rate limited, the API returns:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 30

{
  "detail": "Rate limit exceeded",
  "retry_after": 30
}

Endpoint-Specific Limits

Some endpoints have additional limits:

Endpoint                 Limit     Window
POST /tasks/generation   30/min    Per organization
POST /training/          5/hour    Per organization
POST /workflows/         20/min    Per organization
GET /monitoring/*        120/min   Per user
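To stay under a per-endpoint limit such as 30/min on the client side, a simple token bucket paces outgoing requests. This is a client-side sketch, not part of the API:

```python
import time

class TokenBucket:
    """Client-side limiter: allow at most `rate` requests per `per` seconds."""

    def __init__(self, rate, per=60.0):
        self.capacity = float(rate)
        self.tokens = float(rate)
        self.fill_rate = rate / per
        self.last = time.monotonic()

    def acquire(self):
        """Consume one token; return the seconds the caller should sleep first."""
        now = time.monotonic()
        # Refill proportionally to the time elapsed since the last call.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        wait = 0.0 if self.tokens >= 1 else (1 - self.tokens) / self.fill_rate
        self.tokens -= 1  # may dip below zero while the caller sleeps out `wait`
        return wait
```

Usage: create one bucket per endpoint (e.g. `TokenBucket(rate=30)` for generation) and call `time.sleep(bucket.acquire())` before each request.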

Concurrent Task Limits

Beyond request rate limits, there are limits on concurrent running tasks:

  Generation:  10 concurrent
  Training:    2 concurrent
  Workflows:   5 concurrent

When at capacity, new task submissions are queued (not rejected).

Queue Behavior

Scenario     Behavior
Under limit  Task starts immediately
At limit     Task queued, starts when a slot becomes available
Queue full   503 Service Unavailable
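Because a full queue is rejected with a 503 rather than queued, submissions can be wrapped in a retry loop with exponential backoff. A minimal sketch, where `send` is any zero-argument function that performs the POST (for example a `lambda` around an `httpx.post` call):

```python
import time

def submit_with_queue_retry(send, max_attempts=5, base_delay=1.0):
    """Call `send()` (any function returning an object with .status_code),
    retrying with exponential backoff while the task queue is full (503)."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 503:
            return response
        # Queue full: wait base_delay * 1, 2, 4, ... seconds before resubmitting.
        time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("Queue remained full after retries")
```

Note this only handles queue-full rejections; a 429 from the request rate limiter should be handled separately, as shown under Best Practices below.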

Best Practices

Handling Rate Limits

import time
import httpx

def make_request_with_retry(url, headers, max_retries=3):
    for attempt in range(max_retries):
        response = httpx.get(url, headers=headers)

        if response.status_code == 429:
            retry_after = int(response.headers.get("Retry-After", 30))
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue

        return response

    raise RuntimeError("Max retries exceeded")

The same pattern in TypeScript:

async function makeRequestWithRetry(
  url: string,
  headers: HeadersInit,
  maxRetries = 3
): Promise<Response> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers });

    if (response.status === 429) {
      const retryAfter = parseInt(
        response.headers.get("Retry-After") || "30",
        10
      );
      console.log(`Rate limited. Waiting ${retryAfter}s...`);
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
      continue;
    }

    return response;
  }

  throw new Error("Max retries exceeded");
}
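The helpers above sleep for exactly the server-supplied Retry-After. When that header is missing, or when many clients are rate limited at once and would otherwise retry in lockstep, exponential backoff with random jitter spreads the retries out. A sketch of the delay calculation (illustrative, not an API feature):

```python
import random

def backoff_delay(attempt, retry_after=None, cap=60.0):
    """Delay in seconds before retry number `attempt` (0-based): honor the
    server's Retry-After when present, otherwise use exponential backoff
    with full jitter, capped at `cap` seconds."""
    if retry_after is not None:
        return float(retry_after)
    return random.uniform(0, min(cap, 2 ** attempt))
```

Replace the fixed `time.sleep(retry_after)` in the retry loop with `time.sleep(backoff_delay(attempt, retry_after))`, passing `retry_after=None` when the header is absent.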

Batch Operations

Instead of many individual requests, use batch endpoints where available:

# Instead of multiple single requests
curl -X POST .../tasks/generation -d '{"image": "img1.jpg"}'
curl -X POST .../tasks/generation -d '{"image": "img2.jpg"}'
curl -X POST .../tasks/generation -d '{"image": "img3.jpg"}'

# Use batch endpoint
curl -X POST .../tasks/generation/batch -d '{
  "tasks": [
    {"image": "img1.jpg"},
    {"image": "img2.jpg"},
    {"image": "img3.jpg"}
  ]
}'

Monitor Usage

Check your current usage via the monitoring endpoints:

curl https://compute-api.sartiq.com/api/v1/monitoring/stats \
  -H "Authorization: Bearer $TOKEN"

Quota Management

View Current Quota

curl https://compute-api.sartiq.com/api/v1/monitoring/quota \
  -H "Authorization: Bearer $TOKEN"

Example response:

{
  "tier": "professional",
  "daily_tasks": {
    "limit": 5000,
    "used": 1234,
    "remaining": 3766
  },
  "concurrent_tasks": {
    "limit": 25,
    "active": 8
  },
  "resets_at": "2024-01-16T00:00:00Z"
}
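The quota response can also drive client-side pacing: spread the remaining daily tasks evenly over the time left until the reset. A sketch using the field names from the response above:

```python
from datetime import datetime

def max_task_rate(quota, now):
    """Return the highest sustainable submission rate (tasks per second)
    that spreads the remaining daily quota evenly until `resets_at`."""
    remaining = quota["daily_tasks"]["remaining"]
    # fromisoformat() on older Pythons rejects a trailing "Z", so normalize it.
    resets_at = datetime.fromisoformat(quota["resets_at"].replace("Z", "+00:00"))
    seconds_left = max(1.0, (resets_at - now).total_seconds())
    return remaining / seconds_left
```

For example, 3,766 remaining tasks one hour before reset yields roughly one task per second; submitting faster than that risks exhausting the daily quota early.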

Quota Exceeded

When daily quota is exceeded:

{
  "detail": "Daily task quota exceeded",
  "quota_resets_at": "2024-01-16T00:00:00Z"
}

Status: 403 Forbidden


Requesting Limit Increases

For limit increases, contact support with:

  1. Current organization ID
  2. Requested limits
  3. Use case justification
  4. Expected usage patterns

Enterprise customers can negotiate custom limits.