Rate Limits - AI CSR API

Current Rate Limits

Rate limits protect against abuse and ensure system stability.

Endpoint	Limit	Identifier	Window
`/api/v1/*` (All API endpoints)	1,000 requests	Per API key	Per hour
`/api/health` (Health check)	60 requests	Per IP address	Per minute

How It Works

API Endpoints (/api/v1/*): Rate limited by API key after authentication
Health Check (/api/health): Rate limited by IP address before any processing
Algorithm: Sliding window (usage is measured over a rolling time period rather than fixed clock-aligned intervals)
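
To make the sliding-window idea concrete, here is a minimal client-side sketch in Python. It illustrates the general technique only and is not the service's actual implementation; the class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` events in any trailing `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.timestamps = deque()

    def allow(self):
        """Record the event and return True if it fits in the window."""
        now = self.clock()
        # Drop events that have slid out of the trailing window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False


# Mirrors the documented API-key limit: 1,000 requests per hour.
limiter = SlidingWindowLimiter(limit=1000, window_seconds=3600)
```

Calling `limiter.allow()` before each request gives a local estimate of whether the call would fit in the window; the server's own count remains authoritative.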

Rate Limit Headers

Every API response includes rate limit information in the headers:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 995
X-RateLimit-Reset: 1640000000

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the window
`X-RateLimit-Remaining`	Requests remaining in current window
`X-RateLimit-Reset`	Unix timestamp when the limit resets
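
As a rough sketch of how a client might read these three headers, the Python below parses them into a small record. The names `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical helpers, not part of the API, and the code assumes the header values are plain integers as in the example above.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_epoch: int  # Unix timestamp from X-RateLimit-Reset

    def seconds_until_reset(self, now=None):
        """Seconds left before the limit resets (never negative)."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_epoch - now)


def parse_rate_limit_headers(headers):
    """Build RateLimitInfo from a mapping of response headers.

    Returns None if any of the three headers is missing or not an integer.
    """
    try:
        return RateLimitInfo(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_epoch=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError, TypeError):
        return None
```

With the `requests` library, `response.headers` is a case-insensitive mapping and can be passed in directly; a plain `dict` must use the exact header spelling shown.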

Rate Limit Exceeded Response

When you exceed the rate limit, you’ll receive a 429 status code:

{
  "success": false,
  "data": null,
  "error": "Rate limit exceeded. Please try again later.",
  "limit": 1000,
  "remaining": 0,
  "reset": "2025-10-29T16:30:00.000Z"
}

The response also includes a Retry-After header with seconds until you can retry:

Retry-After: 3600
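
One possible way to turn a 429 response into a wait time is sketched below in Python. It assumes `Retry-After` carries a number of seconds, as in the example, and falls back to the ISO 8601 `reset` field of the JSON body. The function name `retry_delay_seconds` and the 60-second `default` are assumptions made for illustration.

```python
from datetime import datetime, timezone


def retry_delay_seconds(headers, body=None, now=None, default=60.0):
    """Pick how long to wait after a 429.

    Prefers the Retry-After header (seconds), then the `reset` timestamp
    in the JSON body, and otherwise returns `default`.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; try the body instead

    reset = (body or {}).get("reset")
    if reset:
        try:
            # Older Python versions do not accept a trailing "Z".
            reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
        except ValueError:
            return default
        if reset_at.tzinfo is None:
            reset_at = reset_at.replace(tzinfo=timezone.utc)  # assume UTC
        now = now or datetime.now(timezone.utc)
        return max(0.0, (reset_at - now).total_seconds())

    return default
```

The HTTP specification also allows `Retry-After` to be a date; this sketch does not handle that form and would fall through to the body or the default.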

Best Practices

1. Monitor Rate Limit Headers

JavaScript

const response = await fetch('https://YOUR_DEPLOYMENT.lupitor.com/api/v1/leads', {
  headers: { 'x-api-key': 'YOUR_API_KEY' }
});

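// Header values are strings (or null if absent), so parse them before comparing
const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10); // Unix seconds

console.log(`${remaining}/${limit} requests remaining`);

if (remaining < 10) {
  const resetAt = new Date(reset * 1000).toLocaleString();
  console.warn(`Approaching rate limit! Resets at ${resetAt}`);
}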

2. Implement Exponential Backoff

Python

import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=5):
    """POST with retries, doubling the wait after each 429 response."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)

        if response.status_code != 429:
            return response

        # No point sleeping after the final attempt
        if attempt == max_retries - 1:
            break

        # Calculate wait time: 2^attempt seconds (1, 2, 4, 8, ...)
        wait_time = 2 ** attempt
        print(f"Rate limited. Waiting {wait_time} seconds...")
        time.sleep(wait_time)

    raise RuntimeError("Max retries exceeded")
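
When many clients retry on the same fixed schedule, their retries can arrive in lockstep. A common refinement is capped exponential backoff with random jitter, sketched below. This is a general technique rather than something the API requires; `backoff_delay`, its `base` and `cap` defaults, and the injectable `rng` are hypothetical.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The upper bound doubles each attempt (base * 2**attempt) up to `cap`;
    the delay is the upper bound scaled by a random factor from `rng`.
    """
    upper = min(cap, base * (2 ** attempt))
    return rng() * upper
```

Passing the result to `time.sleep()` in place of a fixed `2 ** attempt` wait spreads retries out while keeping the same growth pattern.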

3. Respect Rate Limit Reset Time

Python

import time

def wait_for_rate_limit_reset(response):
    """Wait until rate limit resets based on header"""
    if response.status_code == 429:
        reset_time = int(response.headers.get('X-RateLimit-Reset', 0))
        current_time = int(time.time())

        if reset_time > current_time:
            wait_seconds = reset_time - current_time + 1
            print(f"Rate limit exceeded. Waiting {wait_seconds}s...")
            time.sleep(wait_seconds)
            return True

    return False

4. Batch Requests Efficiently

Instead of making 1,000 individual requests, use the bulk endpoint:

JavaScript

// ❌ This will hit rate limits quickly
for (const lead of leads) {
  await fetch('https://YOUR_DEPLOYMENT.lupitor.com/api/v1/leads', {
    method: 'POST',
    headers: { 'x-api-key': 'YOUR_API_KEY', 'Content-Type': 'application/json' },
    body: JSON.stringify(lead)
  });
}

// ✅ Much more efficient
const response = await fetch('https://YOUR_DEPLOYMENT.lupitor.com/api/v1/leads/bulk', {
  method: 'POST',
  headers: { 'x-api-key': 'YOUR_API_KEY', 'Content-Type': 'application/json' },
  body: JSON.stringify({
    campaignId: 'k978...',
    leads: leads  // Up to 1000 at once
  })
});
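
If you have more leads than fit in one bulk call, they need to be split into batches first. The Python sketch below shows one way to do that, assuming the 1,000-lead ceiling mentioned above and the `campaignId`/`leads` body shape from the example. The helper names `chunked` and `build_bulk_payloads` are hypothetical.

```python
def chunked(items, size=1000):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bulk_payloads(leads, campaign_id, batch_size=1000):
    """Build one request body per batch for the bulk endpoint."""
    return [
        {"campaignId": campaign_id, "leads": batch}
        for batch in chunked(leads, batch_size)
    ]
```

Each returned dictionary can then be sent as the JSON body of one bulk request, so 2,500 leads become three requests instead of 2,500.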

5. Distribute Load Over Time

Python

import time

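def distribute_requests(items, make_request, requests_per_minute=15):
    """Spread requests evenly to avoid bursts.

    make_request is your own function that sends one API call per item.
    The default of 15/min keeps a steady caller under 1,000 requests/hour.
    """
    delay = 60 / requests_per_minute  # seconds between requests

    for item in items:
        make_request(item)
        time.sleep(delay)  # 4 seconds at 15/min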

Advanced Rate Limiting Strategies

Request Queue with Rate Limiting

JavaScript

class RateLimitedQueue {
  constructor(requestsPerMinute = 100) {
    this.queue = [];
    this.processing = false;
    this.requestsPerMinute = requestsPerMinute;
    this.delay = 60000 / requestsPerMinute; // ms between requests
  }

  async add(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;

    while (this.queue.length > 0) {
      const { requestFn, resolve, reject } = this.queue.shift();

      try {
        const result = await requestFn();
        resolve(result);
      } catch (error) {
        if (error.status === 429) {
          // Re-queue on rate limit
          this.queue.unshift({ requestFn, resolve, reject });
          await new Promise(r => setTimeout(r, 60000)); // Wait 1 minute
          continue;
        }
        reject(error);
      }

      // Wait before next request
      if (this.queue.length > 0) {
        await new Promise(r => setTimeout(r, this.delay));
      }
    }

    this.processing = false;
  }
}

// Usage
const queue = new RateLimitedQueue(15); // about 900/hour, under the 1,000/hour limit

for (const lead of leads) {
  queue.add(() => createLead(lead));
}

Need Higher Limits?

If your use case requires higher rate limits:

Assess Your Needs: Document your expected request volume and use case, then email support@lupitor.com with your requirements.
Review Options: We’ll work with you to find the right limits for your needs.
Upgrade: Higher limits may be available on enterprise plans.

Common Scenarios

High-Volume Lead Import

If importing thousands of leads:

Use the bulk endpoint (POST /leads/bulk): up to 1,000 leads per request
Batch your imports: 10 bulk requests per minute = 10,000 leads/min
Run during off-hours if possible
Monitor progress and handle errors gracefully
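
The arithmetic behind these figures can be checked with a small Python helper. This is only a planning sketch; `plan_import` is a hypothetical name, and its defaults simply restate the 1,000 leads per request and 10 bulk requests per minute used above.

```python
import math


def plan_import(total_leads, leads_per_request=1000, requests_per_minute=10):
    """Estimate bulk requests and wall-clock minutes for an import."""
    requests_needed = math.ceil(total_leads / leads_per_request)
    minutes = requests_needed / requests_per_minute
    return requests_needed, minutes


print(plan_import(250_000))  # (250, 25.0)
```

In this example, 250,000 leads take 250 bulk requests over roughly 25 minutes, which stays well inside a 1,000 requests per hour allowance.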

Real-Time Integrations

For webhook-triggered lead creation:

Queue incoming webhooks instead of immediate API calls
Process the queue at a controlled rate just under your limit (about 90-95% of it)
Leave headroom for retries and other operations
Implement circuit breakers for cascading failures
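
A circuit breaker, as mentioned in the last point, can be sketched in a few lines of Python. This is a simplified illustration of the pattern, not a production implementation; the class name `CircuitBreaker`, the thresholds, and the injectable `clock` are assumptions.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has passed."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        """Return True if a call may be attempted right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open).
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self):
        """Close the circuit and clear the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """Count a failure and open the circuit once the threshold is hit."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

A queue worker would check `allow_request()` before each API call and report the outcome with `record_success()` or `record_failure()`, leaving webhooks queued while the circuit is open.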

Analytics Dashboards

For dashboards that query the API:

Cache responses for at least 1 minute
Aggregate data on the backend
Use webhooks for real-time updates instead of polling
Implement request debouncing for user interactions
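
The one-minute caching suggestion might look like the following Python sketch. `TTLCache` and `get_or_fetch` are hypothetical names, and the `fetch` callable stands in for whatever function calls the API on a cache miss.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With this in place, a dashboard that refreshes every few seconds triggers at most one API call per key per minute.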

Monitoring Rate Limits

Track your usage to avoid surprises:

Python

import time

class RateLimitMonitor:
    def __init__(self):
        self.requests = []

    def log_request(self, response):
        self.requests.append({
            'timestamp': time.time(),
            'remaining': int(response.headers.get('X-RateLimit-Remaining', 0)),
            'limit': int(response.headers.get('X-RateLimit-Limit', 1000))
        })

        # Keep only last hour
        cutoff = time.time() - 3600
        self.requests = [r for r in self.requests if r['timestamp'] > cutoff]

    def get_usage_stats(self):
        if not self.requests:
            return None

        recent = [r for r in self.requests if r['timestamp'] > time.time() - 60]
        return {
            'requests_last_minute': len(recent),
            'current_remaining': self.requests[-1]['remaining'],
            'current_limit': self.requests[-1]['limit']
        }
