API Development · 11 min read

Rate Limiting Strategies for Payment APIs


Osmoto Team

Senior Software Engineer

January 19, 2026

Payment API failures during high-traffic periods can cost businesses thousands in lost revenue within minutes. When your checkout flow hits Stripe's rate limits during a flash sale or product launch, customers abandon their carts and your conversion rates plummet. The difference between a successful high-traffic event and a costly outage often comes down to how well you've implemented rate limiting strategies.

Rate limiting isn't just about staying within API quotas—it's about building resilient payment systems that gracefully handle traffic spikes while maintaining optimal performance. This guide covers practical rate limiting strategies specifically for payment APIs, with real-world implementation examples and the edge cases that can break your payment flow when you least expect it.

Understanding Payment API Rate Limits

Stripe's Rate Limiting Model

Stripe implements a sophisticated rate limiting system that goes beyond simple request-per-second caps. Their limits operate on multiple dimensions:

// Stripe's rate limits (as of 2024)
const STRIPE_LIMITS = {
  liveMode: {
    reads: 100,    // requests per second
    writes: 100,   // requests per second
    connects: 25   // for Connect platforms
  },
  testMode: {
    reads: 25,
    writes: 25,
    connects: 25
  }
};

The key insight is that Stripe differentiates between read operations (retrieving customers, payments) and write operations (creating charges, updating subscriptions). This matters because your rate limiting strategy should account for the operation mix in your application.
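Because reads and writes draw from separate budgets, a client-side limiter can mirror that split rather than counting all requests against one pool. Here is a minimal sketch; the class, method names, and the read/write classification are illustrative, not part of Stripe's SDK:

```typescript
type OperationKind = 'read' | 'write';

// Tracks reads and writes against separate sliding-window budgets.
class OperationMixLimiter {
  private timestamps: Record<OperationKind, number[]> = { read: [], write: [] };

  constructor(
    private limits: Record<OperationKind, number> = { read: 100, write: 100 },
    private windowMs = 1000
  ) {}

  // Returns true and records the operation if it fits in the current window.
  tryAcquire(kind: OperationKind, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Drop timestamps that have fallen out of the window
    this.timestamps[kind] = this.timestamps[kind].filter(ts => ts > windowStart);
    if (this.timestamps[kind].length >= this.limits[kind]) {
      return false;
    }
    this.timestamps[kind].push(now);
    return true;
  }
}
```

The payoff is that a burst of customer lookups (reads) cannot starve charge creation (writes), because each class exhausts only its own budget.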

Beyond Simple Request Counting

Payment APIs often implement burst allowances and sliding windows. Stripe allows brief bursts above the sustained rate, which means you can handle 150 requests in a single second as long as your average over a longer period stays within limits.

interface RateLimitWindow {
  sustainedRate: number; // 100 RPS average
  burstCapacity: number; // 150 RPS peak
  windowSize: number;    // 10 second sliding window
}

This burst capacity is crucial for payment flows because checkout processes often generate multiple API calls in rapid succession—creating a customer, setting up a payment method, and processing the charge.
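This burst-plus-sustained model behaves like a token bucket: the bucket holds up to the burst capacity and refills at the sustained rate, so a checkout's rapid-fire calls drain the bucket briefly without violating the long-run average. A minimal sketch, using the illustrative numbers from this section rather than published Stripe constants:

```typescript
// Token bucket: capacity = burst peak, refill rate = sustained limit.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private sustainedRate = 100, // tokens added per second
    private burstCapacity = 150, // maximum bucket size
    now: number = Date.now()
  ) {
    this.tokens = burstCapacity;
    this.lastRefill = now;
  }

  // Returns true and spends one token if one is available.
  tryConsume(now: number = Date.now()): boolean {
    // Refill proportionally to elapsed time, capped at burst capacity
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.burstCapacity,
      this.tokens + elapsedSec * this.sustainedRate
    );
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```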

Client-Side Rate Limiting Implementation

Request Queue with Exponential Backoff

The most effective client-side strategy combines request queuing with intelligent retry logic:

interface RequestOptions {
  priority?: 'high' | 'normal' | 'low';
  [key: string]: unknown;
}

interface QueuedRequest {
  endpoint: string;
  options: RequestOptions;
  resolve: (value: any) => void;
  reject: (reason?: any) => void;
  retryCount: number;
  priority: 'high' | 'normal' | 'low';
}

class PaymentAPIClient {
  private queue: Array<QueuedRequest> = [];
  private processing = false;
  private requestTimestamps: number[] = [];

  async makeRequest<T>(
    endpoint: string,
    options: RequestOptions
  ): Promise<T> {
    return new Promise((resolve, reject) => {
      this.queue.push({
        endpoint,
        options,
        resolve,
        reject,
        retryCount: 0,
        priority: options.priority || 'normal'
      });
      this.processQueue();
    });
  }

  private async processQueue() {
    if (this.processing) return;
    this.processing = true;

    while (this.queue.length > 0) {
      // Sort by priority (payment creation > customer lookup)
      this.queue.sort((a, b) => {
        const priorities = { high: 3, normal: 2, low: 1 };
        return priorities[b.priority] - priorities[a.priority];
      });

      const request = this.queue.shift()!;

      try {
        await this.waitForRateLimit();
        const response = await this.executeRequest(request); // actual HTTP call
        request.resolve(response);
      } catch (error) {
        if (this.isRateLimitError(error) && request.retryCount < 3) {
          request.retryCount++;
          const delay = Math.min(1000 * Math.pow(2, request.retryCount), 10000);
          setTimeout(() => {
            this.queue.unshift(request); // Retry at front of queue
            this.processQueue();         // Restart the loop if it has drained
          }, delay);
        } else {
          request.reject(error);
        }
      }
    }

    this.processing = false;
  }

  private async waitForRateLimit(): Promise<void> {
    const now = Date.now();
    const windowStart = now - 1000; // 1 second window

    // Drop timestamps that have fallen out of the window
    this.requestTimestamps = this.requestTimestamps.filter(
      ts => ts >= windowStart
    );

    if (this.requestTimestamps.length >= 95) { // Leave a buffer below 100 RPS
      const oldestRequest = this.requestTimestamps[0];
      const waitTime = 1000 - (now - oldestRequest);
      await new Promise(resolve => setTimeout(resolve, waitTime));
    }

    this.requestTimestamps.push(Date.now());
  }

  private isRateLimitError(error: any): boolean {
    // Stripe signals rate limiting with HTTP 429
    return error?.statusCode === 429;
  }
}

Priority-Based Request Handling

Not all payment API calls are equally critical. Implement priority levels that ensure payment processing takes precedence over administrative operations:

enum RequestPriority {
  CRITICAL = 'critical', // Payment creation, confirmation
  HIGH = 'high',         // Customer creation during checkout
  NORMAL = 'normal',     // General API calls
  LOW = 'low'            // Analytics, reporting
}

const PRIORITY_WEIGHTS = {
  [RequestPriority.CRITICAL]: 1000,
  [RequestPriority.HIGH]: 100,
  [RequestPriority.NORMAL]: 10,
  [RequestPriority.LOW]: 1
};

This ensures that during rate limit pressure, customer-facing payment operations continue while background processes queue appropriately.

Server-Side Rate Limiting Strategies

Distributed Rate Limiting with Redis

For applications handling significant payment volume, implement distributed rate limiting using Redis:

import Redis from 'ioredis';

class DistributedRateLimiter {
  private redis: Redis;

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl);
  }

  async checkRateLimit(
    key: string,
    limit: number,
    windowMs: number
  ): Promise<{ allowed: boolean; resetTime: number; remaining: number }> {
    const script = `
      local key = KEYS[1]
      local limit = tonumber(ARGV[1])
      local window = tonumber(ARGV[2])
      local now = tonumber(ARGV[3])

      -- Clean expired entries
      redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

      -- Count current requests
      local current = redis.call('ZCARD', key)

      if current < limit then
        -- Add current request
        redis.call('ZADD', key, now, now)
        redis.call('EXPIRE', key, math.ceil(window / 1000))
        return {1, now + window - (current * (window / limit)), limit - current - 1}
      else
        -- Calculate reset time from the oldest entry in the window
        local oldest = tonumber(redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')[2])
        return {0, oldest + window, 0}
      end
    `;

    const result = await this.redis.eval(
      script,
      1,
      key,
      limit.toString(),
      windowMs.toString(),
      Date.now().toString()
    ) as [number, number, number];

    return {
      allowed: result[0] === 1,
      resetTime: result[1],
      remaining: result[2]
    };
  }
}

// Usage in a payment endpoint
app.post('/api/payments', async (req, res) => {
  const rateLimitKey = `payment_api:${req.ip}`;
  const { allowed, resetTime, remaining } = await rateLimiter.checkRateLimit(
    rateLimitKey,
    100,  // 100 requests
    60000 // per minute
  );

  if (!allowed) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      resetTime,
      retryAfter: Math.ceil((resetTime - Date.now()) / 1000)
    });
  }

  res.set('X-RateLimit-Remaining', remaining.toString());
  res.set('X-RateLimit-Reset', new Date(resetTime).toISOString());

  // Process payment...
});

Adaptive Rate Limiting Based on API Response

Implement dynamic rate limiting that adjusts based on upstream API responses:

class AdaptiveRateLimiter {
  private currentLimit = 100; // Start with Stripe's default
  private errorWindow: number[] = [];
  private readonly maxErrors = 5;
  private readonly windowMs = 60000;

  async processRequest<T>(request: () => Promise<T>): Promise<T> {
    try {
      const result = await request();
      this.onSuccess();
      return result;
    } catch (error) {
      if (this.isRateLimitError(error)) {
        await this.onRateLimit(error);
      }
      throw error;
    }
  }

  private onSuccess() {
    // Gradually increase the limit while requests keep succeeding
    if (this.currentLimit < 100) {
      this.currentLimit = Math.min(this.currentLimit + 1, 100);
    }
  }

  private async onRateLimit(error: any): Promise<void> {
    const now = Date.now();

    // Record the error in the sliding window
    this.errorWindow.push(now);
    this.errorWindow = this.errorWindow.filter(
      time => now - time < this.windowMs
    );

    // Halve the limit if we're hitting rate limits frequently
    if (this.errorWindow.length >= this.maxErrors) {
      this.currentLimit = Math.max(this.currentLimit * 0.5, 10);
      this.errorWindow = []; // Reset window
    }

    // Honor the Retry-After header if the API provides one
    const retryAfter = error.headers?.['retry-after'];
    if (retryAfter) {
      const delay = parseInt(retryAfter, 10) * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }

  private isRateLimitError(error: any): boolean {
    return error?.statusCode === 429;
  }

  getCurrentLimit(): number {
    return Math.floor(this.currentLimit);
  }
}

Handling Webhook Rate Limits

Webhooks present unique rate limiting challenges because they're initiated by the payment provider, not your application. However, your webhook processing can still hit rate limits when making API calls back to Stripe.

Webhook Queue with Dead Letter Processing

import { Queue, Worker } from 'bullmq';

class WebhookProcessor {
  private webhookQueue: Queue;
  private deadLetterQueue: Queue;

  constructor(private redisConnection: any) {
    this.webhookQueue = new Queue('webhook-processing', {
      connection: redisConnection,
      defaultJobOptions: {
        attempts: 3,
        backoff: {
          type: 'exponential',
          delay: 2000,
        },
        removeOnComplete: 100,
        removeOnFail: 50,
      }
    });

    this.deadLetterQueue = new Queue('webhook-dead-letter', {
      connection: redisConnection
    });

    this.setupWorker();
  }

  async processWebhook(event: Stripe.Event) {
    await this.webhookQueue.add('process-webhook', {
      eventId: event.id,
      type: event.type,
      data: event.data,
      created: event.created
    }, {
      priority: this.getEventPriority(event.type),
      delay: this.calculateDelay(event.type)
    });
  }

  private setupWorker() {
    const worker = new Worker('webhook-processing', async (job) => {
      const { eventId, type, data } = job.data;

      try {
        await this.processEventWithRateLimit(type, data);
      } catch (error) {
        if (this.isRateLimitError(error)) {
          // Throwing re-queues the job using the exponential backoff above
          throw new Error(`Rate limited: ${error.message}`);
        }

        // Move to dead letter queue for manual inspection
        await this.deadLetterQueue.add('failed-webhook', {
          originalJob: job.data,
          error: error.message,
          failedAt: new Date().toISOString()
        });
        throw error;
      }
    }, {
      connection: this.redisConnection,
      concurrency: 5,   // Process 5 webhooks concurrently
      limiter: {
        max: 50,        // Maximum 50 jobs per interval
        duration: 60000 // Per minute
      }
    });
  }

  private getEventPriority(eventType: string): number {
    // Note: BullMQ treats lower numbers as higher priority
    const priorities: Record<string, number> = {
      'payment_intent.succeeded': 1,
      'invoice.payment_failed': 2,
      'customer.subscription.deleted': 3,
      'invoice.created': 10,
    };
    return priorities[eventType] ?? 5;
  }

  private calculateDelay(eventType: string): number {
    // Delay non-critical events to spread load
    const nonCritical = [
      'invoice.created',
      'customer.updated',
      'payment_method.attached'
    ];
    return nonCritical.includes(eventType)
      ? Math.random() * 30000 // Random delay up to 30s
      : 0;
  }
}

Common Pitfalls and Edge Cases

The Compound Rate Limit Problem

One subtle issue occurs when your application makes multiple API calls for a single user action. Consider a subscription upgrade flow:

// This pattern can quickly exhaust rate limits
async function upgradeSubscription(customerId: string, newPriceId: string) {
  // Call 1: Retrieve customer
  const customer = await stripe.customers.retrieve(customerId);

  // Call 2: Retrieve current subscription
  const subscriptions = await stripe.subscriptions.list({
    customer: customerId,
    status: 'active'
  });

  // Call 3: Update subscription
  const subscription = await stripe.subscriptions.update(
    subscriptions.data[0].id,
    { items: [{ price: newPriceId }] }
  );

  // Call 4: Retrieve updated subscription with expanded data
  return await stripe.subscriptions.retrieve(subscription.id, {
    expand: ['latest_invoice', 'customer']
  });
}

This function makes 4 API calls for a single user action. During high traffic, these compound calls can quickly exhaust your rate limit budget.

Solution: Batch Operations and Caching

class OptimizedSubscriptionService {
  private customerCache = new Map<string, { data: Stripe.Customer; expires: number }>();

  async upgradeSubscription(customerId: string, newPriceId: string) {
    // Use cached customer data if available
    let customer = this.getCachedCustomer(customerId);
    if (!customer) {
      customer = await stripe.customers.retrieve(customerId);
      this.cacheCustomer(customerId, customer);
    }

    // Only fetch the single active subscription we need
    const subscriptions = await stripe.subscriptions.list({
      customer: customerId,
      status: 'active',
      limit: 1
    });

    if (subscriptions.data.length === 0) {
      throw new Error('No active subscription found');
    }

    // Update and retrieve in one call using expand
    return await stripe.subscriptions.update(
      subscriptions.data[0].id,
      {
        items: [{ price: newPriceId }],
        expand: ['latest_invoice', 'customer'] // Get expanded data immediately
      }
    );
  }

  private getCachedCustomer(customerId: string): Stripe.Customer | null {
    const cached = this.customerCache.get(customerId);
    if (cached && cached.expires > Date.now()) {
      return cached.data;
    }
    return null;
  }

  private cacheCustomer(customerId: string, customer: Stripe.Customer) {
    this.customerCache.set(customerId, {
      data: customer,
      expires: Date.now() + 300000 // Cache for 5 minutes
    });
  }
}

Rate Limit Inheritance in Multi-Tenant Applications

In SaaS applications using Stripe Connect, rate limits can be inherited in unexpected ways:

// Problematic: all requests count against your platform's rate limit
const payment = await stripe.paymentIntents.create({
  amount: 2000,
  currency: 'usd',
  application_fee_amount: 200,
}, {
  stripeAccount: 'acct_connected_account_id' // Connected account
});

Even though you're creating a payment for a connected account, this request counts against your platform's rate limit, not the connected account's limit.

Solution: Distribute Load Across Accounts

class ConnectRateLimiter {
  private accountLimits = new Map<string, RateLimitTracker>();

  async createPayment(
    paymentData: PaymentIntentCreateParams,
    connectedAccountId: string
  ) {
    const accountTracker = this.getAccountTracker(connectedAccountId);

    // Check if the connected account has capacity
    if (await accountTracker.hasCapacity()) {
      // Use the connected account's rate limit
      return await stripe.paymentIntents.create(paymentData, {
        stripeAccount: connectedAccountId
      });
    } else {
      // Fall back to the platform account with queuing
      return await this.queueForPlatformAccount(paymentData);
    }
  }

  private getAccountTracker(accountId: string): RateLimitTracker {
    if (!this.accountLimits.has(accountId)) {
      this.accountLimits.set(accountId, new RateLimitTracker(100)); // 100 RPS per account
    }
    return this.accountLimits.get(accountId)!;
  }
}

Best Practices Summary

Rate Limiting Checklist

  • Implement client-side queuing with exponential backoff for all payment API calls
  • Use priority-based request handling to ensure critical payment operations proceed first
  • Cache frequently accessed data (customers, products) to reduce API calls
  • Monitor rate limit headers and adjust your request patterns dynamically
  • Implement circuit breakers to prevent cascade failures during rate limit events
  • Use webhook queuing with dead letter processing for reliable event handling
  • Test rate limit scenarios in staging environments before high-traffic events
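The circuit breaker item above is the one pattern this guide hasn't shown in code. A minimal sketch as a small state machine; the threshold and cooldown values are illustrative defaults, not Stripe recommendations:

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

// Opens after repeated failures, then probes with a single request
// once the cooldown elapses.
class CircuitBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 5, // consecutive failures before opening
    private cooldownMs = 30000    // how long to stay open
  ) {}

  canRequest(now: number = Date.now()): boolean {
    if (this.state === 'open' && now - this.openedAt >= this.cooldownMs) {
      this.state = 'half-open'; // allow a single probe request
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(now: number = Date.now()): void {
    this.failures++;
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}
```

Wrapping payment API calls in a breaker like this stops a rate-limited upstream from being hammered further, which is exactly the cascade-failure scenario the checklist warns about.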

Monitoring and Alerting

Set up monitoring for rate limit metrics:

const rateLimitMetrics = {
  requestsPerSecond: new Counter('payment_api_requests_per_second'),
  rateLimitHits: new Counter('payment_api_rate_limit_hits'),
  queueDepth: new Gauge('payment_api_queue_depth'),
  averageResponseTime: new Histogram('payment_api_response_time')
};

// Alert when approaching 80% of the rate limit
if (currentRPS > STRIPE_LIMITS.liveMode.writes * 0.8) {
  alerting.send('Approaching Stripe rate limit', {
    currentRPS,
    limit: STRIPE_LIMITS.liveMode.writes,
    queueDepth: paymentQueue.length
  });
}

Effective rate limiting for payment APIs requires more than just counting requests—it demands understanding your application's payment flow patterns, implementing intelligent queuing strategies, and preparing for the edge cases that can break your system during critical moments. The strategies outlined here will help you build payment systems that remain responsive and reliable even under extreme load.

If you're implementing a new Stripe integration or need to audit your existing rate limiting approach, our Stripe integration service includes rate limiting optimization as part of building production-ready payment flows. For complex subscription billing scenarios that generate high API volumes, our Stripe subscriptions service can help architect efficient billing systems that stay within rate limits while scaling with your business.

Related Articles

API Development
Building Idempotent API Endpoints for Payment Processing

Need Expert Implementation?

I provide professional Stripe integration and Next.js optimization services with fixed pricing and fast delivery.