# Client-side API rate limiter
A production-ready TypeScript `RateLimiter` class that enforces rate limits on API calls by queuing excess requests and automatically processing them when the rate-limit window resets.

## Code

```typescript
interface QueuedRequest<T> {
  execute: () => Promise<T>;
  resolve: (value: T) => void;
  reject: (error: Error) => void;
}

interface RateLimiterOptions {
  maxRequests: number;
  intervalMs: number;
}

class RateLimiter {
  private readonly maxRequests: number;
  private readonly intervalMs: number;
  private requestTimestamps: number[] = [];
  private queue: QueuedRequest<unknown>[] = [];
  private processingTimer: ReturnType<typeof setTimeout> | null = null;

  constructor(options: RateLimiterOptions) {
    this.maxRequests = options.maxRequests;
    this.intervalMs = options.intervalMs;
  }

  /** Wrap an async call so it runs only when a rate-limit slot is free. */
  async execute<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const queuedRequest: QueuedRequest<T> = {
        execute: fn,
        resolve,
        reject,
      };
      this.queue.push(queuedRequest as QueuedRequest<unknown>);
      this.processQueue();
    });
  }

  /** Drop timestamps that have aged out of the sliding window. */
  private cleanupOldTimestamps(): void {
    const now = Date.now();
    const windowStart = now - this.intervalMs;
    this.requestTimestamps = this.requestTimestamps.filter(
      (timestamp) => timestamp > windowStart
    );
  }

  private getAvailableSlots(): number {
    this.cleanupOldTimestamps();
    return this.maxRequests - this.requestTimestamps.length;
  }

  /** Milliseconds until the oldest in-window timestamp expires. */
  private getTimeUntilNextSlot(): number {
    if (this.requestTimestamps.length === 0) {
      return 0;
    }
    const oldestTimestamp = Math.min(...this.requestTimestamps);
    const timeUntilExpiry = oldestTimestamp + this.intervalMs - Date.now();
    return Math.max(0, timeUntilExpiry);
  }

  private async processQueue(): Promise<void> {
    if (this.queue.length === 0) {
      return;
    }
    const availableSlots = this.getAvailableSlots();
    if (availableSlots > 0) {
      const requestsToProcess = this.queue.splice(0, availableSlots);
      const now = Date.now();
      for (const request of requestsToProcess) {
        this.requestTimestamps.push(now);
        this.executeRequest(request);
      }
    }
    // If requests remain, schedule a retry for when the next slot opens.
    if (this.queue.length > 0 && this.processingTimer === null) {
      const waitTime = this.getTimeUntilNextSlot();
      this.processingTimer = setTimeout(() => {
        this.processingTimer = null;
        this.processQueue();
      }, waitTime + 10); // small buffer for timer precision
    }
  }

  private async executeRequest<T>(request: QueuedRequest<T>): Promise<void> {
    try {
      const result = await request.execute();
      request.resolve(result);
    } catch (error) {
      request.reject(error instanceof Error ? error : new Error(String(error)));
    }
  }

  getQueueLength(): number {
    return this.queue.length;
  }

  getActiveRequestCount(): number {
    this.cleanupOldTimestamps();
    return this.requestTimestamps.length;
  }

  /** Reject all pending requests and cancel any scheduled retry. */
  clearQueue(): void {
    const error = new Error("Rate limiter queue cleared");
    for (const request of this.queue) {
      request.reject(error);
    }
    this.queue = [];
    if (this.processingTimer !== null) {
      clearTimeout(this.processingTimer);
      this.processingTimer = null;
    }
  }
}

// Usage example
const apiLimiter = new RateLimiter({
  maxRequests: 5,
  intervalMs: 1000,
});

async function fetchUserData(userId: number): Promise<{ id: number; name: string }> {
  return apiLimiter.execute(async () => {
    const response = await fetch(`https://jsonplaceholder.typicode.com/users/${userId}`);
    if (!response.ok) {
      throw new Error(`HTTP error: ${response.status}`);
    }
    return response.json();
  });
}

// Demonstrate rate limiting with multiple concurrent requests
async function demo(): Promise<void> {
  console.log("Starting 10 requests with limit of 5 per second...");
  const startTime = Date.now();
  const requests = Array.from({ length: 10 }, (_, i) =>
    fetchUserData(i + 1)
      .then((user) => {
        console.log(`Request ${i + 1} completed at ${Date.now() - startTime}ms: ${user.name}`);
        return user;
      })
      .catch((error) => {
        console.error(`Request ${i + 1} failed:`, error.message);
        return null;
      })
  );
  await Promise.all(requests);
  console.log(`All requests completed in ${Date.now() - startTime}ms`);
}
demo();
```

## How It Works
This `RateLimiter` class implements a sliding-window rate-limiting algorithm that is well suited to client-side API throttling. The core mechanism tracks the timestamps of recent requests and uses a queue to hold excess requests until capacity becomes available. When you call `execute()`, the request is wrapped in a Promise and added to the queue; the queue processor then determines whether capacity is available based on how many requests occurred within the current time window.
The sliding window approach (as opposed to fixed windows) provides smoother rate limiting. Instead of resetting a counter at fixed intervals, the limiter continuously tracks individual request timestamps and removes them as they age out of the window. This prevents the "burst at window boundaries" problem where a fixed window could allow 2x the limit if requests cluster at the end of one window and start of the next.
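For contrast, here is a minimal fixed-window counter. The `FixedWindowLimiter` class is hypothetical, not part of the code above; because its counter resets at fixed boundaries, up to `2 * maxRequests` calls can land in quick succession across a boundary:

```typescript
// Minimal fixed-window sketch for comparison (hypothetical, not used above).
// maxRequests calls at the end of one window plus maxRequests at the start
// of the next are all allowed -- the burst problem the sliding window avoids.
class FixedWindowLimiter {
  private count = 0;
  private windowStart = Date.now();

  constructor(
    private readonly maxRequests: number,
    private readonly intervalMs: number
  ) {}

  tryAcquire(): boolean {
    const now = Date.now();
    if (now - this.windowStart >= this.intervalMs) {
      // Hard reset at the window boundary -- this is the burst risk.
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false;
  }
}
```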
The queue processing logic is designed to handle concurrency safely. When requests are queued, `processQueue()` first cleans up expired timestamps to compute the available slots. If slots are available, it dispatches that many queued requests immediately. If the queue still has items but no slots are free, it sets a timer to retry when the oldest timestamp ages out of the window. The timer includes a small buffer (10ms) to account for timing-precision issues in JavaScript's event loop.
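A worked example of that timing math, with illustrative numbers rather than values from a real run:

```typescript
// Suppose maxRequests: 5, intervalMs: 1000. Five requests dispatch at t=0
// and fill every slot; five more arrive at t=50 and queue.
const intervalMs = 1000;
const oldestTimestamp = 0; // the t=0 batch
const now = 50;            // when the queued batch arrives
const timeUntilNextSlot = Math.max(0, oldestTimestamp + intervalMs - now); // 950
const timerDelay = timeUntilNextSlot + 10; // 960ms from t=50, so ~t=1010
// By t=1010 the t=0 timestamps have aged out and the queued batch dispatches.
console.log(timerDelay);
```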
Error handling is comprehensive: individual request failures don't affect other queued requests, and the `clearQueue()` method properly rejects all pending promises when you need to abort. The class also exposes `getQueueLength()` and `getActiveRequestCount()` for monitoring, which is valuable for displaying loading states or debugging rate-limit issues in production.
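As a sketch of how those hooks might be wired up in a browser app (the polling interval and the `updatePendingBadge` helper are illustrative, not part of the class):

```typescript
// Hypothetical wiring: surface queue depth in the UI, and reject all
// pending work when the user navigates away.
function updatePendingBadge(pending: number): void {
  console.log(`${pending} request(s) waiting on the rate limiter`);
}

setInterval(() => {
  updatePendingBadge(apiLimiter.getQueueLength());
}, 250);

window.addEventListener("beforeunload", () => {
  // Every queued promise rejects with "Rate limiter queue cleared".
  apiLimiter.clearQueue();
});
```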
Use this pattern when integrating with third-party APIs that have strict rate limits, especially from browser environments where you can't rely on server-side queuing. Avoid using this as your only rate limiting solution for authenticated APIs—always implement server-side limits too, since client-side code can be bypassed. Also consider that this implementation doesn't persist queue state, so page refreshes will reset the limiter and potentially allow burst requests.
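If the refresh caveat matters for your use case, one mitigation is to persist the timestamp window. A minimal sketch, assuming `sessionStorage` is acceptable and that the class were extended with load/save hooks (both hypothetical):

```typescript
// Hypothetical persistence helpers; the RateLimiter above would need to call
// loadTimestamps in its constructor and saveTimestamps after each dispatch.
const STORAGE_KEY = "rateLimiterTimestamps";

function saveTimestamps(timestamps: number[]): void {
  sessionStorage.setItem(STORAGE_KEY, JSON.stringify(timestamps));
}

function loadTimestamps(intervalMs: number): number[] {
  const raw = sessionStorage.getItem(STORAGE_KEY);
  if (raw === null) return [];
  const windowStart = Date.now() - intervalMs;
  // Keep only timestamps still inside the current window.
  return (JSON.parse(raw) as number[]).filter((t) => t > windowStart);
}
```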