Rate Limiting & Throttling
Status: Complete
Category: Security
Default enforcement: Soft
Author: PushBackLog team
Tags
- Topic: security, performance, reliability
- Skillset: backend, devops
- Technology: generic
- Stage: execution, review
Summary
Rate limiting restricts how many requests a client can make to an API within a time window. It protects against denial-of-service attacks, credential stuffing, brute-force attempts, and abusive behaviour that degrades service quality for other users. Throttling is the related practice of degrading service quality gracefully when limits are reached, rather than failing hard. Both are standard requirements for any API exposed to the internet and for high-value internal operations.
Rationale
Unprotected endpoints are trivially abusable
Any endpoint accessible from the internet will eventually receive automated traffic — scrapers, credential stuffing bots, fuzzing tools, and denial-of-service attempts. Without rate limiting, a login endpoint accepts unlimited password guesses. A search endpoint accepts unlimited queries from a single scraper. A password reset endpoint can be used to enumerate valid email addresses. A payment endpoint can be tested with thousands of stolen card numbers. These are not hypothetical — they are routine attack patterns.
Rate limiting protects both security and service quality
Rate limiting serves two distinct purposes. Its security role is to limit attempted abuse. Its reliability role is to prevent any single client from consuming so much capacity that other clients are degraded. Both justify the implementation, and the same mechanism serves both.
Guidance
Rate limiting strategies
| Strategy | How it works | Trade-offs |
|---|---|---|
| Fixed window | Count N requests per window (e.g., 100/minute); counter resets at window boundary | Simple to implement; vulnerable to bursts at the window boundary |
| Sliding window | Count requests over the last N seconds continuously | More accurate; eliminates the boundary burst at higher bookkeeping cost |
| Token bucket | Tokens refill at a fixed rate; requests consume tokens; allows bursts up to bucket size | Natural burst allowance with average rate enforcement |
| Leaky bucket | Requests enter a queue and are processed at a fixed rate, regardless of arrival rate | Smooths burst traffic; adds latency |
Token bucket is the most commonly recommended algorithm for API rate limiting.
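As an illustration of the mechanics, here is a minimal self-contained token bucket in TypeScript (the `TokenBucket` class and its parameter names are our own illustrative choices, not from any library):

```typescript
// Minimal token bucket: `capacity` bounds the burst size, `refillRate`
// enforces the long-run average request rate.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,   // max burst size (tokens)
    private refillRate: number, // tokens added per second
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A bucket with capacity 10 and a refill rate of 1 token/second admits a burst of 10 requests, then sustains an average of one request per second.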
Implementation in Node.js (express-rate-limit)
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
// `redis` is assumed to be an already-connected Redis client instance
// General API rate limit
const apiLimiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 100, // 100 requests per window per IP
standardHeaders: true, // Include RateLimit-* headers in responses
legacyHeaders: false,
store: new RedisStore({ client: redis }), // Redis for distributed rate limiting
message: { error: 'Too many requests. Please retry after a minute.' },
});
// Stricter limit for authentication endpoints
const authLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 10, // 10 login attempts per 15-minute window
store: new RedisStore({ client: redis }),
skipSuccessfulRequests: true, // Only count failed attempts
});
app.use('/api', apiLimiter);
app.use('/api/auth/login', authLimiter);
app.use('/api/auth/password-reset', authLimiter);
Rate limit by identity, not just IP
IP-based rate limiting is easy to bypass with IP rotation. For authenticated endpoints, limit by user ID:
const authenticatedLimiter = rateLimit({
windowMs: 60 * 1000,
max: 200,
keyGenerator: (req) => {
// Rate limit by user ID for authenticated requests
return req.user?.id ?? req.ip;
},
store: new RedisStore({ client: redis }),
});
Response headers
Communicate rate limit status to well-behaved clients. Per the IETF RateLimit header fields draft (which express-rate-limit's standardHeaders option follows), RateLimit-Reset is the number of seconds until the current window resets:
HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 73
RateLimit-Reset: 27
On limit exceeded:
HTTP/1.1 429 Too Many Requests
Retry-After: 47
Content-Type: application/json
{"error": "Rate limit exceeded. Retry after 47 seconds."}
Tiered rate limits by client category
import { Request, Response, NextFunction } from 'express';

const tierLimits = {
  anonymous: { max: 20, windowMs: 60_000 },
  free: { max: 100, windowMs: 60_000 },
  pro: { max: 1_000, windowMs: 60_000 },
  enterprise: { max: 10_000, windowMs: 60_000 },
};

// Build one limiter per tier at startup; constructing a new rateLimit
// instance on every request is wasteful and, with the default in-memory
// store, would discard all counts. A per-tier prefix keeps Redis keys separate.
const tierLimiters = Object.fromEntries(
  Object.entries(tierLimits).map(([tier, opts]) => [
    tier,
    rateLimit({ ...opts, store: new RedisStore({ client: redis, prefix: `rl:${tier}:` }) }),
  ]),
);

const limiterByTier = (req: Request, res: Response, next: NextFunction) => {
  const tier = req.user?.subscriptionTier ?? 'anonymous';
  return (tierLimiters[tier] ?? tierLimiters.anonymous)(req, res, next);
};
Critical endpoints requiring rate limiting
- Authentication (login, OAuth callbacks)
- Password reset / magic link generation
- OTP/2FA verification
- Account creation / registration
- Any financial operation (payment, withdrawal)
- Any endpoint that sends email or SMS
- Search endpoints that are computationally expensive
- File upload endpoints
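Any of these endpoints can be guarded with a simple per-key counter, keyed by whatever identity matters there (user ID, target email address, card fingerprint). A toy in-memory fixed-window counter in TypeScript (illustrative only; multi-instance deployments need shared state such as Redis, and the class name `FixedWindowCounter` is our own):

```typescript
// Toy fixed-window counter: at most `limit` hits per `windowMs` per key.
// In-memory only; a multi-instance deployment needs Redis or similar.
class FixedWindowCounter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request for `key` is within its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, count: 1 }); // new window
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

Keys are independent: exhausting the limit for one email address leaves other addresses unaffected, which is exactly the behaviour wanted for password-reset or OTP endpoints.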
Design checklist
- All public-facing endpoints have rate limits appropriate to their function
- Authentication endpoints have stricter limits than general API endpoints
- Rate limiting uses distributed state (Redis) — not in-process memory — in multi-instance deployments
- Rate limit responses use HTTP 429 with a Retry-After header
- Rate limit headers are returned on all responses for client-side backoff
- Rate limiting cannot be bypassed by rotating IP addresses for authenticated endpoints