
Rate Limiting & Throttling

Status: Complete
Category: Security
Default enforcement: Soft
Author: PushBackLog team


Tags

  • Topic: security, performance, reliability
  • Skillset: backend, devops
  • Technology: generic
  • Stage: execution, review

Summary

Rate limiting restricts how many requests a client can make to an API within a time window. It protects against denial-of-service attacks, credential stuffing, brute-force attempts, and abusive behaviour that degrades service quality for other users. Throttling is the related practice of degrading service quality gracefully when limits are reached, rather than failing hard. Both are standard requirements for any API exposed to the internet and for high-value internal operations.


Rationale

Unprotected endpoints are trivially abusable

Any endpoint accessible from the internet will eventually receive automated traffic — scrapers, credential stuffing bots, fuzzing tools, and denial-of-service attempts. Without rate limiting, a login endpoint accepts unlimited password guesses. A search endpoint accepts unlimited queries from a single scraper. A password reset endpoint can be used to enumerate valid email addresses. A payment endpoint can be tested with thousands of stolen card numbers. These are not hypothetical — they are routine attack patterns.

Rate limiting protects both security and service quality

Rate limiting serves two distinct purposes. Its security role is to limit attempted abuse. Its reliability role is to prevent any single client from consuming so much capacity that other clients are degraded. Both justify the implementation, and the same mechanism serves both.


Guidance

Rate limiting strategies

  • Fixed window — count N requests per window (e.g., 100/minute); the counter resets at the window boundary. Simple to implement, but vulnerable to bursts around the boundary.
  • Sliding window — count requests over the last N seconds continuously. More accurate; eliminates the boundary burst.
  • Token bucket — tokens refill at a fixed rate and each request consumes one; bursts are allowed up to the bucket size. Natural burst allowance with average-rate enforcement.
  • Leaky bucket — requests enter a queue and are processed at a fixed rate, regardless of arrival rate. Smooths bursty traffic; adds latency.
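Of these, the sliding-window log is the simplest to sketch. A minimal in-memory version (illustrative code with invented names, not a library API — a real deployment would keep the per-client log in Redis, e.g. as a sorted set):

```typescript
// Sliding-window log: remember each request's timestamp and count only those
// inside the last `windowMs` milliseconds.
class SlidingWindowLimiter {
  private log = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have aged out of the window.
    const recent = (this.log.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.log.set(key, recent);
      return false; // over limit — reject without recording
    }
    recent.push(now);
    this.log.set(key, recent);
    return true;
  }
}
```

Because the log slides continuously, a client cannot double its effective rate by straddling a window boundary, which is the weakness of the fixed-window counter.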

Token bucket is the most commonly recommended algorithm for API rate limiting.
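The algorithm itself is compact. A minimal in-memory sketch (illustrative names of our own; production limiters keep this state in a shared store such as Redis):

```typescript
// Token bucket: tokens refill continuously at `refillPerSec`, each request
// consumes one token, and the bucket capacity caps the burst size.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // maximum burst size
    private refillPerSec: number, // sustained average rate
    now: number = Date.now(),
  ) {
    this.tokens = capacity;       // start full
    this.lastRefill = now;
  }

  allow(now: number = Date.now()): boolean {
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A client that has been idle accumulates tokens up to the capacity and may burst; a client hammering the API is held to the refill rate on average.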

Implementation in Node.js (express-rate-limit)

import express from 'express';
import { createClient } from 'redis';
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';

const app = express();
const redis = createClient();
await redis.connect();

// General API rate limit
const apiLimiter = rateLimit({
  windowMs: 60 * 1000,       // 1 minute
  max: 100,                   // 100 requests per window per IP
  standardHeaders: true,      // Include draft RateLimit-* headers in responses
  legacyHeaders: false,       // Omit the older X-RateLimit-* headers
  // Redis keeps counters shared across instances (rate-limit-redis v2 API;
  // v3+ takes a sendCommand function instead of a client).
  store: new RedisStore({ client: redis }),
  message: { error: 'Too many requests. Please retry after a minute.' },
});

// Stricter limit for authentication endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 10,                    // 10 login attempts per 15-minute window
  store: new RedisStore({ client: redis }),
  skipSuccessfulRequests: true, // Only count failed attempts
});

app.use('/api', apiLimiter);
app.use('/api/auth/login', authLimiter);
app.use('/api/auth/password-reset', authLimiter);

Rate limit by identity, not just IP

IP-based rate limiting is easy to bypass with IP rotation. For authenticated endpoints, limit by user ID:

const authenticatedLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 200,
  keyGenerator: (req) => {
    // Rate limit by user ID for authenticated requests
    return req.user?.id ?? req.ip;
  },
  store: new RedisStore({ client: redis }),
});

Response headers

Communicate rate limit status to well-behaved clients. The draft IETF RateLimit headers report the limit, the remaining quota, and the number of seconds until the window resets (a delta, not an epoch timestamp):

HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 73
RateLimit-Reset: 27

On limit exceeded:

HTTP/1.1 429 Too Many Requests
Retry-After: 47
Content-Type: application/json

{"error": "Rate limit exceeded. Retry after 47 seconds."}

Tiered rate limits by client category

import rateLimit from 'express-rate-limit';
import type { Request, Response, NextFunction } from 'express';

const tierLimits = {
  anonymous:  { max: 20,     windowMs: 60_000 },
  free:       { max: 100,    windowMs: 60_000 },
  pro:        { max: 1_000,  windowMs: 60_000 },
  enterprise: { max: 10_000, windowMs: 60_000 },
};

// Build one limiter per tier up front. Creating a limiter inside the request
// handler would allocate fresh state on every request and never count anything.
const tierLimiters = Object.fromEntries(
  Object.entries(tierLimits).map(([tier, opts]) => [
    tier,
    rateLimit({ ...opts, store: new RedisStore({ client: redis, prefix: `rl:${tier}:` }) }),
  ]),
);

const limiterByTier = (req: Request, res: Response, next: NextFunction) => {
  const tier = req.user?.subscriptionTier ?? 'anonymous';
  return (tierLimiters[tier] ?? tierLimiters.anonymous)(req, res, next);
};

Critical endpoints requiring rate limiting

  • Authentication (login, OAuth callbacks)
  • Password reset / magic link generation
  • OTP/2FA verification
  • Account creation / registration
  • Any financial operation (payment, withdrawal)
  • Any endpoint that sends email or SMS
  • Search endpoints that are computationally expensive
  • File upload endpoints

Design checklist

  • All public-facing endpoints have rate limits appropriate to their function
  • Authentication endpoints have stricter limits than general API endpoints
  • Rate limiting uses distributed state (Redis) — not in-process memory — in multi-instance deployments
  • Rate limit responses use HTTP 429 with Retry-After header
  • Rate limit headers are returned on all responses for client-side backoff
  • Rate limiting cannot be bypassed by rotating IP addresses for authenticated endpoints