Rate Limiting & Throttling
Status: Complete
Category: Security
Default enforcement: Soft
Author: PushBackLog team
Tags
- Topic: security, performance, reliability
- Skillset: backend, devops
- Technology: generic
- Stage: execution, review
Summary
Rate limiting restricts how many requests a client can make to an API within a time window. It protects against denial-of-service attacks, credential stuffing, brute-force attempts, and abusive behaviour that degrades service quality for other users. Throttling is the related practice of degrading service quality gracefully when limits are reached, rather than failing hard. Both are standard requirements for any API exposed to the internet and for high-value internal operations.
Rationale
Unprotected endpoints are trivially abusable
Any endpoint accessible from the internet will eventually receive automated traffic — scrapers, credential stuffing bots, fuzzing tools, and denial-of-service attempts. Without rate limiting, a login endpoint accepts unlimited password guesses. A search endpoint accepts unlimited queries from a single scraper. A password reset endpoint can be used to enumerate valid email addresses. A payment endpoint can be tested with thousands of stolen card numbers. These are not hypothetical — they are routine attack patterns.
Rate limiting protects both security and service quality
Rate limiting serves two distinct purposes. Its security role is to limit attempted abuse. Its reliability role is to prevent any single client from consuming so much capacity that other clients are degraded. Both justify the implementation, and the same mechanism serves both.
Guidance
Rate limiting strategies
| Strategy | How it works | Trade-offs |
|---|---|---|
| Fixed window | Count N requests per window (e.g., 100/minute); counter resets at window boundary | Simple to implement; vulnerable to bursts at the window boundary |
| Sliding window | Count requests over the last N seconds continuously | More accurate; eliminates the boundary burst at higher bookkeeping cost |
| Token bucket | Tokens refill at a fixed rate; requests consume tokens; allows bursts up to bucket size | Natural burst allowance with average rate enforcement |
| Leaky bucket | Requests enter a queue and are processed at a fixed rate, regardless of arrival rate | Smooths burst traffic; adds latency |
Token bucket is the most commonly recommended algorithm for API rate limiting.
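As an illustration of the mechanics, here is a minimal self-contained token bucket in TypeScript (the `TokenBucket` class and its parameter names are our own illustrative choices, not from any library):

```typescript
// Minimal token bucket: `capacity` bounds the burst size, `refillRate`
// enforces the long-run average request rate.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,   // max burst size (tokens)
    private refillRate: number, // tokens added per second
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A bucket with capacity 10 and a refill rate of 1 token/second admits a burst of 10 requests, then sustains an average of one request per second.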
Implementation in Node.js (express-rate-limit)
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
// `redis` is assumed to be an already-connected Redis client instance
// General API rate limit
const apiLimiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 100, // 100 requests per window per IP
standardHeaders: true, // Include RateLimit-* headers in responses
legacyHeaders: false,
store: new RedisStore({ client: redis }), // Redis for distributed rate limiting
message: { error: 'Too many requests. Please retry after a minute.' },
});
// Stricter limit for authentication endpoints
const authLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 10, // 10 login attempts per 15-minute window
store: new RedisStore({ client: redis }),
skipSuccessfulRequests: true, // Only count failed attempts
});
app.use('/api', apiLimiter);
app.use('/api/auth/login', authLimiter);
app.use('/api/auth/password-reset', authLimiter);
Rate limit by identity, not just IP
IP-based rate limiting is easy to bypass with IP rotation. For authenticated endpoints, limit by user ID:
const authenticatedLimiter = rateLimit({
windowMs: 60 * 1000,
max: 200,
keyGenerator: (req) => {
// Rate limit by user ID for authenticated requests
return req.user?.id ?? req.ip;
},
store: new RedisStore({ client: redis }),
});
Response headers
Communicate rate limit status to well-behaved clients. Per the IETF RateLimit header fields draft (which express-rate-limit's standardHeaders option follows), RateLimit-Reset is the number of seconds until the current window resets:
HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 73
RateLimit-Reset: 27
On limit exceeded:
HTTP/1.1 429 Too Many Requests
Retry-After: 47
Content-Type: application/json
{"error": "Rate limit exceeded. Retry after 47 seconds."}
Tiered rate limits by client category
import { Request, Response, NextFunction } from 'express';

const tierLimits = {
  anonymous: { max: 20, windowMs: 60_000 },
  free: { max: 100, windowMs: 60_000 },
  pro: { max: 1_000, windowMs: 60_000 },
  enterprise: { max: 10_000, windowMs: 60_000 },
};

// Build one limiter per tier at startup; constructing a new rateLimit
// instance on every request is wasteful and, with the default in-memory
// store, would discard all counts. A per-tier prefix keeps Redis keys separate.
const tierLimiters = Object.fromEntries(
  Object.entries(tierLimits).map(([tier, opts]) => [
    tier,
    rateLimit({ ...opts, store: new RedisStore({ client: redis, prefix: `rl:${tier}:` }) }),
  ]),
);

const limiterByTier = (req: Request, res: Response, next: NextFunction) => {
  const tier = req.user?.subscriptionTier ?? 'anonymous';
  return (tierLimiters[tier] ?? tierLimiters.anonymous)(req, res, next);
};
Critical endpoints requiring rate limiting
- Authentication (login, OAuth callbacks)
- Password reset / magic link generation
- OTP/2FA verification
- Account creation / registration
- Any financial operation (payment, withdrawal)
- Any endpoint that sends email or SMS
- Search endpoints that are computationally expensive
- File upload endpoints
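Any of these endpoints can be guarded with a simple per-key counter, keyed by whatever identity matters there (user ID, target email address, card fingerprint). A toy in-memory fixed-window counter in TypeScript (illustrative only; multi-instance deployments need shared state such as Redis, and the class name `FixedWindowCounter` is our own):

```typescript
// Toy fixed-window counter: at most `limit` hits per `windowMs` per key.
// In-memory only; a multi-instance deployment needs Redis or similar.
class FixedWindowCounter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request for `key` is within its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, count: 1 }); // new window
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}
```

Keys are independent: exhausting the limit for one email address leaves other addresses unaffected, which is exactly the behaviour wanted for password-reset or OTP endpoints.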
Design checklist
- All public-facing endpoints have rate limits appropriate to their function
- Authentication endpoints have stricter limits than general API endpoints
- Rate limiting uses distributed state (Redis) — not in-process memory — in multi-instance deployments
- Rate limit responses use HTTP 429 with a Retry-After header
- Rate limit headers are returned on all responses for client-side backoff
- Rate limiting cannot be bypassed by rotating IP addresses for authenticated endpoints