Load & Performance Testing
Status: Complete
Category: Testing
Default enforcement: Soft
Author: PushBackLog team
Tags
- Topic: testing, performance, reliability
- Skillset: backend, devops
- Technology: generic
- Stage: execution, review
Summary
Load testing verifies that a system meets its performance requirements under realistic and peak traffic conditions before those conditions occur in production. Unlike unit or integration tests, load tests reveal system-level properties — throughput, latency percentiles, resource exhaustion, and degradation under stress — that only emerge at scale. Running load tests before launch or after significant architectural changes is one of the most reliable ways to prevent avoidable production performance incidents.
Rationale
Performance requirements not specified are performance requirements not met
A feature that works correctly for one user may behave unacceptably for ten thousand. A database query that completes in 10ms against 1,000 records can take 30 seconds against 10,000,000. A thread pool that comfortably handles 50 concurrent requests saturates and queues at 500. These failures are entirely predictable and entirely preventable — but only if tested before they occur in production.
The most common failure mode is not that teams do not care about performance, but that they do not define performance requirements in measurable terms and do not verify them before shipping. “It should be fast” is not a requirement. “p95 response time < 500ms at 1,000 concurrent users” is.
Production is not the right place to discover capacity limits
Discovering a capacity ceiling in production during a peak traffic event means users experience the failure. Discovering it in a load test means engineers fix it quietly at a time of their choosing. The second outcome is strictly better.
Guidance
Test types
| Test type | Purpose | Characteristics |
|---|---|---|
| Load test | Verify behaviour at expected traffic level | Sustained expected peak load |
| Stress test | Find the breaking point | Gradually increase load until failure |
| Soak test | Detect degradation over time (memory leaks, connection pool exhaustion) | Moderate load sustained for hours |
| Spike test | Verify behaviour under sudden traffic spikes | Rapid ramp from low to high and back |
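The shape of each test type can be sketched as a k6 `stages` profile. The durations and user counts below are illustrative placeholders, not recommended values — tune them to your own traffic:

```javascript
// Illustrative k6 `stages` profiles for each test type.
// Durations and targets are placeholders; calibrate against real traffic.
const profiles = {
  load: [
    { duration: '10m', target: 100 },  // hold at expected peak
  ],
  stress: [
    { duration: '5m', target: 100 },
    { duration: '5m', target: 200 },
    { duration: '5m', target: 400 },   // keep increasing until something breaks
  ],
  soak: [
    { duration: '4h', target: 60 },    // moderate load sustained for hours
  ],
  spike: [
    { duration: '30s', target: 10 },
    { duration: '30s', target: 500 },  // sudden surge
    { duration: '30s', target: 10 },   // and back down
  ],
};
```

Any one of these arrays can be dropped into the `options.stages` field of a k6 script to change the test's character without touching its request logic.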
Setting performance targets
Write performance requirements before writing load tests. Define targets in SLO-compatible terms:
```yaml
# Performance SLOs for the orders API
- endpoint: GET /orders
  p50_response_time: < 100ms
  p95_response_time: < 300ms
  p99_response_time: < 1000ms
  error_rate: < 0.1%
  target_concurrent_users: 500
- endpoint: POST /orders
  p95_response_time: < 500ms
  error_rate: < 0.5%
  target_concurrent_users: 100
```
Example load test (k6)
```javascript
// k6 load test for the order creation endpoint
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend, Rate } from 'k6/metrics';

const responseTime = new Trend('response_time');
const errorRate = new Rate('error_rate');

export const options = {
  stages: [
    { duration: '2m', target: 50 },  // Ramp up to 50 users
    { duration: '5m', target: 100 }, // Ramp up to expected peak
    { duration: '5m', target: 100 }, // Hold at peak
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms (matches the SLO)
    error_rate: ['rate<0.005'],       // Error rate under 0.5% (matches the SLO)
  },
};

export default function () {
  const payload = JSON.stringify({
    customerId: 'cus_test_123',
    items: [{ productId: 'prod_001', quantity: 1 }],
  });
  const res = http.post('http://api.staging.example.com/orders', payload, {
    headers: { 'Content-Type': 'application/json' },
  });
  responseTime.add(res.timings.duration);
  errorRate.add(res.status !== 201);
  check(res, {
    'status is 201': (r) => r.status === 201,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
```
What to measure
| Metric | Meaning |
|---|---|
| Throughput (req/s) | How many requests the system handles per second |
| Latency p50/p95/p99 | Median, 95th, 99th percentile response times |
| Error rate | Percentage of responses with 4xx/5xx status codes |
| CPU/memory utilisation | Resource consumption under load |
| Database connection pool exhaustion | Pool saturation indicates a scalability limit |
| GC pressure | Excessive garbage collection under load (JVM/Node.js) |
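k6 and most load tools compute latency percentiles for you; this sketch only pins down what p50/p95/p99 mean, using a simple nearest-rank calculation over raw response-time samples:

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p% of all samples are less than or equal to it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Example: 100 latency samples of 1..100 ms
const latencies = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(latencies, 50)); // 50
console.log(percentile(latencies, 95)); // 95
console.log(percentile(latencies, 99)); // 99
```

The tail percentiles matter more than the mean: a healthy p50 with a bad p99 means one request in a hundred is slow — which, at scale, is many users every second.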
Where to run load tests
- Against staging, not production — load tests generate artificial traffic that distorts production metrics and may trigger real side effects (emails, payments)
- With production-representative data volumes — a staging database with 100 rows does not reveal N+1 query problems that only appear at 10 million rows
- As a pre-release gate in CI — run a shorter smoke-load test (1 minute at expected volume) on every production deployment
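As a sketch of that pre-release gate, a smoke-load variant of the full k6 script might use a one-minute profile with tight thresholds. The durations, user count, and threshold values here are illustrative assumptions:

```javascript
// Smoke-load options for a CI gate: short enough to run on every
// deployment, strict enough to catch gross regressions.
const smokeOptions = {
  stages: [
    { duration: '15s', target: 20 }, // quick ramp to a fraction of peak
    { duration: '45s', target: 20 }, // brief hold
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // same latency SLO as the full test
    http_req_failed: ['rate<0.01'],   // k6's built-in HTTP error-rate metric
  },
};
```

Because k6 exits non-zero when a threshold fails, the CI job fails the deployment automatically — no extra result-parsing step is needed.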
Common performance problems revealed by load tests
- N+1 queries (latency increases linearly with record count)
- Missing database indexes (full table scans that are tolerable at low traffic dominate at high concurrency)
- Thread/connection pool exhaustion (latency spikes and timeouts at concurrent user limits)
- Memory leaks (soak tests — memory grows monotonically over time)
- Synchronous blocking in async frameworks (event loop starvation in Node.js)
- Unindexed pagination (OFFSET pagination degrades at large page numbers)
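For the last item, the usual fix is keyset (cursor) pagination: seek on an indexed column instead of skipping rows. A sketch of the two query shapes — the `orders` table and `id` column are made up for illustration:

```javascript
// OFFSET pagination: the database must scan and discard `offset` rows,
// so deep pages get progressively slower.
function offsetPageQuery(pageSize, pageNumber) {
  const offset = pageSize * (pageNumber - 1);
  return `SELECT * FROM orders ORDER BY id LIMIT ${pageSize} OFFSET ${offset}`;
}

// Keyset pagination: seek directly past the last-seen id via the index,
// so every page costs roughly the same regardless of depth.
function keysetPageQuery(pageSize, lastSeenId) {
  return `SELECT * FROM orders WHERE id > ${lastSeenId} ORDER BY id LIMIT ${pageSize}`;
}

console.log(offsetPageQuery(50, 10000)); // OFFSET 499950: scans ~500,000 rows for 50
console.log(keysetPageQuery(50, 499950)); // index seek returns the same page
```

A load test that pages deep into a large dataset surfaces this difference immediately, which is why production-representative data volumes matter.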