PushBackLog


Load & Performance Testing

Status: Complete
Category: Testing
Default enforcement: Soft
Author: PushBackLog team


Tags

  • Topic: testing, performance, reliability
  • Skillset: backend, devops
  • Technology: generic
  • Stage: execution, review

Summary

Load testing verifies that a system meets its performance requirements under realistic and peak traffic conditions before those conditions occur in production. Unlike unit or integration tests, load tests reveal system-level properties — throughput, latency percentiles, resource exhaustion, and degradation under stress — that only emerge at scale. Running load tests before launch or after significant architectural changes is one of the most reliable ways to prevent avoidable production performance incidents.


Rationale

Performance requirements not specified are performance requirements not met

A feature that works correctly for one user may behave unacceptably for ten thousand. Database queries that complete in 10ms against 1,000 records can take 30 seconds against 10,000,000. Thread pools that safely handle 50 concurrent requests saturate and queue at 500. These failures are entirely predictable and entirely preventable, but only if tested before they occur in production.

The most common failure mode is not that teams do not care about performance, but that they do not define performance requirements in measurable terms and do not verify them before shipping. “It should be fast” is not a requirement. “p95 response time < 500ms at 1,000 concurrent users” is.

Production is not the right place to discover capacity limits

Discovering a capacity ceiling in production during a peak traffic event means users experience the failure. Discovering it in a load test means engineers fix it quietly at a time of their choosing. The second outcome is strictly better.


Guidance

Test types

Test type   | Purpose                                                                 | Characteristics
----------- | ----------------------------------------------------------------------- | --------------------------------------
Load test   | Verify behaviour at expected traffic level                              | Sustained expected peak load
Stress test | Find the breaking point                                                 | Gradually increase load until failure
Soak test   | Detect degradation over time (memory leaks, connection pool exhaustion) | Moderate load sustained for hours
Spike test  | Verify behaviour under sudden traffic spikes                            | Rapid ramp from low to high and back
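Each row above maps to a different load profile. The sketch below expresses them as k6-style stage arrays; the durations and user counts are illustrative assumptions, not prescriptions — tune them to your own expected peak:

```javascript
// Illustrative stage profiles for each test type, in the shape k6's
// `options.stages` expects. All numbers here are assumptions.

const loadStages = [
  { duration: '2m', target: 100 },  // ramp to expected peak
  { duration: '10m', target: 100 }, // hold at expected peak
  { duration: '2m', target: 0 },    // ramp down
];

const stressStages = [
  { duration: '2m', target: 100 },
  { duration: '2m', target: 200 },
  { duration: '2m', target: 400 },
  { duration: '2m', target: 800 },  // keep increasing until something breaks
];

const soakStages = [
  { duration: '5m', target: 80 },   // ramp to moderate load
  { duration: '4h', target: 80 },   // hold for hours to expose leaks
  { duration: '5m', target: 0 },
];

const spikeStages = [
  { duration: '30s', target: 20 },  // baseline
  { duration: '30s', target: 500 }, // sudden spike
  { duration: '2m', target: 500 },  // hold the spike briefly
  { duration: '30s', target: 20 },  // back to baseline
];
```

The distinguishing variable is the shape of the ramp: stress tests keep climbing, soak tests hold for hours, spike tests jump and drop.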

Setting performance targets

Write performance requirements before writing load tests. Define targets in SLO-compatible terms:

# Performance SLOs for the orders API
- endpoint: GET /orders
  p50_response_time: < 100ms
  p95_response_time: < 300ms
  p99_response_time: < 1000ms
  error_rate: < 0.1%
  target_concurrent_users: 500

- endpoint: POST /orders
  p95_response_time: < 500ms
  error_rate: < 0.5%
  target_concurrent_users: 100
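Targets written this way can be checked programmatically against a test run. The sketch below (plain JavaScript, with fabricated sample data) computes latency percentiles using the nearest-rank method and compares them to thresholds mirroring the GET /orders targets above:

```javascript
// Compute a latency percentile (nearest-rank method) from raw samples
// and verify each against an SLO threshold. The samples are made up
// for illustration; a real run would use the load tool's raw output.

function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank - 1, 0)];
}

function checkSlo(samplesMs, sloMs) {
  return Object.entries(sloMs).map(([name, limit]) => {
    const p = Number(name.replace('p', ''));   // 'p95' -> 95
    const observed = percentile(samplesMs, p);
    return { name, observed, limit, pass: observed < limit };
  });
}

// Hypothetical response times (ms) from one run. The single 950ms
// outlier is enough to push p95 over a 300ms target.
const samples = [42, 55, 61, 70, 88, 95, 120, 180, 240, 950];
const results = checkSlo(samples, { p50: 100, p95: 300, p99: 1000 });
```

Note how one slow outlier fails the p95 target while the median still looks healthy — this is why SLOs should be defined on tail percentiles, not averages.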

Example load test (k6)

// k6 load test for order creation endpoint
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend, Rate } from 'k6/metrics';

const responseTime = new Trend('response_time');
const errorRate = new Rate('error_rate');

export const options = {
  stages: [
    { duration: '2m', target: 50 },   // Ramp up to 50 users
    { duration: '5m', target: 100 },  // Ramp up to expected peak
    { duration: '5m', target: 100 },  // Hold at peak
    { duration: '2m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% of requests under 500ms
    error_rate: ['rate<0.01'],          // Error rate under 1%
  },
};

export default function () {
  const payload = JSON.stringify({
    customerId: 'cus_test_123',
    items: [{ productId: 'prod_001', quantity: 1 }],
  });

  const res = http.post('http://api.staging.example.com/orders', payload, {
    headers: { 'Content-Type': 'application/json' },
  });

  responseTime.add(res.timings.duration);
  errorRate.add(res.status !== 201);

  check(res, {
    'status is 201': (r) => r.status === 201,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });

  sleep(1);
}

What to measure

Metric                              | Meaning
----------------------------------- | ------------------------------------------------------
Throughput (req/s)                  | How many requests the system handles per second
Latency p50/p95/p99                 | Median, 95th and 99th percentile response times
Error rate                          | Percentage of responses with 4xx/5xx status codes
CPU/memory utilisation              | Resource consumption under load
Database connection pool exhaustion | Pool saturation indicates a scalability limit
GC pressure                         | Excessive garbage collection under load (JVM/Node.js)
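The first two rows are derived, not observed directly: throughput and error rate come from aggregating individual request records. A minimal sketch, with fabricated request data:

```javascript
// Derive throughput and error rate from recorded requests. The request
// records here are fabricated; a real run would use the load tool's output.

function summarize(requests, durationSeconds) {
  const errors = requests.filter((r) => r.status >= 400).length;
  return {
    throughputRps: requests.length / durationSeconds,
    errorRate: errors / requests.length,
  };
}

// 1,000 hypothetical requests over a 10-second window, 5 of them 5xx.
const requests = Array.from({ length: 1000 }, (_, i) => ({
  status: i < 5 ? 503 : 201,
}));

const summary = summarize(requests, 10);
// -> { throughputRps: 100, errorRate: 0.005 }
```

An error rate of 0.005 (0.5%) would pass the POST /orders target above but fail the stricter GET /orders one — always evaluate per endpoint.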

Where to run load tests

  • Against staging, not production — load tests generate artificial traffic that distorts production metrics and may trigger real side effects (emails sent, payments charged)
  • With production-representative data volumes — a staging database with 100 rows does not reveal N+1 query problems that only appear at 10 million rows
  • As part of CI on a pre-release gate — run a shorter smoke-load test (1 minute, expected volume) on every production deployment

Common performance problems revealed by load tests

  • N+1 queries (latency increases linearly with record count)
  • Missing database indexes (full table scans appear at high concurrency)
  • Thread/connection pool exhaustion (latency spikes and timeouts at concurrent user limits)
  • Memory leaks (soak tests — memory grows monotonically over time)
  • Synchronous blocking in async frameworks (event loop starvation in Node.js)
  • Unindexed pagination (OFFSET pagination degrades at large page numbers)
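The N+1 pattern in the first bullet is worth making concrete, since it is the single most common finding. The in-memory "database" and query counter below are hypothetical stand-ins for a real ORM:

```javascript
// Contrast N+1 querying with a batched fetch. `db` and its query
// counter are a hypothetical stand-in for a real ORM and database.

const db = {
  queries: 0,
  orders: [
    { id: 1, customerId: 10 },
    { id: 2, customerId: 11 },
    { id: 3, customerId: 10 },
  ],
  customers: { 10: { name: 'Ada' }, 11: { name: 'Grace' } },
  findOrders() { this.queries += 1; return this.orders; },
  findCustomer(id) { this.queries += 1; return this.customers[id]; },
  findCustomers(ids) { this.queries += 1; return ids.map((id) => this.customers[id]); },
};

// N+1: one query for the orders, then one more per order.
db.queries = 0;
for (const order of db.findOrders()) {
  db.findCustomer(order.customerId);
}
const naiveQueries = db.queries;   // 1 + 3 = 4; grows linearly with row count

// Batched: one query for the orders, one for all their customers.
db.queries = 0;
const orders = db.findOrders();
const ids = [...new Set(orders.map((o) => o.customerId))];
db.findCustomers(ids);
const batchedQueries = db.queries; // always 2, regardless of row count
```

With 3 orders the difference is 4 queries versus 2; with 10,000 orders it is 10,001 versus 2 — which is exactly why the problem is invisible in a small staging database and obvious under production-representative data volumes.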