The Test Pyramid

Status: Complete
Category: Testing
Default enforcement: Soft
Author: PushBackLog team

Summary

The test pyramid describes the ideal distribution of automated tests: many unit tests at the base, fewer integration tests in the middle, and a small number of end-to-end tests at the top. A healthy test suite is fast and reliable and pinpoints failures precisely, and those properties come largely from keeping most tests at the lower levels.

Rationale

Speed and feedback loops

Mike Cohn introduced the test pyramid in Succeeding with Agile (2009) to make explicit what was then emerging practical wisdom: different categories of test offer fundamentally different trade-offs. Unit tests typically run in milliseconds, give precise failure messages, and require no infrastructure. End-to-end tests can take seconds to minutes each, produce ambiguous failures (“the test failed” doesn’t tell you which component or line is at fault), and require the entire stack to be running. A suite dominated by E2E tests is slow, fragile, and expensive to maintain.

The pyramid is a ratio recommendation:

         /\
        /E2E\         Small number — cover critical user journeys only
       /------\
      /  Integ  \     Moderate number — verify contracts between components
     /------------\
    /  Unit Tests  \  Large number — verify all logic in isolation
   /________________\

Confidence vs cost

Each layer offers different confidence at different cost:

Layer	Confidence about	Cost
Unit	Internal logic is correct	Low (fast, no infra)
Integration	Components interoperate correctly	Medium (some infra, slower)
E2E	Full user journeys work end-to-end	High (full infra, slowest)

The pyramid says: buy the cheap confidence in bulk, buy the expensive confidence selectively.

The inverted pyramid problem

Many teams naturally gravitate toward E2E tests because they feel more “real” — if the UI works, everything must work. In practice, E2E-heavy suites become so slow that builds take 30+ minutes, tests are skipped under pressure, and the suite becomes a liability rather than an asset. The pyramid corrects this instinct.
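
To make the cost difference concrete, here is a rough back-of-the-envelope sketch. The per-test durations (5 ms, 500 ms, 30 s) and the two suite shapes are illustrative assumptions, not measurements, and the model assumes tests run one after another with no parallelism.

```javascript
// Assumed average duration per test, in milliseconds (illustrative only).
const MS_PER_TEST = { unit: 5, integration: 500, e2e: 30000 };

// Total serial runtime of a suite, given how many tests sit at each layer.
function suiteRuntimeMs(counts) {
  return Object.entries(counts).reduce(
    (total, [layer, count]) => total + count * MS_PER_TEST[layer],
    0
  );
}

// Two hypothetical suites of 500 tests each.
const pyramid = { unit: 400, integration: 90, e2e: 10 };
const inverted = { unit: 10, integration: 90, e2e: 400 };

console.log((suiteRuntimeMs(pyramid) / 60000).toFixed(1), 'minutes'); // 5.8 minutes
console.log((suiteRuntimeMs(inverted) / 60000).toFixed(1), 'minutes'); // 200.8 minutes
```

Under these assumptions the same number of tests costs minutes in one shape and hours in the other. Real suites run in parallel and vary widely, so only the direction of the effect should be read from this, not the figures.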

Guidance

What to test at each layer

Layer	Test here	Don’t test here
Unit	Business logic, algorithms, transformations, validation, error paths	Integration wiring, UI rendering, external calls
Integration	Repository queries, service-to-service contracts, message queue consumers, API contract	Individual algorithm logic (unit territory), full user scenarios
E2E	Critical user journeys (signup, checkout, core workflow)	Every edge case, every error state, internal logic

Approximate ratios

A common target is roughly 70% unit / 20% integration / 10% E2E. Treat it as a starting point rather than a quota: the exact ratio should reflect the architecture. A pure serverless system with many small functions and external integrations may have more integration tests. A monolith with complex domain logic may be almost entirely unit tests.
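
One way to see where a suite currently stands is to compute each layer's share of the total. This is a minimal sketch, assuming the test counts per layer are already known (for example from the test runner's report). The function name and the sample counts are hypothetical.

```javascript
// Returns each layer's share of the suite as a whole-number percentage.
// Because of rounding, the three values may not add up to exactly 100.
function layerShares(counts) {
  const total = counts.unit + counts.integration + counts.e2e;
  if (total === 0) {
    throw new Error('suite has no tests');
  }
  const pct = (n) => Math.round((n / total) * 100);
  return {
    unit: pct(counts.unit),
    integration: pct(counts.integration),
    e2e: pct(counts.e2e),
  };
}

// Hypothetical suite: 420 unit, 130 integration, and 50 E2E tests.
console.log(layerShares({ unit: 420, integration: 130, e2e: 50 }));
// { unit: 70, integration: 22, e2e: 8 }
```

The result is better used as a prompt for discussion than as a build gate, since the right shape depends on the system being tested.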

The “just enough E2E” principle

E2E tests should cover the scenarios where a bug would cost the most: signup, login, payment, and the core value delivery path of the product. Everything else can be covered more cheaply at a lower layer.

Examples

Correctly placed tests for an order system

Unit tests:
  - OrderCalculator.applyDiscount() correctly caps at MAX_RATE
  - OrderValidator.validate() rejects empty carts
  - PriceFormatter.format() handles currency localisation

Integration tests:
  - OrderRepository.findById() returns correct order from test DB
  - PaymentService correctly calls Stripe with formatted amount
  - OrderCreatedEvent is published to queue on successful order

E2E tests:
  - A user can add items to cart and complete checkout
  - A user with an expired card sees a payment failure message
  - A logged-out user is redirected to login before checkout

Misplaced tests (anti-pyramid)

// Wrong: testing algorithm logic via E2E
test('discount is applied correctly to premium users', async () => {
  await loginAs('premium-user@example.com');
  await navigateTo('/products/widget');
  await clickButton('Add to cart');
  await navigateTo('/checkout');
  // ... 15 more steps just to verify a discount calculation
});

// Right: test the discount logic in a unit test
test('calculateDiscount applies 15% for premium users', () => {
  expect(calculateDiscount(100, 'premium')).toBe(85);
});
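
For reference, a function that would satisfy the unit test above might look like the following. The source does not show `calculateDiscount`, so the tier table and the handling of unknown tiers are assumptions. The only behaviour taken from the text is that a premium user pays 85 on a price of 100.

```javascript
// Assumed discount rates per customer tier, in whole percent.
// Only the premium rate (15%) comes from the test above.
const DISCOUNT_PERCENT = new Map([
  ['standard', 0],
  ['premium', 15],
]);

// Returns the price after the tier's discount has been applied.
function calculateDiscount(price, tier) {
  const percent = DISCOUNT_PERCENT.get(tier);
  if (percent === undefined) {
    throw new Error(`unknown customer tier: ${tier}`);
  }
  return (price * (100 - percent)) / 100;
}

console.log(calculateDiscount(100, 'premium')); // 85
console.log(calculateDiscount(100, 'standard')); // 100
```

Because the function is pure, the test needs no login, cart, or browser state, and a failure points directly at the calculation.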

Anti-patterns

1. The inverted pyramid

Most tests are E2E. Build is slow (30+ minutes). Tests are flaky because they depend on timing and full-stack state. New test failures are vague and slow to diagnose. Teams begin skipping the suite.

2. The ice-cream cone

Atop the inverted pyramid sits a large block of manual QA. Most verification is slow and manual, which bottlenecks every release. The automation layer is thin, fragile, and untrusted.

3. Integration tests for unit-territory logic

This means writing a test that spins up a database just to verify a discount calculation. The unit layer is the cheaper and more precise place for that check.

4. Unit tests with too many mocks

A unit test that mocks every collaborator mostly verifies that the function calls its mocks in the expected order; it exercises little real logic. Over-mocking is a sign that the logic should either be extracted so it is genuinely isolated or be tested at the integration layer with real collaborators.
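
The difference can be sketched as follows. `OrderService`, `InMemoryOrderRepository`, and the order shape are hypothetical names invented for this example. The first check asserts only on an interaction with a mock. The second wires in a small in-memory stand-in for the repository and checks the observable result instead.

```javascript
// Hypothetical service: totals an order's lines and stores the result.
class OrderService {
  constructor(repository) {
    this.repository = repository;
  }

  placeOrder(id, lines) {
    const totalCents = lines.reduce(
      (sum, line) => sum + line.unitCents * line.quantity,
      0
    );
    this.repository.save({ id, totalCents });
    return totalCents;
  }
}

// Over-mocked style: the check only proves that save() was called.
// It would still pass if the total were computed incorrectly.
const calls = [];
const mockRepository = { save: () => calls.push('save') };
new OrderService(mockRepository).placeOrder('o-1', [{ unitCents: 250, quantity: 2 }]);
console.log(calls.length === 1); // true, but says nothing about the total

// State-based style: an in-memory fake lets the test check the outcome.
class InMemoryOrderRepository {
  constructor() {
    this.orders = new Map();
  }
  save(order) {
    this.orders.set(order.id, order);
  }
  findById(id) {
    return this.orders.get(id);
  }
}

const repository = new InMemoryOrderRepository();
new OrderService(repository).placeOrder('o-2', [
  { unitCents: 250, quantity: 2 },
  { unitCents: 100, quantity: 1 },
]);
console.log(repository.findById('o-2').totalCents); // 600
```

An in-memory fake is a middle ground rather than a real collaborator. Whether the persistence itself works against a real database remains a question for the integration layer.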

5. E2E tests for every story

“Every feature gets an E2E test” sounds rigorous but produces a slow, fragile suite. E2E tests cover journeys, not features. Most story-level behaviour belongs at the unit or integration layer.

Part of the PushBackLog Best Practices Library.
