The Test Pyramid
Status: Complete
Category: Testing
Default enforcement: Soft
Author: PushBackLog team
Tags
- Topic: testing, quality, architecture
- Skillset: any
- Technology: generic
- Stage: execution, review
Summary
The test pyramid describes the ideal distribution of automated tests: many unit tests at the base, fewer integration tests in the middle, and a small number of end-to-end tests at the top. A healthy test suite is fast, reliable, and provides precise failure diagnosis — properties that correlate strongly with having more tests at the lower levels.
Rationale
Speed and feedback loops
Mike Cohn introduced the test pyramid in Succeeding with Agile (2009) to make explicit a piece of emerging practical wisdom: different categories of test offer fundamentally different trade-offs. Unit tests run in milliseconds, give precise failure messages, and require no infrastructure. End-to-end tests can take seconds to minutes each, often produce ambiguous failures (“the test failed” doesn’t tell you which line is at fault), and require the entire stack to be running. A suite dominated by E2E tests is slow, fragile, and expensive to maintain.
The pyramid is a ratio recommendation:
          /\
         /E2E\          Small number: cover critical user journeys only
        /-----\
       / Integ \        Moderate number: verify contracts between components
      /---------\
     / Unit Tests \     Large number: verify all logic in isolation
    /______________\
Confidence vs cost
Each layer offers different confidence at different cost:
| Layer | Confidence about | Cost |
|---|---|---|
| Unit | Internal logic is correct | Low (fast, no infra) |
| Integration | Components interoperate correctly | Medium (some infra, slower) |
| E2E | Full user journeys work end-to-end | High (full infra, slowest) |
The pyramid says: buy the cheap confidence in bulk, buy the expensive confidence selectively.
The inverted pyramid problem
Many teams naturally gravitate toward E2E tests because they feel more “real” — if the UI works, everything must work. In practice, E2E-heavy suites become so slow that builds take 30+ minutes, tests are skipped under pressure, and the suite becomes a liability rather than an asset. The pyramid corrects this instinct.
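The runtime effect can be made concrete with some rough arithmetic. The sketch below is illustrative only: the per-test durations and the suite sizes are assumptions chosen for the example, not measurements, and `serialRuntimeMinutes` is a hypothetical helper that ignores parallelism and retries.

```typescript
// Assumed average duration per test, in milliseconds. Illustrative values only.
const MS_PER_TEST = { unit: 10, integration: 1000, e2e: 30000 };

interface SuiteShape {
  unit: number;
  integration: number;
  e2e: number;
}

// Total runtime in minutes if every test runs one after another.
function serialRuntimeMinutes(suite: SuiteShape): number {
  const totalMs =
    suite.unit * MS_PER_TEST.unit +
    suite.integration * MS_PER_TEST.integration +
    suite.e2e * MS_PER_TEST.e2e;
  return totalMs / 60000;
}

// Two hypothetical suites with the same 200 tests, distributed differently.
const pyramidSuite: SuiteShape = { unit: 140, integration: 40, e2e: 20 };
const invertedSuite: SuiteShape = { unit: 20, integration: 40, e2e: 140 };

console.log(serialRuntimeMinutes(pyramidSuite).toFixed(1));  // about 10.7 minutes
console.log(serialRuntimeMinutes(invertedSuite).toFixed(1)); // about 70.7 minutes
```

Under these assumptions the same number of tests costs several times more wall-clock time when the distribution is inverted, and almost all of that time is spent in the E2E layer.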
Guidance
What to test at each layer
| Layer | Test here | Don’t test here |
|---|---|---|
| Unit | Business logic, algorithms, transformations, validation, error paths | Integration wiring, UI rendering, external calls |
| Integration | Repository queries, service-to-service contracts, message queue consumers, API contract | Individual algorithm logic (unit territory), full user scenarios |
| E2E | Critical user journeys (signup, checkout, core workflow) | Every edge case, every error state, internal logic |
Approximate ratios
A common target: 70% unit / 20% integration / 10% E2E. The exact ratio should reflect the architecture. A pure serverless system with many small functions and external integrations may have more integration tests. A monolith with complex domain logic may be almost entirely unit tests.
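As a rough way to check a suite against such a target, the following sketch computes each layer's share of the total and flags layers that drift too far from it. The helper names, the 10-point tolerance, and the example counts are all assumptions made for illustration; a real team would pick a target and tolerance that fit its architecture.

```typescript
type Layer = "unit" | "integration" | "e2e";
type LayerCounts = Record<Layer, number>;

// The 70/20/10 target mentioned in the text, in percentage points.
const TARGET: LayerCounts = { unit: 70, integration: 20, e2e: 10 };

// Converts raw test counts into rounded percentages of the whole suite.
function layerPercentages(counts: LayerCounts): LayerCounts {
  const total = counts.unit + counts.integration + counts.e2e;
  if (total === 0) {
    return { unit: 0, integration: 0, e2e: 0 };
  }
  const pct = (n: number): number => Math.round((n / total) * 100);
  return {
    unit: pct(counts.unit),
    integration: pct(counts.integration),
    e2e: pct(counts.e2e),
  };
}

// Lists layers whose share differs from the target by more than `tolerance` points.
function layersOutsideTarget(counts: LayerCounts, tolerance = 10): Layer[] {
  const actual = layerPercentages(counts);
  const layers: Layer[] = ["unit", "integration", "e2e"];
  return layers.filter((layer) => Math.abs(actual[layer] - TARGET[layer]) > tolerance);
}

// Hypothetical E2E-heavy suite: 40 unit, 60 integration, 300 E2E tests.
const e2eHeavy: LayerCounts = { unit: 40, integration: 60, e2e: 300 };
console.log(layerPercentages(e2eHeavy));    // { unit: 10, integration: 15, e2e: 75 }
console.log(layersOutsideTarget(e2eHeavy)); // [ 'unit', 'e2e' ]
```

A check like this is a conversation starter rather than a gate: a layer falling outside the target is a prompt to ask whether the architecture justifies it.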
The “just enough E2E” principle
E2E tests should cover the scenarios where a bug would cost the most: signup, login, payment, and the core value delivery path of the product. Everything else can be covered more cheaply at a lower layer.
Examples
Correctly placed tests for an order system
Unit tests:
- OrderCalculator.applyDiscount() correctly caps at MAX_RATE
- OrderValidator.validate() rejects empty carts
- PriceFormatter.format() handles currency localisation
Integration tests:
- OrderRepository.findById() returns correct order from test DB
- PaymentService correctly calls Stripe with formatted amount
- OrderCreatedEvent is published to queue on successful order
E2E tests:
- A user can add items to cart and complete checkout
- A user with an expired card sees a payment failure message
- A logged-out user is redirected to login before checkout
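To show how small a correctly placed unit test can be, here is a minimal sketch of the first unit test in the list. The source names only `OrderCalculator.applyDiscount()` and `MAX_RATE`; the value of `MAX_RATE`, the method signature, and the `assertEqual` helper (standing in for a test runner's assertion) are assumptions made for this example.

```typescript
// Assumed cap on the discount rate. The real value is not given in the text.
const MAX_RATE = 0.5;

class OrderCalculator {
  // Hypothetical signature: returns the total after applying `rate`,
  // never discounting by more than MAX_RATE.
  static applyDiscount(total: number, rate: number): number {
    if (total < 0 || rate < 0) {
      throw new RangeError("total and rate must be non-negative");
    }
    const effectiveRate = Math.min(rate, MAX_RATE);
    return total * (1 - effectiveRate);
  }
}

// Minimal stand-in for a test runner's equality assertion.
function assertEqual(actual: number, expected: number, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// No database, network, or browser is needed to verify the cap.
assertEqual(OrderCalculator.applyDiscount(200, 0.25), 150, "rate below cap is applied as-is");
assertEqual(OrderCalculator.applyDiscount(200, 0.9), 100, "rate above cap is limited to MAX_RATE");
```

Both checks run in memory with no setup, which is what makes it practical to have many of them.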
Misplaced tests (anti-pyramid)
// Wrong: testing algorithm logic via E2E
test('discount is applied correctly to premium users', async () => {
  await loginAs('premium-user@example.com');
  await navigateTo('/products/widget');
  await clickButton('Add to cart');
  await navigateTo('/checkout');
  // ... 15 more steps just to verify a discount calculation
});

// Right: test the discount logic in a unit test
test('calculateDiscount applies 15% for premium users', () => {
  expect(calculateDiscount(100, 'premium')).toBe(85);
});
Anti-patterns
1. The inverted pyramid
Most tests are E2E. Build is slow (30+ minutes). Tests are flaky because they depend on timing and full-stack state. New test failures are vague and slow to diagnose. Teams begin skipping the suite.
2. The ice-cream cone
Atop the inverted pyramid sits a large block of manual QA. Tests are slow and manual, bottlenecking every release. The automation layer is thin, fragile, and untrusted.
3. Integration tests for unit-territory logic
Writing a test that spins up a database just to verify a discount calculation. The unit layer is the cheap, correct place for this.
4. Unit tests with too many mocks
A unit test that mocks every collaborator often ends up verifying only that the function calls its mocks in the right order, not that the logic produces the right result. Over-mocking is a sign that the logic should either be genuinely isolated (extracted so it can be tested without collaborators) or tested at the integration layer with real collaborators.
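The difference can be seen in a small sketch. Everything here is hypothetical and exists only for illustration: the `PriceSource` interface, the `cartTotal` function, and the hand-rolled mock and fake (a real project would more likely use its test framework's mocking tools).

```typescript
// Hypothetical collaborator that looks up a price for a SKU.
interface PriceSource {
  priceOf(sku: string): number;
}

// Logic under test: sums the prices of all SKUs in the cart.
function cartTotal(skus: string[], prices: PriceSource): number {
  return skus.reduce((sum, sku) => sum + prices.priceOf(sku), 0);
}

// Over-mocked style: the mock records calls and returns a dummy value,
// and the check looks only at the interaction.
const recordedCalls: string[] = [];
const recordingMock: PriceSource = {
  priceOf(sku) {
    recordedCalls.push(sku);
    return 0;
  },
};
cartTotal(["a", "b"], recordingMock);
// This would still hold if cartTotal summed the prices incorrectly.
const interactionCheckPasses = recordedCalls.join(",") === "a,b";

// Preferred style: a simple in-memory fake with real values,
// and the check looks at the computed result.
const priceTable = new Map<string, number>([
  ["a", 10],
  ["b", 15],
]);
const fakePrices: PriceSource = {
  priceOf: (sku) => priceTable.get(sku) ?? 0,
};
const total = cartTotal(["a", "b"], fakePrices);

console.log(interactionCheckPasses); // true
console.log(total);                  // 25
```

The interaction check says nothing about whether the total is right; the second check fails as soon as the summing logic breaks.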
5. E2E tests for every story
“Every feature gets an E2E test” sounds rigorous but produces a large, slow, fragile suite. E2E tests cover journeys, not features. Most story-level behaviour belongs at the unit or integration layer.