Test-Driven Development (TDD)
Status: Complete
Category: Testing
Default enforcement: Soft
Author: PushBackLog team
Tags
- Topic: testing, quality
- Methodology: TDD
- Skillset: any
- Technology: generic
- Stage: refinement, execution
Summary
Test-Driven Development is a development methodology in which tests are written before implementation code. The cycle is: Red (write a failing test that describes the desired behaviour) → Green (write the minimum code needed to make the test pass) → Refactor (clean up the implementation while keeping tests green).
TDD produces code that is testable by construction, drives minimal implementations, and creates a living specification of the system’s behaviour.
Rationale
Writing tests after implementation is a discipline most developers intend to follow and most teams struggle to maintain. Under time pressure, tests get skipped. The result is a codebase that works today but becomes progressively harder to change safely.
TDD inverts the order. When a test is written first, the developer is forced to think about the interface of the code before writing it: what it should do, what inputs it accepts, and what outputs it should produce. This interface-first thinking tends to produce better-designed, more modular code.
For AI persona execution, TDD provides a precise contract. A persona given failing tests as part of a work item has an unambiguous definition of success: it can iterate until the tests pass, and the tests serve as the acceptance criteria. This makes TDD particularly well suited to AI-assisted development.
Guidance
The Red-Green-Refactor cycle
| Step | What happens | Rule |
|---|---|---|
| Red | Write a failing test that describes one piece of desired behaviour | Test must fail for the right reason (the behaviour is not implemented yet, not a typo or syntax error in the test) |
| Green | Write the minimum code to make the test pass | No more than necessary; “fake it till you make it” is valid temporarily |
| Refactor | Clean up implementation and tests; improve naming, remove duplication | All tests must remain green; no new functionality |
The discipline of the cycle: never skip Refactor, and never write production code without a failing test asking for it.
What makes a good first test
- It tests a single, specific behaviour (not all inputs, not all edge cases at once)
- It expresses intent in the test name: it('rejects an order with no items')
- It is small enough to pass with the simplest possible implementation
- It fails before any implementation exists
Start with the happy path or the simplest case. Add edge cases and error cases as subsequent Red steps.
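As a rough sketch of that progression, the snippet below starts with the simplest case and adds the empty-order edge case as a second Red step. The names `validateOrder`, `check` and `expectEqual` are hypothetical, introduced only for illustration; `check` is a hand-rolled stand-in for a test runner's `it`, so that the sketch runs on its own.

```typescript
// Hypothetical types and names, for illustration only.
type OrderItem = { price: number; quantity: number };
type ValidationResult = { valid: boolean; reason?: string };

// Tiny stand-in for a test runner's `it`: runs the body and reports the name.
function check(name: string, body: () => void): void {
  body();
  console.log(`passed: ${name}`);
}

// Throws when the two values differ, so a failing check stops the run.
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Implementation after two Red-Green rounds. The first test alone could be
// satisfied by `return { valid: true }`; the second test forced the length check.
function validateOrder(items: OrderItem[]): ValidationResult {
  if (items.length === 0) {
    return { valid: false, reason: 'order has no items' };
  }
  return { valid: true };
}

// First test: the simplest case, one specific behaviour.
check('accepts an order with one item', () => {
  expectEqual(validateOrder([{ price: 1000, quantity: 1 }]).valid, true);
});

// Second Red step: a single edge case, named for its intent.
check('rejects an order with no items', () => {
  expectEqual(validateOrder([]).valid, false);
});
```

Each test name reads as a statement about behaviour, and neither test needed more than a line or two of implementation to go green.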
When TDD is impractical
TDD works best for logic-heavy code with clear inputs and outputs. It is harder to apply to:
- Exploratory spikes: when you don’t yet know the right abstraction, write throwaway code first, then delete it and TDD the real implementation
- UI and layout: visual appearance is better explored interactively; TDD unit logic and integration points
- Legacy code with no test seam: use characterisation tests first to establish a baseline, then TDD new behaviour added in that area
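For the legacy-code case, a characterisation test might look like the sketch below. `legacyShippingFee` and its pricing rules are invented for illustration; the point is that the expected values are recorded from what the code does today, not derived from a specification.

```typescript
// Hypothetical legacy function with undocumented rules (an assumption for
// this sketch, not taken from any real codebase).
function legacyShippingFee(weightKg: number): number {
  if (weightKg <= 0) return 0;
  if (weightKg < 5) return 499;
  return 499 + Math.ceil(weightKg - 5) * 100;
}

// Characterisation table: inputs paired with the outputs observed by running
// the existing code. These pin current behaviour, including any quirks.
const observed: Array<[number, number]> = [
  [0, 0],
  [1, 499],
  [5, 499],
  [5.5, 599],
  [12, 1199],
];

for (const [weightKg, recordedFee] of observed) {
  const actual = legacyShippingFee(weightKg);
  if (actual !== recordedFee) {
    throw new Error(`fee for ${weightKg} kg changed: was ${recordedFee}, now ${actual}`);
  }
}
console.log('legacy behaviour unchanged');
```

With this baseline in place, new behaviour in the same area can be added test-first, and any accidental change to the existing rules shows up as a failure in the table.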
Examples
Red-Green-Refactor in practice
// Step 1 (RED): write the failing test
it('calculates the total for a single item order', () => {
  const total = calculateOrderTotal([{ price: 1000, quantity: 2 }]);
  expect(total).toBe(2000);
});
// Fails: calculateOrderTotal is not defined

// Step 2 (GREEN): minimum implementation
function calculateOrderTotal(items: { price: number; quantity: number }[]): number {
  return items[0].price * items[0].quantity; // Fake it: works for one item
}
// Test passes

// Step 3 (RED again): a second test exposes the fake
it('calculates the total for a multi-item order', () => {
  const total = calculateOrderTotal([
    { price: 1000, quantity: 2 },
    { price: 500, quantity: 3 },
  ]);
  expect(total).toBe(3500);
});
// Fails: fake implementation returns 2000, not 3500

// GREEN: generalise
function calculateOrderTotal(items: { price: number; quantity: number }[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}
// Both tests pass

// REFACTOR: extract type, improve names
type OrderItem = { price: number; quantity: number };
function calculateOrderTotal(items: OrderItem[]): number {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}
TDD as a design tool
Writing a test before the implementation reveals API design problems before they are baked in. If the test setup is awkward — requiring many dependencies, complex construction, or fragile state management — the code being tested has a design problem. TDD makes this friction visible at the point when it is cheapest to fix.
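One common form of this friction is a hidden dependency on the system clock. The sketch below uses hypothetical names (`Order`, `EXPIRY_MS`, `isExpiredNow`, `isExpired`) and an assumed 24-hour expiry rule to show how writing the test first might push the dependency into the function signature.

```typescript
// Hypothetical order type; `placedAt` is a timestamp in epoch milliseconds.
type Order = { placedAt: number };

// Assumed business rule for this sketch: orders expire after 24 hours.
const EXPIRY_MS = 24 * 60 * 60 * 1000;

// Awkward to test: the function reads the clock itself, so a test cannot
// control "now" without patching Date or waiting in real time.
function isExpiredNow(order: Order): boolean {
  return Date.now() - order.placedAt > EXPIRY_MS;
}

// About the only stable check available: an order placed this instant.
console.log(isExpiredNow({ placedAt: Date.now() })); // false

// Writing the test first exposes the friction, and the simplest fix is to
// pass the current time in. The boundary cases become one-line assertions.
function isExpired(order: Order, now: number): boolean {
  return now - order.placedAt > EXPIRY_MS;
}

const order: Order = { placedAt: 0 };
console.log(isExpired(order, EXPIRY_MS));     // false: exactly 24 hours old
console.log(isExpired(order, EXPIRY_MS + 1)); // true: one millisecond past
```

The second signature is also easier to reuse, since callers decide which clock to trust.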
Anti-patterns
1. Writing tests after implementation and calling it TDD
Post-implementation tests confirm the code does what it currently does, not what it should do. The design benefits of TDD (interface-first thinking, minimal implementation) are not available after the fact. Write tests after implementation only when TDD genuinely wasn’t viable.
2. Testing implementation details rather than behaviour
// Tests the internal method name, not the behaviour
it('calls _computeItemSubtotal for each item', () => {
  const spy = jest.spyOn(service, '_computeItemSubtotal');
  service.calculateTotal(items);
  expect(spy).toHaveBeenCalledTimes(items.length);
});
Tests tied to private methods or internal call structure break with every refactor. Test the public contract: given these inputs, expect this output.
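A behaviour-focused counterpart could look like the following sketch. `OrderService` here is a hypothetical class written for illustration; the check touches only the public `calculateTotal` method, so the private helper can be renamed, inlined or removed without breaking it.

```typescript
type OrderItem = { price: number; quantity: number };

// Hypothetical service for this sketch. The private helper is an internal
// detail that no test refers to.
class OrderService {
  calculateTotal(items: OrderItem[]): number {
    return items.reduce((sum, item) => sum + this.computeItemSubtotal(item), 0);
  }

  private computeItemSubtotal(item: OrderItem): number {
    return item.price * item.quantity;
  }
}

// Public contract only: given these inputs, expect this output.
const service = new OrderService();
const total = service.calculateTotal([
  { price: 1000, quantity: 2 },
  { price: 500, quantity: 3 },
]);

if (total !== 3500) {
  throw new Error(`expected 3500, got ${total}`);
}
console.log('total matches the public contract');
```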
3. Skipping the Refactor step
Red → Green without Refactor produces test-covered spaghetti. The tests are green but the code still has duplication, poor names, and structural problems. The Refactor step is not optional; it is where the design materialises.
4. Over-mocking to the point of meaninglessness
// This test doesn't test anything; everything is mocked
it('places an order', async () => {
  mockValidator.validate.mockResolvedValue(true);
  mockRepo.save.mockResolvedValue({ id: '123' });
  mockNotifier.notify.mockResolvedValue(undefined);
  const result = await placeOrder(mockValidator, mockRepo, mockNotifier, orderData);
  expect(result.id).toBe('123'); // Trivially true; the mock returns it
});
When every dependency is mocked, the test verifies only the wiring between mocks: canned values go in and the same canned values come out. Test with real implementations where feasible; reserve mocks for boundaries (network, database) and slow dependencies.
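One way to apply this, sketched under assumptions: the interfaces and the `placeOrder` body below are hypothetical (the original snippet shows only how it is called), the validator is real logic, the repository is an in-memory fake, and only the notifier, which stands in for a network boundary, is a recording stub.

```typescript
// Hypothetical shapes, inferred from the call in the snippet above.
type OrderData = { items: { price: number; quantity: number }[] };
type SavedOrder = { id: string };

interface OrderValidator { validate(order: OrderData): Promise<boolean>; }
interface OrderRepo { save(order: OrderData): Promise<SavedOrder>; }
interface Notifier { notify(orderId: string): Promise<void>; }

// Real validation logic, not a mock: an order needs at least one item.
const validator: OrderValidator = {
  async validate(order: OrderData): Promise<boolean> {
    return order.items.length > 0;
  },
};

// In-memory fake: behaves like a repository without touching a database.
class InMemoryOrderRepo implements OrderRepo {
  readonly saved: OrderData[] = [];
  async save(order: OrderData): Promise<SavedOrder> {
    this.saved.push(order);
    return { id: String(this.saved.length) };
  }
}

// Recording stub at the network boundary, the one place a double is warranted.
class RecordingNotifier implements Notifier {
  readonly notified: string[] = [];
  async notify(orderId: string): Promise<void> {
    this.notified.push(orderId);
  }
}

// Assumed implementation of placeOrder, for illustration only.
async function placeOrder(
  v: OrderValidator, repo: OrderRepo, notifier: Notifier, order: OrderData,
): Promise<SavedOrder> {
  if (!(await v.validate(order))) throw new Error('invalid order');
  const saved = await repo.save(order);
  await notifier.notify(saved.id);
  return saved;
}

// A check that can actually fail: an empty order must be rejected by the real
// validator, and nothing may be saved or notified as a result.
async function emptyOrderIsRejected(): Promise<boolean> {
  const repo = new InMemoryOrderRepo();
  const notifier = new RecordingNotifier();
  try {
    await placeOrder(validator, repo, notifier, { items: [] });
    return false; // placeOrder should have thrown
  } catch {
    return repo.saved.length === 0 && notifier.notified.length === 0;
  }
}

emptyOrderIsRejected().then((ok) => console.log(ok ? 'rejected as expected' : 'FAILED'));
```

Because the validator is real, breaking the validation rule makes this check fail, which the fully mocked version cannot do.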
Related practices
- Behaviour-Driven Development (BDD)
- The Test Pyramid
- Unit vs Integration vs E2E Testing
- Acceptance Criteria Quality
Part of the PushBackLog Best Practices Library.