Mocking Strategy
Status: Complete
Category: Testing
Default enforcement: Advisory
Author: PushBackLog team
Tags
- Topic: testing, quality
- Skillset: any
- Technology: generic
- Stage: execution, review
Summary
Mocking is the practice of replacing real collaborators (databases, APIs, services) with controlled substitutes in tests. Effective mocking strategy defines when to use mocks, stubs, spies, or fakes — and when to prefer real implementations — in order to keep tests both isolated and meaningful.
Rationale
Why test doubles exist
Unit tests must be fast, deterministic, and isolated. Real databases are slow. Real payment gateways involve network calls. Real file systems are stateful. Test doubles replace these real collaborators with controlled substitutes that:
- Respond predictably to inputs the test controls
- Introduce no state from previous test runs
- Need no external infrastructure to exist
- Can simulate error conditions that are hard to reproduce with real systems
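As a sketch of the last point, a stub can deterministically simulate a failure that would be hard to trigger against a real system (the `PaymentGateway` interface and error shape below are illustrative, not from any real codebase):

```typescript
// Illustrative boundary interface — an assumption for this sketch
interface PaymentGateway {
  charge(amountCents: number): Promise<{ ok: boolean; error?: string }>;
}

// Stub that always simulates a gateway timeout — no network involved
const failingGateway: PaymentGateway = {
  charge: async () => ({ ok: false, error: 'GATEWAY_TIMEOUT' }),
};

// Code under test can now be exercised against the error path on demand
async function chargeWithRetryMessage(gw: PaymentGateway): Promise<string> {
  const result = await gw.charge(1000);
  return result.ok ? 'charged' : `retry later: ${result.error}`;
}
```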
The spectrum of test doubles
Gerard Meszaros catalogued the vocabulary in xUnit Test Patterns (2007). The terms are often used loosely in practice:
| Type | Behaviour |
|---|---|
| Dummy | Passed but never used. Fills a required parameter. |
| Stub | Returns hardcoded responses. No logic. Used for state-based testing. |
| Fake | Simplified real implementation (e.g., in-memory database). Has working logic. |
| Spy | Records calls it receives; can be asserted on afterward. |
| Mock | Pre-programmed with expectations; fails if not called in the expected way. |
Over-mocking: when doubles become the problem
Mocks represent a trade-off. Every mock is a bet that the real collaborator behaves the same way the mock does. When that bet is wrong — the real DB has different null-handling, the real API returns a slightly different shape — the tests still pass but the system fails. Over-mocked tests verify the test setup, not the system.
Guidance
When to mock vs when to use real
| Collaborator | Recommendation |
|---|---|
| External HTTP API (payment, email, SMS) | Mock — flakiness and side effects are unacceptable in tests |
| Database | Use a real test database (via container) for integration tests; mock only at unit level |
| File system | Use temp files or an in-memory filesystem; mock if the file handling is not what’s being tested |
| Another service in the same repository | Prefer a fake or real instance; mock only if it creates infra complexity |
| Clock / time | Always mock — time-dependent tests are non-deterministic |
| Random number generator | Always mock — same reason |
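The clock row, for example, is usually handled by injecting a clock abstraction rather than calling `Date.now()` directly. A minimal sketch (the `Clock` interface here is an assumption, not an API from this library):

```typescript
interface Clock {
  now(): Date;
}

// Production: real time
const systemClock: Clock = { now: () => new Date() };

// Test double: a fixed instant makes time-dependent logic deterministic
const fixedClock: Clock = { now: () => new Date('2024-01-15T00:00:00Z') };

function isExpired(expiresAt: Date, clock: Clock): boolean {
  return clock.now().getTime() > expiresAt.getTime();
}
```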
Prefer fakes over mocks for complex collaborators
A fake is a simplified but working implementation: an in-memory repository, an in-process email collector, a local SMTP server. Fakes are more resilient than mocks: they don’t break when internal call signatures change, they catch more real bugs, and they can be shared across tests.
Mock at the boundary, not through it
Mock at the outermost boundary of the system you control. If you’re testing OrderService, mock IPaymentGateway (the boundary), not the internal methods of PaymentGateway (through it). Mocking internal methods couples the test to implementation details that should be free to change.
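A sketch of what mocking at the boundary looks like. `OrderService` and `IPaymentGateway` are named above; their shapes here are assumed for illustration:

```typescript
// The boundary interface — the only thing the test replaces
interface IPaymentGateway {
  charge(orderId: string, amountCents: number): Promise<boolean>;
}

class OrderService {
  constructor(private gateway: IPaymentGateway) {}

  async processOrder(orderId: string, amountCents: number): Promise<'paid' | 'failed'> {
    const ok = await this.gateway.charge(orderId, amountCents);
    return ok ? 'paid' : 'failed';
  }
}

// The test stubs the boundary; OrderService internals stay free to change
const gatewayStub: IPaymentGateway = { charge: async () => true };
const service = new OrderService(gatewayStub);
```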
Verify behaviour, not implementation
Assert on observable outcomes, not on how the implementation achieved them. If a test asserts that logger.info was called three times with specific strings, it is coupled to logging internals that will change. If the test asserts that the returned order has a confirmedAt timestamp, it is verifying observable behaviour.
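The contrast can be sketched as follows (the `Order` shape and `confirmOrder` function are illustrative; the commented assertions show the two styles):

```typescript
// Illustrative order shape — an assumption for this sketch
interface Order { id: string; confirmedAt: Date | null; }

// The behaviour callers observe: confirming an order stamps it
function confirmOrder(order: Order, now: () => Date): Order {
  return { ...order, confirmedAt: now() };
}

// Behavioural assertion — survives internal refactors:
//   expect(result.confirmedAt).not.toBeNull();
// Implementation-coupled assertion — breaks when logging changes (avoid):
//   expect(logger.info).toHaveBeenCalledTimes(3);
```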
Examples
Stub for state testing
```typescript
// Stub: always returns a fixed user, regardless of input
const userRepository: IUserRepository = {
  findById: () => Promise.resolve({ id: '1', email: 'a@b.com', loyaltyTier: 'gold' }),
  save: () => Promise.resolve(),
};

const service = new DiscountService(userRepository);
const discount = await service.calculateDiscount('1', 100);
expect(discount).toBe(15); // 15% gold tier discount
```
The stub removes the database dependency. The test verifies the discount logic.
Fake for richer scenarios
```typescript
// In-memory fake — supports real state across calls
class FakeUserRepository implements IUserRepository {
  private store = new Map<string, User>();

  async findById(id: string) { return this.store.get(id) ?? null; }
  async save(user: User) { this.store.set(user.id, user); }
}

// Can be used across multiple test scenarios without a real DB
const repo = new FakeUserRepository();
await repo.save({ id: '1', email: 'a@b.com', loyaltyTier: 'gold' });
```
Over-mocking trap
```typescript
// This test only verifies that the function calls other functions —
// not that it produces the right result
const mockCalc = jest.fn().mockReturnValue(15);
const mockRepo = { save: jest.fn() };

const service = new OrderService(mockCalc, mockRepo);
await service.processOrder(order);

expect(mockCalc).toHaveBeenCalledWith(order, 'gold'); // Testing call sequence, not outcome
expect(mockRepo.save).toHaveBeenCalled();
```
This test will pass even if the discount amount is wrong, as long as the functions were called.
Anti-patterns
1. Mocking everything
A test where every collaborator is mocked verifies the test setup, not the system. If refactoring the internal implementation makes tests fail without changing observable behaviour, the tests are too tightly coupled to implementation.
2. Mocking what you don’t own (external libraries)
Mocking axios, pg, or fs directly is brittle. Mock the abstraction you own (e.g., IHttpClient) that wraps the library. Your abstraction is stable; the library’s API may change.
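One way to sketch the owned abstraction (the `IHttpClient` shape is an assumption; the production adapter uses the global `fetch` available in Node 18+):

```typescript
// The abstraction you own — stable across library upgrades
interface IHttpClient {
  getJson<T>(url: string): Promise<T>;
}

// Production adapter wraps the third-party HTTP layer (fetch here)
const fetchClient: IHttpClient = {
  getJson: async <T>(url: string): Promise<T> => {
    const res = await fetch(url);
    return res.json() as Promise<T>;
  },
};

// Tests stub the abstraction, never the library
const stubClient: IHttpClient = {
  getJson: async <T>() => ({ status: 'ok' } as T),
};
```

If `fetch` is later swapped for another HTTP library, only `fetchClient` changes; every test against `IHttpClient` keeps passing.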
3. Testing mock interactions instead of observable outcomes
Assertions on expect(mockFn).toHaveBeenCalledWith(...) are fine for “a side effect was triggered” — e.g., “an email was sent”. They’re fragile for “the function worked correctly” because they couple to internal mechanics.
4. Shared mutable mock state between tests
Mocks or fakes that retain state between tests produce order-dependent, flaky test suites. Always reset mock state in beforeEach.
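For hand-rolled fakes, the equivalent is an explicit reset (or recreating the instance) wired into the test lifecycle. A sketch with an invented `FakeSettingsStore`:

```typescript
// Hand-rolled fake with an explicit reset so each test starts clean
class FakeSettingsStore {
  private store = new Map<string, string>();
  set(key: string, value: string) { this.store.set(key, value); }
  get(key: string) { return this.store.get(key) ?? null; }
  reset() { this.store.clear(); }
}

const settings = new FakeSettingsStore();

// In a jest suite, wire the reset into the lifecycle:
//   beforeEach(() => settings.reset());
settings.set('theme', 'dark');
settings.reset(); // no state leaks into the next test
```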
5. Mocks that can never fail
A mock that always returns the happy-path response doesn’t test error handling. Explicitly test failure scenarios by configuring your doubles to return errors, null, or unexpected shapes.
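A sketch of a configurable stub whose behaviour the test chooses, so failure paths are as easy to exercise as the happy path (the `UserApi` shape and `greet` function are illustrative):

```typescript
interface UserApi {
  fetchUser(id: string): Promise<{ id: string } | null>;
}

// A configurable stub: each test picks happy path, missing data, or error
function makeUserApiStub(behaviour: 'ok' | 'missing' | 'error'): UserApi {
  return {
    fetchUser: async (id: string) => {
      if (behaviour === 'error') throw new Error('upstream unavailable');
      if (behaviour === 'missing') return null;
      return { id };
    },
  };
}

// Code under test, with error handling the failure stubs can reach
async function greet(api: UserApi, id: string): Promise<string> {
  try {
    const user = await api.fetchUser(id);
    return user ? `hello ${user.id}` : 'who?';
  } catch {
    return 'try again later';
  }
}
```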
Related practices
Part of the PushBackLog Best Practices Library.