Mutation Testing

Status: Complete
Category: Testing
Default enforcement: Advisory
Author: PushBackLog team

Summary

Mutation testing measures the quality of a test suite by introducing small, deliberate faults (“mutations”) into the source code and checking whether the test suite detects each one. A mutation that is not caught by any test is a “surviving mutant” — evidence of a gap in test coverage that code coverage metrics would not have detected. Mutation testing reveals the difference between tests that execute code and tests that actually verify behaviour.

Rationale

Code coverage measures execution, not verification

A test suite with 90% line coverage can still be deeply inadequate. Coverage tools report whether a line of code was executed during the test run — not whether any test would fail if that line contained a bug. An assertion-free test, a test with a trivially true assertion, or a test that exercises code without checking results all achieve coverage while providing no protection.

Mutation testing answers the question code coverage cannot: “would our tests catch a bug here?”
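
To make the distinction concrete, here is a minimal sketch in plain TypeScript with no test framework. The function isEligibleForDiscount, its threshold of 100, and both "tests" are hypothetical and exist only for illustration. Each test takes the implementation as a parameter so the same test can be run against the original and against a hand-written mutant.

```typescript
type DiscountRule = (orderTotal: number) => boolean;

// Hypothetical function under test.
const isEligibleForDiscount: DiscountRule = (orderTotal) => orderTotal > 100;

// Hand-written mutant: > changed to >= (a boundary mutation).
const discountMutant: DiscountRule = (orderTotal) => orderTotal >= 100;

// Executes the code but checks nothing: full line coverage, no protection.
function weakTest(rule: DiscountRule): boolean {
  rule(150);
  return true; // passes regardless of what rule() returned
}

// Verifies results, including the boundary value 100.
function strongTest(rule: DiscountRule): boolean {
  return rule(150) === true && rule(100) === false && rule(50) === false;
}

console.log(weakTest(isEligibleForDiscount), weakTest(discountMutant));     // true true
console.log(strongTest(isEligibleForDiscount), strongTest(discountMutant)); // true false
```

Both tests cover the same line, but only the second one fails when the comparison operator changes.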

Surviving mutants are unrevealed assumptions

A surviving mutant shows that a specific fault — a changed comparison operator, a flipped boolean, a removed condition — would pass the entire test suite undetected. These are meaningful gaps. Either:

- The mutant reveals logic that is genuinely untested (needs more tests), or
- The mutant reveals behaviour that the team does not consider important to verify (conscious decision)

In both cases, the team has new information.

Guidance

How mutation testing works

1. Take the original passing test suite
2. Generate N mutated copies of the source code, each with one small change:
   - Change > to >= (boundary mutation)
   - Negate a boolean (logical mutation)
   - Remove a function call (statement deletion)
   - Replace + with - (arithmetic mutation)
   - Change && to || (logical connector)
3. Run the full test suite against each mutant
4. If tests pass → mutant survived (bad — tests didn't catch the fault)
5. If tests fail → mutant killed (good — tests caught the fault)
6. Report: mutation score = killed mutants / total mutants, usually expressed as a percentage
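
The steps above can be sketched as a small loop. This is a toy model rather than how any real tool is implemented: the mutants are written by hand as functions instead of being generated from source, and the names (Mutant, runMutationAnalysis, the sample tests) are assumptions made for the example.

```typescript
type Impl = (amount: number) => boolean;
type TestCase = (impl: Impl) => boolean; // returns true when the test passes

interface Mutant {
  description: string;
  impl: Impl;
}

// Hypothetical original: amount > 0
const originalImpl: Impl = (amount) => amount > 0;

// Step 2: each mutant differs from the original by one small change.
const sampleMutants: Mutant[] = [
  { description: "> changed to >=", impl: (amount) => amount >= 0 },
  { description: "> changed to <", impl: (amount) => amount < 0 },
  { description: "condition replaced with true", impl: () => true },
];

// Step 1: a suite that passes against the original.
const sampleTests: TestCase[] = [
  (impl) => impl(10) === true,
  (impl) => impl(-5) === false,
];

// Steps 3 to 6: run every test against every mutant and compute the score.
function runMutationAnalysis(mutants: Mutant[], tests: TestCase[]) {
  const survived: string[] = [];
  let killed = 0;
  for (const mutant of mutants) {
    const allTestsPass = tests.every((test) => test(mutant.impl));
    if (allTestsPass) {
      survived.push(mutant.description); // no test noticed the fault
    } else {
      killed++; // at least one test failed
    }
  }
  const total = mutants.length;
  return { killed, survived, score: total === 0 ? 0 : killed / total };
}

const report = runMutationAnalysis(sampleMutants, sampleTests);
console.log(report.killed, report.survived); // 2 [ '> changed to >=' ]
```

The surviving boundary mutant shows that no test exercises amount = 0; adding one would raise the score from 2/3 to 3/3.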

TypeScript example (Stryker)

// stryker.config.json
{
  "mutate": ["src/**/*.ts", "!src/**/*.spec.ts"],
  "testRunner": "jest",
  "reporters": ["html", "clear-text", "progress"],
  "thresholds": { "high": 80, "low": 60, "break": 50 }
}

npx stryker run

Sample output (illustrative; the exact layout depends on the reporter and the Stryker version):

Mutation score: 72.4% (312/431 mutants killed)
Survived mutants:

src/billing/invoice.ts:45:12 | EqualityOperator
- Original:  if (amount > 0) {
+ Mutated:   if (amount >= 0) {
  ... No test caught this mutation

src/auth/token.ts:89:5 | LogicalOperator
- Original:  return isValid && !isExpired;
+ Mutated:   return isValid || !isExpired;
  ... No test caught this mutation
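
As an illustration of how the second survivor in the sample output might arise and then be killed, the sketch below assumes the token check is a pure function of two booleans. The real contents of src/auth/token.ts are not shown in this document, so TokenCheck, existingTests, and addedTests are all hypothetical.

```typescript
type TokenCheck = (isValid: boolean, isExpired: boolean) => boolean;

// Assumed shape of the original logic and of the surviving mutant.
const originalCheck: TokenCheck = (isValid, isExpired) => isValid && !isExpired;
const mutatedCheck: TokenCheck = (isValid, isExpired) => isValid || !isExpired;

// A suite that covers only the two inputs on which && and || agree.
function existingTests(check: TokenCheck): boolean {
  return (
    check(true, false) === true && // valid, not expired: accepted
    check(false, true) === false   // invalid and expired: rejected
  );
}

// The two inputs on which the operators disagree.
function addedTests(check: TokenCheck): boolean {
  return (
    check(true, true) === false && // valid but expired: rejected
    check(false, false) === false  // invalid, not expired: rejected
  );
}

console.log(existingTests(originalCheck), existingTests(mutatedCheck)); // true true
console.log(addedTests(originalCheck), addedTests(mutatedCheck));       // true false
```

Under these assumptions the mutant survives the existing suite because neither test separates the two operators, and it is killed once the expired-token and invalid-token cases are asserted.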

Interpreting results

Mutation score	Interpretation
> 80%	Healthy test suite for the mutated code
60–80%	Adequate; specific gaps visible in the surviving mutants
< 60%	Test suite provides limited confidence — significant gaps

Do not target 100%. Some mutants are equivalent (semantically identical to the original, so no test can ever kill them), and others sit on paths too trivial to be worth verifying. Use the surviving mutant list to prioritise which gaps matter.
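
A small sketch of an equivalent mutant, using a hypothetical helper named largerOf. Changing > to >= only alters which branch runs when both arguments are equal, and in that case both branches return equal values, so no assertion on the result can tell the two versions apart. The brute-force comparison below is an illustration over a handful of integers, not a proof.

```typescript
// Hypothetical original.
function largerOf(a: number, b: number): number {
  return a > b ? a : b;
}

// Mutant: > changed to >=. The branches only differ when a === b,
// where returning a or b yields values that compare equal under ===.
function largerOfMutant(a: number, b: number): number {
  return a >= b ? a : b;
}

// Compare both versions over a small grid of integer inputs.
function behavesIdentically(): boolean {
  for (let a = -3; a <= 3; a++) {
    for (let b = -3; b <= 3; b++) {
      if (largerOf(a, b) !== largerOfMutant(a, b)) return false;
    }
  }
  return true;
}

console.log(behavesIdentically()); // true
```

A mutant like this will always be reported as surviving, which is one reason a score below 100% does not necessarily indicate a missing test.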

Where mutation testing adds most value

- Business-critical calculation logic (pricing, billing, permissions)
- Security-critical conditions (authentication checks, authorisation gates)
- Complex branching logic with many edge cases
- Code that is hard to test manually but critical to get right

Cost management

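Mutation testing is expensive: in the worst case it runs the full test suite once per mutant, although tools such as Stryker can reduce this by running only the tests that cover each mutant. For large codebases: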

- Run mutation testing on specific critical paths in CI (not the entire codebase)
- Run full mutation analysis locally or as a nightly job
- Use incremental mutation testing so that only code and tests changed since the previous run are re-tested (Stryker supports this with its --incremental option)
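
A rough back-of-the-envelope estimate shows why scoping matters. The sketch below assumes the worst case of one full suite run per mutant, split evenly across parallel workers. The mutant count of 431 comes from the sample output above; the 30-second suite, the 4 workers, and the 40-mutant critical path are made-up figures for illustration.

```typescript
// Upper-bound estimate: every mutant triggers one full suite run,
// and runs are spread evenly across the available workers.
function estimateRunSeconds(mutantCount: number, suiteSeconds: number, workers: number): number {
  if (workers < 1) throw new Error("workers must be at least 1");
  return Math.ceil(mutantCount / workers) * suiteSeconds;
}

const fullRunSeconds = estimateRunSeconds(431, 30, 4);  // whole codebase
const scopedRunSeconds = estimateRunSeconds(40, 30, 4); // critical path only

console.log(fullRunSeconds / 60);   // 54 (minutes)
console.log(scopedRunSeconds / 60); // 5 (minutes)
```

Real runs are usually faster than this bound because tools skip tests that do not cover a mutant, but the cost still grows roughly linearly with the number of mutants, which is why limiting the mutated files pays off in CI.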

Tools

Platform	Tool
JavaScript / TypeScript	Stryker Mutator
Java / Kotlin	PIT (Pitest)
Python	mutmut, cosmic-ray
C#	Stryker.NET
Go	gremlins
