Unit Testing Best Practices: A Practical Guide

Contributor
Mar 6

A unit test exercises a single piece of code in isolation. The promise: fast feedback that catches regressions, locks in behavior, and documents intent. The reality is messier — unit tests can also become brittle ceremony that slows the team down. The difference between the two is craft.

What a Unit Test Is

A unit test:

Exercises one unit (a function, a small class, a small module)
Runs in milliseconds
Doesn't depend on filesystem, network, database, or external state
Is isolated from other tests
Tests behavior, not implementation

The phrase "in isolation" matters. A test that touches the database is an integration test, not a unit test. Both are useful; they're different things.
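
One way to keep a test on the unit side of that line is to separate the logic from the I/O that feeds it. The sketch below is illustrative only: parse_prices and load_prices are hypothetical names, not taken from any particular codebase.

```python
from pathlib import Path


def parse_prices(text):
    """Pure logic: turn lines of text into prices. No I/O happens here."""
    return [float(line) for line in text.splitlines() if line.strip()]


def load_prices(path: Path):
    """Thin I/O wrapper; an integration test would be the place to cover this."""
    return parse_prices(path.read_text())


def test_parse_prices_skips_blank_lines():
    # Unit test: the input is an in-memory string, so no filesystem is involved.
    assert parse_prices("1.50\n\n2.25\n") == [1.5, 2.25]
```

The unit test covers the parsing rules; only the one-line wrapper needs a slower test that touches a real file.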

What to Test

Deciding what to test matters more than deciding how to write the tests.

Test:

Business logic. Pricing rules, validation, transformations.
Edge cases. Empty inputs, boundary values, error conditions.
Bug fixes. Each fixed bug should leave behind a test that prevents regression.
Non-obvious behavior. Anything where the code reads one way but means another.
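
As a rough illustration of the business-logic and edge-case items above, here is a sketch built around a hypothetical line_total function. The function and its rule are assumptions made for the example.

```python
def line_total(unit_price, quantity):
    """Hypothetical business rule: price times quantity, rejecting negatives."""
    if quantity < 0:
        raise ValueError("quantity must not be negative")
    return unit_price * quantity


def test_line_total_multiplies_price_by_quantity():
    # Business logic: the ordinary case.
    assert line_total(2.50, 4) == 10.0


def test_line_total_is_zero_for_zero_quantity():
    # Boundary value: the smallest quantity that is still valid.
    assert line_total(9.99, 0) == 0


def test_raises_error_for_negative_quantity():
    # Error condition: the call should fail loudly, not return a negative total.
    try:
        line_total(9.99, -1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for negative quantity")
```

With pytest available, the try/except in the last test could be replaced by a pytest.raises(ValueError) block; the plain version is used here so the sketch runs with the standard library alone.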

Don't test:

Trivial code. Getters and setters with no logic.
Third-party libraries. Trust they're tested by their authors.
Framework boilerplate. Test what you wrote, not what the framework provides.
UI rendering details. UI tests at this level are brittle; use higher-level tests.

Chasing 100% coverage often produces tests for trivial code while the real gaps stay untested. Coverage is a diagnostic signal, not a guarantee.

Structure: Arrange-Act-Assert

The canonical pattern:

def test_discount_applied_for_premium_customer():
    # Arrange
    customer = PremiumCustomer()
    cart = Cart([Item(price=100)])
    
    # Act
    total = cart.total(customer)
    
    # Assert
    assert total == 90  # 10% premium discount

Three sections, in order:

Arrange: set up the test's preconditions
Act: invoke the behavior under test
Assert: verify the expected outcome

Tests that don't follow this structure tend to mix setup and assertion, becoming hard to read.
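
The example above leaves PremiumCustomer, Cart, and Item undefined. To run it, something like the following sketch would be needed. These class definitions are assumptions made for illustration, with the 10% figure taken from the comment in the original test.

```python
from dataclasses import dataclass


@dataclass
class Item:
    price: int


class Customer:
    discount_percent = 0


class PremiumCustomer(Customer):
    discount_percent = 10  # assumed from the "10% premium discount" comment


class Cart:
    def __init__(self, items):
        self.items = list(items)

    def total(self, customer):
        subtotal = sum(item.price for item in self.items)
        # Integer arithmetic keeps the comparison exact for whole-number prices.
        return subtotal * (100 - customer.discount_percent) // 100


def test_discount_applied_for_premium_customer():
    # Arrange
    customer = PremiumCustomer()
    cart = Cart([Item(price=100)])

    # Act
    total = cart.total(customer)

    # Assert
    assert total == 90
```

Keeping the three sections visually separate, even with only a blank line between them, makes it easy to spot a test that starts asserting before it has finished acting.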

Naming

Test names should describe the behavior being verified.

Bad: test_total, test_1, test_works

Good: test_discount_applied_for_premium_customer, test_returns_empty_list_for_no_results, test_raises_error_for_negative_quantity

A good test name lets the failure report tell you what broke before you open the test code.

One Assertion (Mostly)

The "one assertion per test" rule is widely cited and often misapplied.

The real rule: one concept per test. Multiple assertions verifying the same concept are fine; assertions covering different concepts should be different tests.

def test_user_created_with_correct_fields():
    user = User.create(name="Sam", email="sam@example.com")
    
    # All asserting on the same concept (user creation)
    assert user.name == "Sam"
    assert user.email == "sam@example.com"
    assert user.created_at is not None

Three assertions, one concept. Fine.

def test_user_creation_and_authentication():
    user = User.create(name="Sam", email="sam@example.com")
    assert user.id is not None
    
    token = authenticate(user, password="...")
    assert token.is_valid

Two concepts mixed. Split into two tests.
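
A possible split is sketched below. User, Token, and authenticate here are simplified stand-ins invented for the example (a plain constructor replaces User.create, and the password value is made up), so treat this as a shape to follow, not as the real API.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)


@dataclass
class User:
    name: str
    email: str
    password: str
    id: int = field(default_factory=lambda: next(_ids))


@dataclass
class Token:
    is_valid: bool


def authenticate(user, password):
    # Stand-in logic only; real authentication would never compare plain text.
    return Token(is_valid=(password == user.password))


def make_user():
    # Shared arrange step so each test stays focused on its own concept.
    return User(name="Sam", email="sam@example.com", password="s3cret")


def test_user_creation_assigns_id():
    user = make_user()
    assert user.id is not None


def test_authentication_returns_valid_token_for_correct_password():
    user = make_user()
    token = authenticate(user, password="s3cret")
    assert token.is_valid
```

When one of these fails, the test name alone says whether creation or authentication broke.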

Isolation Between Tests

Tests should not depend on each other's order. If test_a creates state that test_b depends on, the suite breaks when tests run in parallel or in a different order.

Practices:

Each test creates the state it needs
Test setup uses fresh fixtures (or fixtures reset between tests)
No shared mutable state between tests
No "test order matters" patterns

The test runner should be able to run tests in any order or in parallel without changing the outcome.
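
A minimal sketch of the fresh-fixture practice, assuming a hypothetical Inventory class as the unit under test:

```python
class Inventory:
    """Hypothetical unit under test: tracks stock counts by SKU."""

    def __init__(self):
        self._stock = {}

    def add(self, sku, quantity):
        self._stock[sku] = self._stock.get(sku, 0) + quantity

    def count(self, sku):
        return self._stock.get(sku, 0)


def make_inventory():
    # Fresh fixture: every test gets its own instance, so nothing leaks.
    return Inventory()


def test_add_increases_count():
    inventory = make_inventory()
    inventory.add("widget", 3)
    assert inventory.count("widget") == 3


def test_count_is_zero_for_unknown_sku():
    # Passes regardless of whether test_add_increases_count ran first.
    inventory = make_inventory()
    assert inventory.count("widget") == 0
```

Had both tests shared one module-level Inventory, the second test would pass or fail depending on execution order.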

Test Doubles: Mocks, Stubs, Fakes

When the unit under test depends on something else, you have options:

Real dependencies: if they're fast and isolated, use them
Stub: a stand-in returning canned responses
Mock: a stand-in that records interactions; assertions verify it was called correctly
Fake: a working alternative implementation (e.g., in-memory database)

Default to the simplest: real dependencies if possible, then fakes, then stubs, then mocks. Mocks have the highest maintenance cost because they couple tests to implementation.

The classic mocking trap: tests that pass even when the code is broken, because the mocks return whatever the test told them to rather than what the real dependency would. If you find yourself mocking the unit you are testing, step back.
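
To make the distinction between stub, fake, and mock concrete, here is a small sketch. WelcomeService and its collaborators are hypothetical; only unittest.mock.Mock comes from the standard library.

```python
from unittest.mock import Mock


class WelcomeService:
    """Hypothetical unit under test: looks up a user and sends a greeting."""

    def __init__(self, users, mailer):
        self.users = users
        self.mailer = mailer

    def welcome(self, user_id):
        email = self.users.email_for(user_id)
        self.mailer.send(email, "Welcome!")
        return email


class StubUsers:
    """Stub: always returns the same canned answer."""

    def email_for(self, user_id):
        return "sam@example.com"


class FakeUsers:
    """Fake: a working in-memory implementation of the same interface."""

    def __init__(self):
        self._emails = {}

    def add(self, user_id, email):
        self._emails[user_id] = email

    def email_for(self, user_id):
        return self._emails[user_id]


def test_welcome_returns_email_using_stub():
    service = WelcomeService(StubUsers(), mailer=Mock())
    assert service.welcome(1) == "sam@example.com"


def test_welcome_looks_up_the_right_user_using_fake():
    users = FakeUsers()
    users.add(1, "sam@example.com")
    users.add(2, "alex@example.com")
    service = WelcomeService(users, mailer=Mock())
    assert service.welcome(2) == "alex@example.com"


def test_welcome_sends_mail_using_mock():
    mailer = Mock()
    service = WelcomeService(StubUsers(), mailer)
    service.welcome(1)
    # Mock: the assertion is about the interaction itself.
    mailer.send.assert_called_once_with("sam@example.com", "Welcome!")
```

Only the last test breaks if welcome is refactored to call the mailer differently, which is the coupling cost described above.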

Speed Matters

A test suite that takes 30 minutes is a suite that runs once a day. A suite that takes 30 seconds is a suite that runs after every change. The difference is enormous for development velocity.

Targets:

Unit suite: under 60 seconds for any change
Individual unit test: under 100ms

Slow unit tests usually mean:

Touching real I/O (should be a test double)
Setting up too much state
Exercising logic indirectly through several layers (test the unit directly instead)

Treat slow unit tests as a defect to be fixed, not a fact of life.
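
Waiting on a real clock is a common source of slowness. One way around it, sketched here with a hypothetical retry helper, is to make the sleep function injectable so the test can substitute a recorder.

```python
import time


def retry(operation, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call operation until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(delay_seconds)


def test_retry_returns_result_after_transient_failures():
    calls = []
    delays = []

    def unreliable_operation():
        calls.append(1)
        if len(calls) < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    # Recording the delays instead of sleeping keeps the test in milliseconds.
    result = retry(unreliable_operation, sleep=delays.append)

    assert result == "ok"
    assert delays == [1.0, 1.0]
```

With the default time.sleep the same test would take about two seconds; with the recorder it finishes almost instantly and still verifies the backoff behavior.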

Flakiness

Flaky tests, meaning tests that sometimes pass and sometimes fail without code changes, can be worse than no tests. They train the team to ignore failures.

Common causes:

Time-dependent code without time mocking
Async code without proper synchronization
Shared state between tests
Race conditions
External dependencies

Quarantine flaky tests immediately. Fix them as a priority. Don't let them sit in the suite teaching the team that red is normal.
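
For the first cause in the list, time-dependent code, a common remedy is to pass the current time in instead of reading it inside the function. A minimal sketch, assuming a hypothetical is_expired helper:

```python
from datetime import datetime, timedelta


def is_expired(expires_at, now=None):
    """Hypothetical helper: True once the expiry time has passed."""
    if now is None:
        now = datetime.now()
    return now >= expires_at


def test_is_expired_at_exact_expiry_time():
    expires_at = datetime(2024, 1, 1, 12, 0, 0)
    # A fixed "now" makes the result identical on every run.
    assert is_expired(expires_at, now=expires_at)


def test_is_not_expired_one_second_before_expiry():
    expires_at = datetime(2024, 1, 1, 12, 0, 0)
    assert not is_expired(expires_at, now=expires_at - timedelta(seconds=1))
```

Production callers can omit the now argument and get the real clock; tests always supply it, so the outcome never depends on when the suite runs.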

Coverage as a Tool, Not a Target

Code coverage tells you what wasn't exercised. Useful as a diagnostic; misleading as a target.

100% line coverage doesn't mean 100% behavior coverage. You can have full coverage with no useful assertions.

Use coverage to find untested code, not to declare testing done. Aim for "the important code is well-tested," not "the coverage number is high."
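
The gap between line coverage and behavior coverage is easy to demonstrate. In this sketch, shipping_cost is a hypothetical function and both tests execute every line of it.

```python
def shipping_cost(weight_kg):
    """Hypothetical rule: flat 5 up to 10 kg, then 5 plus 1 per extra kg."""
    if weight_kg <= 10:
        return 5
    return 5 + (weight_kg - 10)


def test_shipping_cost_runs():
    # Executes every line, so line coverage reports 100%,
    # yet nothing is checked: any return value would pass.
    shipping_cost(5)
    shipping_cost(20)


def test_shipping_cost_adds_one_per_kg_over_ten():
    # Same lines covered, but now a wrong formula fails the test.
    assert shipping_cost(10) == 5
    assert shipping_cost(20) == 15
```

A coverage report cannot tell these two tests apart; only reading the assertions can.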

Tests as Documentation

Well-written tests describe how the system is supposed to behave. They're often more accurate than the docs.

This works when test names and structure communicate intent. It doesn't work when tests are written reluctantly to satisfy coverage targets.

For tests-as-documentation to work:

Names describe behavior, not mechanics
Tests are organized by feature, not by structural location
Tests are read as often as they're written

When TDD Helps

Test-driven development — writing the test first, then the code — works well for:

New behavior in a defined area
Refactoring with intent to preserve behavior
Bug fixes (reproduce the bug as a test, then fix)

It works less well for:

Exploratory work where the shape isn't known
UI work where the test is harder than the code
Generated or boilerplate code

Don't be religious about TDD. It's a tool. Use it where it helps.
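
The bug-fix case is the easiest place to start with test-first work. The sketch below assumes a hypothetical slugify function whose earlier version split on a single space and so produced doubled hyphens.

```python
def slugify(title):
    """Hypothetical function, shown after the fix."""
    # The buggy version used title.split(" "), which produced empty
    # pieces and doubled hyphens when words were separated by two spaces.
    return "-".join(title.lower().split())


def test_slugify_collapses_repeated_spaces():
    # Written first to reproduce the reported bug; it failed before the fix.
    assert slugify("Unit  Testing") == "unit-testing"


def test_slugify_lowercases_and_joins_words():
    # Existing behavior that the fix must preserve.
    assert slugify("Best Practices") == "best-practices"
```

Seeing the new test fail before the fix is what confirms it actually reproduces the bug.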

What Bad Unit Tests Look Like

Testing what the framework does ("does the ORM return a row when I save?")
Mocking the system under test
Brittle assertions tied to implementation details
Tests that pass without running the production code
Tests whose helpers are complex enough to need tests of their own
Test files many times larger than the code they test, padded with duplicated setup
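
Two of these, mocking the system under test and passing without running production code, often appear together. A short sketch, with apply_discount as a hypothetical production function:

```python
from unittest.mock import Mock


def apply_discount(price, percent):
    """Hypothetical production code: integer percentage discount."""
    return price - price * percent // 100


def test_apply_discount_with_sut_mocked():
    # Anti-pattern: the function under test is replaced by a mock, so the
    # assertion only checks the canned value the test itself supplied.
    # apply_discount is never called and could return anything.
    apply_discount_mock = Mock(return_value=90)
    assert apply_discount_mock(100, 10) == 90


def test_apply_discount_reduces_price_by_percentage():
    # Calls the production code, so a broken formula makes this test fail.
    assert apply_discount(100, 10) == 90
```

Both tests are green today, but only the second one would turn red if apply_discount were broken.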

Key Takeaway

A good unit test is fast, isolated, focused on behavior, and named to describe what it verifies. Test business logic, edge cases, and bug fixes; skip trivial code and third-party libraries. Use Arrange-Act-Assert. Default to real dependencies, then fakes, then stubs, then mocks. Treat slow and flaky tests as bugs. Use coverage to find gaps, not to claim victory. Tests should be readable as documentation of how the system behaves.

ShiftQuality