Tutorial 10: Add Tests to Legacy Code

Contributor
May 2

You inherited code with no tests. The "obvious" answer is to rewrite. The realistic answer is to add tests around the existing code so you can change it safely. This tutorial walks through it.

What You'll Build

Characterization tests for a piece of legacy code — tests that document what the code currently does, not what it should do.

Step 1: Pick the Right Code (5 min)

Start with code that:

You'll need to modify soon (otherwise: not worth the investment)
Has clear inputs and outputs
Is small enough to work with (a function, not a 5000-line file)

Avoid: code that's about to be deleted, code with no clear interface, code that requires major refactoring before any test is possible.

Step 2: Understand the Boundary (10 min)

For the function/class you're testing:

What inputs does it take?
What outputs does it produce?
What side effects does it have (DB writes, API calls, etc.)?

Don't read every line. Identify the surface.
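As a rough illustration of identifying the surface without reading the body, Python's standard inspect module can list a function's parameters and defaults. The legacy_create_user stand-in below, its notify parameter, and the describe_surface helper are all hypothetical. Outputs and side effects still have to come from skimming the code or watching it run.

```python
import inspect


def legacy_create_user(db_session, name, email, notify=True):
    """Stand-in for a real legacy function (hypothetical signature)."""
    raise NotImplementedError


def describe_surface(func):
    """Summarize the inputs a function accepts, without reading its body."""
    sig = inspect.signature(func)
    return {
        "name": func.__name__,
        "parameters": list(sig.parameters),
        "defaults": {
            name: param.default
            for name, param in sig.parameters.items()
            if param.default is not inspect.Parameter.empty
        },
    }
```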

Step 3: Write Characterization Tests (30 min)

Characterization tests document what the code currently does, regardless of whether that's "right." The point is to lock in current behavior so changes are visible.

def test_legacy_calculate_returns_x_for_input_y():
    # What does the current code do?
    result = legacy_calculate(100, "premium")
    assert result == 10  # Just whatever the code returns

Run the test. If you don't know the expected value, set the assertion to something obviously wrong, run the test, and read the actual value from the failure:

def test_legacy_calculate_returns_x_for_input_y():
    result = legacy_calculate(100, "premium")
    assert result == "TBD"  # placeholder; will fail with actual value

The failure message shows the actual value: assert 10 == 'TBD'. Update the test:

def test_legacy_calculate_returns_x_for_input_y():
    result = legacy_calculate(100, "premium")
    assert result == 10

Repeat for various inputs. Build up a corpus.
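One possible way to build that corpus without hand-copying each value is to record the current outputs once and compare against the recording later. This is only a sketch: legacy_calculate here is a stand-in whose rates are guesses consistent with the values shown above, and CASES, record, and check are hypothetical names, not part of any library.

```python
import json
from pathlib import Path


def legacy_calculate(price, tier):
    # Stand-in for the real legacy function (assumed behavior).
    if tier == "premium":
        return price * 0.10
    if tier == "gold":
        return price * 0.20
    return 0


# Inputs worth pinning down; extend this list as you learn the code.
CASES = [(100, "premium"), (100, "gold"), (100, "standard"), (0, "premium")]


def record(path):
    """Run every case once and save what the code returns today."""
    rows = [
        {"args": list(args), "result": legacy_calculate(*args)}
        for args in CASES
    ]
    Path(path).write_text(json.dumps(rows, indent=2))


def check(path):
    """Return the recorded rows whose result no longer matches."""
    rows = json.loads(Path(path).read_text())
    return [
        row for row in rows
        if legacy_calculate(*row["args"]) != row["result"]
    ]
```

Call record("golden.json") once, commit the file, and have a test assert that check("golden.json") returns an empty list.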

Step 4: Cover the Branches (20 min)

Read the code (or run with prints) to identify branches. Write a test for each:

def test_legacy_calculate_branch_a():
    # Triggers the first branch
    assert legacy_calculate(100, "premium") == 10

def test_legacy_calculate_branch_b():
    # Triggers the second branch
    assert legacy_calculate(100, "gold") == 20

def test_legacy_calculate_else_branch():
    # Falls through
    assert legacy_calculate(100, "standard") == 0

Coverage tools help here. Run tests with coverage (the --cov options come from the pytest-cov plugin):

pytest --cov=yourapp --cov-report=html

Open the report (htmlcov/index.html by default). Uncovered lines point to branches you haven't exercised. Add tests until every branch of this function is covered.
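Line coverage can look complete while a branch is still untested. In the hypothetical apply_discount function below, a single "premium" test executes every line, yet the path where the if is skipped never runs. pytest-cov's --cov-branch option reports that kind of gap; the second test here is what closes it.

```python
def apply_discount(price, tier):
    # Hypothetical function: an if with no else.
    discount = 0
    if tier == "premium":
        discount = price * 0.10
    return price - discount


def test_apply_discount_premium():
    # Executes every line of apply_discount on its own.
    assert apply_discount(100, "premium") == 90


def test_apply_discount_other_tier():
    # Exercises the path where the if body is skipped.
    assert apply_discount(100, "standard") == 100
```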

Step 5: Handle Side Effects (15 min)

If the code writes to a database or makes API calls, your test needs to handle that.

Option A: Use the integration test pattern from Tutorial 3:

def test_legacy_create_user_writes_to_db(db_session):
    legacy_create_user(db_session, "Sam", "sam@example.com")
    
    user = db_session.query(User).filter_by(email="sam@example.com").first()
    assert user is not None

Option B: Refactor minimally to make testing possible:

# Before
def legacy_function():
    db = open_db_connection()
    # ... does stuff

# After (small change)
def legacy_function(db=None):
    if db is None:
        db = open_db_connection()
    # ... does stuff

Now the test can pass a test database. Minimal change to the legacy code.
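A test against the refactored signature might look like the sketch below. FakeDb, its insert method, and the audit_log write are invented for illustration; the real function's database calls will differ, and a real test database works just as well as a fake.

```python
class FakeDb:
    """Hypothetical in-memory stand-in that records writes."""

    def __init__(self):
        self.writes = []

    def insert(self, table, row):
        self.writes.append((table, row))


def open_db_connection():
    # In this sketch the real connection must never be opened by a test.
    raise RuntimeError("tests must not open the real database")


def legacy_function(db=None):
    if db is None:
        db = open_db_connection()
    # Stand-in for "does stuff": one assumed write.
    db.insert("audit_log", {"event": "ran"})


def test_legacy_function_writes_audit_row():
    db = FakeDb()
    legacy_function(db)
    assert db.writes == [("audit_log", {"event": "ran"})]
```

Because db defaults to None, existing callers keep working unchanged while the test supplies its own connection.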

Step 6: Make the Tests Stable (10 min)

Legacy code often has hidden dependencies. Check:

Does it depend on the current time? Mock the clock.
Does it depend on random numbers? Seed them.
Does it depend on environment variables? Set them in tests.
Does it depend on the iteration order of a set (or of a dict on Python versions before 3.7)? Sort before asserting.

Each instability is a bug waiting to surface. Pin the dependency in the test, or document the instability and accept it.
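The first three checks can be sketched with the standard library alone. legacy_token is a made-up function with a hidden clock, random, and environment dependency, and the REGION variable is an assumption. The test asserts only that pinned runs agree, not a specific random value.

```python
import os
import random
import time
from unittest import mock


def legacy_token():
    # Hypothetical legacy function with three hidden inputs.
    stamp = int(time.time())
    salt = random.randint(0, 999)
    region = os.environ.get("REGION", "unknown")
    return f"{region}-{stamp}-{salt}"


def pinned_token():
    """Call legacy_token with clock, randomness, and environment pinned."""
    with mock.patch("time.time", return_value=1_700_000_000):
        with mock.patch.dict(os.environ, {"REGION": "eu"}):
            random.seed(42)
            return legacy_token()


def test_legacy_token_is_stable():
    first = pinned_token()
    second = pinned_token()
    assert first == second
    assert first.startswith("eu-1700000000-")
```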

Step 7: Verify Tests Catch a Regression (5 min)

Deliberately introduce a small bug:

def legacy_calculate(price, tier):
    if tier == "premium":
        return price * 0.15  # was 0.10
    # ...

Run tests. They should fail (because behavior changed). Revert the deliberate bug.

This confirms the tests are actually monitoring behavior. If they still pass, they aren't exercising the line you changed; add a test that does.
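The same check can be sketched in code by running the characterized cases against both the original and a deliberately broken copy. Both functions below are stand-ins with assumed rates, and CASES and failures are hypothetical helpers.

```python
def legacy_calculate(price, tier):
    # Stand-in for the current behavior (assumed rates).
    if tier == "premium":
        return price * 0.10
    if tier == "gold":
        return price * 0.20
    return 0


def mutated_calculate(price, tier):
    # Same function with the deliberate bug from this step.
    if tier == "premium":
        return price * 0.15  # was 0.10
    if tier == "gold":
        return price * 0.20
    return 0


# (arguments, value the code returned when it was characterized)
CASES = [
    ((100, "premium"), 10),
    ((100, "gold"), 20),
    ((100, "standard"), 0),
]


def failures(func):
    """Return the argument tuples whose result differs from the record."""
    return [args for args, expected in CASES if func(*args) != expected]
```

An empty list for the original and a non-empty list for the mutated copy is the signal you want.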

Step 8: Now Refactor Safely (varies)

With tests in place, you can change the code. The tests are your safety net.

Common refactors:

Rename functions, parameters, variables
Extract methods
Replace nested ifs with table lookup
Add type hints

After each change: run tests. Still passing? Continue. Failing? You changed behavior; revert, or update the tests only if the change was deliberate.
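As an illustration of the table-lookup refactor, here is a sketch using a stand-in legacy_calculate with assumed rates. RATES and calculate are hypothetical names. Note that the fallback still returns the integer 0 rather than price * 0, so the result matches the original in type as well as value.

```python
def legacy_calculate(price, tier):
    # Stand-in for the original nested-if version (assumed rates).
    if tier == "premium":
        return price * 0.10
    else:
        if tier == "gold":
            return price * 0.20
        else:
            return 0


# Refactored: the branches become data.
RATES = {"premium": 0.10, "gold": 0.20}


def calculate(price, tier):
    rate = RATES.get(tier)
    if rate is None:
        return 0  # same fallback value and type as the original
    return price * rate


def test_refactor_preserves_behavior():
    for tier in ("premium", "gold", "standard", ""):
        for price in (0, 1, 100, 250):
            old = legacy_calculate(price, tier)
            new = calculate(price, tier)
            assert new == old
            assert type(new) is type(old)
```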

Step 9: Distinguish "Document" From "Fix" (ongoing)

Characterization tests document current behavior. Sometimes that behavior is wrong.

If you find a bug:

Decide: fix now or later?
If now: update both the code and the test
If later: leave a comment in the test (# TODO: fix bug — should be X)

Don't silently "fix" bugs while characterizing — the test then doesn't match production. Make the decision deliberate.
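A "fix later" test might look like the sketch below. The negative-price case is an invented example of questionable behavior, and legacy_calculate is again a stand-in with assumed rates.

```python
def legacy_calculate(price, tier):
    # Stand-in for the real legacy function (assumed behavior).
    if tier == "premium":
        return price * 0.10
    if tier == "gold":
        return price * 0.20
    return 0


def test_legacy_calculate_accepts_negative_price():
    # TODO: fix bug: a negative price should probably be rejected.
    # Documented as-is so the test keeps matching production.
    assert legacy_calculate(-100, "premium") == -10
```

When the fix eventually lands, this assertion and the code change together in one deliberate commit.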

Step 10: Build Up Coverage Over Time (ongoing)

You don't have to test all the legacy code at once. Each time you change a function:

Add tests around it first
Make the change
Tests catch any regression

Over months, the legacy code accrues tests. The most-changed parts get the best coverage. That's the right distribution.

What You Just Did

You brought legacy code under test without rewriting it. The tests document current behavior, enable safe modification, and accumulate over time.

Common Failure Modes

Trying to test everything at once. Months of effort with no value. Test what you'll change.

Refactoring before testing. Now you don't know if your refactor preserved behavior.

Testing what the code "should do." That's a specification, not a characterization. Characterization tests document what it actually does.

Skipping side-effect tests. Function changes the DB; tests don't verify. Bugs hide.

Not running coverage. You think you've covered the branches; coverage report shows you haven't.

You're Done

This completes the beginner testing path. You can write unit, integration, E2E, and API tests, debug failures, and bring legacy code under test.

The next path — Test Automation in Practice — extends these into a maintainable suite at scale.

ShiftQuality