Mocking, Stubbing, and Faking: When to Use Each

Contributor
Mar 12

The term "mock" gets used for any test double, but mocks, stubs, and fakes are different things. Conflating them produces tests that are unnecessarily brittle or that miss the bugs they're meant to catch. Knowing the difference helps you pick the right tool.

The Vocabulary

Stub: a stand-in that returns canned responses. Doesn't verify how it was called.

Mock: a stand-in that records how it was called. Assertions verify the interactions.

Fake: a working alternative implementation, usually simpler. An in-memory database is a fake of a real database.

Spy: records calls but otherwise delegates to the real implementation.

Dummy: a stand-in that's never actually used, just needed to satisfy a signature.
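
Spies and dummies get no example of their own later on, so here is a minimal sketch of both using Python's unittest.mock. RealTaxCalculator, Invoice, and the 20% rate are hypothetical names and values invented for illustration; only Mock(wraps=...) is standard library behavior.

```python
from unittest.mock import Mock


class RealTaxCalculator:
    """Hypothetical collaborator with real logic."""

    def tax_for(self, amount):
        return round(amount * 0.2, 2)


class Invoice:
    """Hypothetical system under test."""

    def __init__(self, calculator, logger):
        self.calculator = calculator
        self.logger = logger  # stored, but unused on the code path below

    def total(self, amount):
        return amount + self.calculator.tax_for(amount)


# Spy: wraps the real object, so calls are recorded AND delegated.
spy_calculator = Mock(wraps=RealTaxCalculator())

# Dummy: only fills the constructor argument; total() never touches it.
dummy_logger = object()

invoice = Invoice(spy_calculator, dummy_logger)
result = invoice.total(100)

# The real tax logic ran, and the spy kept a record of the call.
assert result == 120.0
spy_calculator.tax_for.assert_called_once_with(100)
```

The spy only earns its place if the recorded call is something the test needs to check; otherwise the real object alone would do.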

Most teams call all of these "mocks," and that conflation causes specific, avoidable problems.

Stubs in Practice

A stub returns predetermined values:

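class StubUserRepository:
    def find(self, user_id):
        # Canned response: every lookup succeeds with the same name.
        # User is the application's own model class.
        return User(id=user_id, name="Test User")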

The test that uses this stub:

def test_greet_user():
    repo = StubUserRepository()
    service = GreetingService(repo)
    
    greeting = service.greet(user_id=123)
    
    assert greeting == "Hello, Test User"

The test verifies the outcome. It doesn't verify how the service used the repository. The stub just provides the data the service needs.

Stubs are simple and low-maintenance. Use them when you need a dependency to return something but don't care how the system under test uses it.

Mocks in Practice

A mock records calls and lets you assert on them:

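from unittest.mock import Mock

def test_payment_charges_card():
    mock_payment_gateway = Mock()
    service = OrderService(payment_gateway=mock_payment_gateway)
    
    # order (totalling 100) and card come from the test's setup
    service.process_order(order, card)
    
    # assert_called_once_with raises AssertionError on a mismatch;
    # a bare `assert mock.charge.called_with(...)` would not check anything
    mock_payment_gateway.charge.assert_called_once_with(
        amount=100, card=card
    )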

The test verifies that the payment gateway was called and how. It doesn't verify the outcome of the call directly.

Mocks are useful when the interaction itself is the contract. "Did we call the payment gateway with the right amount?" is what matters.

Fakes in Practice

A fake is a working alternative implementation:

class FakeEmailService:
    def __init__(self):
        self.sent = []
    
    def send(self, to, subject, body):
        self.sent.append({"to": to, "subject": subject, "body": body})

The test uses the fake just like the real thing, then queries the fake's state:

def test_signup_sends_welcome():
    email_service = FakeEmailService()
    service = SignupService(email=email_service)
    
    service.signup(name="Sam", email="sam@example.com")
    
    assert len(email_service.sent) == 1
    assert email_service.sent[0]["to"] == "sam@example.com"

Fakes are useful when you want realistic behavior without the cost of the real thing.
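
The vocabulary section mentions an in-memory database as the typical fake. A rough sketch of that idea follows, assuming a repository with just save and find; InMemoryUserRepository and AccountService are hypothetical names, and a real repository would likely have a wider interface.

```python
class InMemoryUserRepository:
    """Hypothetical fake: same interface as a real repository, backed by a dict."""

    def __init__(self):
        self._users = {}

    def save(self, user):
        # Store a copy so later mutation of the caller's dict is not persisted.
        self._users[user["id"]] = dict(user)

    def find(self, user_id):
        user = self._users.get(user_id)
        return dict(user) if user is not None else None


class AccountService:
    """Hypothetical system under test."""

    def __init__(self, repo):
        self.repo = repo

    def rename(self, user_id, new_name):
        user = self.repo.find(user_id)
        if user is None:
            raise LookupError(f"no user {user_id}")
        user["name"] = new_name.strip()
        self.repo.save(user)


def test_rename_persists_trimmed_name():
    repo = InMemoryUserRepository()
    repo.save({"id": 1, "name": "Sam"})

    AccountService(repo).rename(1, "  Samantha ")

    assert repo.find(1)["name"] == "Samantha"


def test_rename_unknown_user_raises():
    service = AccountService(InMemoryUserRepository())
    try:
        service.rename(99, "Nobody")
    except LookupError:
        pass
    else:
        raise AssertionError("expected LookupError")
```

Because the fake really stores and returns data, the test can check the outcome through the same interface the service uses, with no knowledge of which methods were called.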

Picking the Right One

Default order of preference, from most to least preferred:

Real implementation if it's fast and isolated
Fake for things that have significant cost or external dependency
Stub when you need a specific return value but don't care about interactions
Mock when the interaction itself is what you're verifying
Spy when you need the real behavior plus a record of the calls

Mocks and spies come last because they couple the test to the implementation most tightly. Tests that lean heavily on mocks break when the implementation changes, even if the behavior stays the same.

The Mocking Anti-Pattern

The most common mocking failure: mocking what you're testing.

from unittest.mock import Mock

def test_calculate_discount():
    mock_calculator = Mock()
    mock_calculator.calculate.return_value = 10
    
    result = mock_calculator.calculate()
    
    assert result == 10  # Tests the mock, not the code

This test passes regardless of the real calculator's behavior. The mock returns what you told it to return; the assertion verifies what you told it to return.

If you find yourself mocking the unit under test, step back. The test isn't testing anything.
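
For contrast, here is one way the same test could look when only the collaborator is replaced and the calculator itself is real. DiscountCalculator, its loyalty_client dependency, and the 10% gold rate are hypothetical, made up to give the test something to exercise.

```python
from unittest.mock import Mock


class DiscountCalculator:
    """Hypothetical unit under test: real logic, one external collaborator."""

    def __init__(self, loyalty_client):
        self.loyalty_client = loyalty_client

    def calculate(self, customer_id, subtotal):
        tier = self.loyalty_client.tier_for(customer_id)
        rate = 0.10 if tier == "gold" else 0.0
        return round(subtotal * rate, 2)


def test_gold_customers_get_ten_percent():
    loyalty_client = Mock()  # double for the collaborator only
    loyalty_client.tier_for.return_value = "gold"
    calculator = DiscountCalculator(loyalty_client)  # the real unit under test

    assert calculator.calculate(customer_id=7, subtotal=100) == 10.0


def test_other_customers_get_no_discount():
    loyalty_client = Mock()
    loyalty_client.tier_for.return_value = "basic"
    calculator = DiscountCalculator(loyalty_client)

    assert calculator.calculate(customer_id=8, subtotal=100) == 0.0
```

If the discount rule changes or breaks, these tests fail, which is exactly what the mocked version could never do.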

When Mocks Lie

Mocks return what you've programmed them to return. They don't know what the real dependency would do.

If the real Stripe API sometimes declines a charge, a mocked Stripe that always returns success never exercises that path. Your tests pass with the mock; production fails with the real thing.

Counter this by:

Preferring fakes when fidelity matters
Using contract tests to verify your mocks match reality
Periodically running tests against real dependencies
Keeping mocks at boundaries (replace the thin wrapper around the SDK, not the application's core logic)
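
One possible shape for a contract test is a single set of assertions that every implementation of the dependency must pass, fake or real. The sketch below only runs it against a fake; FakePaymentGateway, CardDeclined, and the card identifiers are hypothetical placeholders, and running the same check against the real gateway would happen in a separate integration run with sandbox credentials.

```python
class CardDeclined(Exception):
    """Hypothetical error the real gateway adapter is assumed to raise."""


class FakePaymentGateway:
    """Hypothetical fake; declines one designated card identifier."""

    DECLINED_CARD = "card-declined"

    def charge(self, amount, card_id):
        if amount <= 0:
            raise ValueError("amount must be positive")
        if card_id == self.DECLINED_CARD:
            raise CardDeclined(card_id)
        return {"status": "succeeded", "amount": amount}


def check_gateway_contract(gateway, good_card, declined_card):
    """Assertions every gateway implementation, fake or real, should satisfy."""
    receipt = gateway.charge(amount=100, card_id=good_card)
    assert receipt["status"] == "succeeded"
    assert receipt["amount"] == 100

    try:
        gateway.charge(amount=100, card_id=declined_card)
    except CardDeclined:
        pass  # the decline path exists and is reachable
    else:
        raise AssertionError("declined card was charged")


def test_fake_gateway_honours_contract():
    check_gateway_contract(
        FakePaymentGateway(),
        good_card="card-ok",
        declined_card=FakePaymentGateway.DECLINED_CARD,
    )
```

When the real service changes and the shared check starts failing against it, that is the cue to update the fake as well.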

Boundaries: What to Replace

A working heuristic: replace what crosses a system boundary; use the real thing for code inside your boundary.

Replace: external APIs, the database (when slow), the filesystem, the clock, random sources, message queues
Don't replace: your own classes, internal services, pure logic

When you replace internal code with mocks, you couple tests to implementation. When you replace external dependencies with fakes/stubs/mocks, you test the boundary.
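
The clock is a small boundary that shows the heuristic well. A minimal sketch, assuming the code accepts its time source as a parameter; Session and its now argument are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


class Session:
    """Hypothetical class that takes its clock as a parameter."""

    def __init__(self, created_at, ttl, now=lambda: datetime.now(timezone.utc)):
        self.created_at = created_at
        self.ttl = ttl
        self._now = now  # production uses the default; tests pass a stub

    def is_expired(self):
        return self._now() - self.created_at >= self.ttl


START = datetime(2024, 1, 1, tzinfo=timezone.utc)


def test_session_expires_after_ttl():
    # Stub clock: fixed at 31 minutes after creation.
    clock = lambda: START + timedelta(minutes=31)
    session = Session(START, timedelta(minutes=30), now=clock)
    assert session.is_expired()


def test_session_is_valid_before_ttl():
    # Stub clock: fixed at 29 minutes after creation.
    clock = lambda: START + timedelta(minutes=29)
    session = Session(START, timedelta(minutes=30), now=clock)
    assert not session.is_expired()
```

Only the time source is replaced. The expiry logic, which is the application's own code, runs for real.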

The Real Database in Tests

A specific case worth highlighting: should you use the real database?

Yes, usually. Modern databases are fast enough for tests. Containerized databases start in seconds. Tests against the real DB catch SQL issues, schema issues, and transaction issues that mocks miss.
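
As a small illustration of a schema issue that a dict-based stand-in would miss, the sketch below runs against an actual SQL engine. It assumes SQLite's in-memory mode from the standard library purely for convenience; a containerized instance of the production database would give higher fidelity, since SQL dialects differ. The users table and the register function are hypothetical.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
    )


def register(conn, email):
    """Hypothetical code under test: inserts a user and returns the new id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    return cursor.lastrowid


def test_duplicate_email_is_rejected():
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    register(conn, "sam@example.com")

    try:
        register(conn, "sam@example.com")
    except sqlite3.IntegrityError:
        pass  # the UNIQUE constraint fired, as the schema promises
    else:
        raise AssertionError("duplicate email was accepted")

    count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    assert count == 1
    conn.close()
```

A stubbed repository would have happily accepted the second insert unless someone remembered to program the constraint into it.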

Use mocks/stubs for the database only when:

Specific tests require deterministic behavior that a real DB can't guarantee
Performance constraints make it impractical (rare)
You're testing code that shouldn't actually touch the DB

Verification Styles

Two ways to verify with mocks:

Strict (verify all interactions): the test fails if any unexpected call happens. Catches over-broad behavior.

Loose (verify specific interactions): the test verifies what you care about; ignores other calls. More forgiving but can miss issues.

Default to loose verification. Reach for strict only when you genuinely need to constrain all interactions.
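
With Python's unittest.mock, the two styles might look like the following. The checkout function and the gateway's authorize and charge methods are hypothetical; assert_called_once_with, mock_calls, and call are part of the standard library.

```python
from unittest.mock import Mock, call


def checkout(gateway):
    """Hypothetical code under test: two calls on one collaborator."""
    gateway.authorize(amount=100)
    gateway.charge(amount=100)


def test_loose_verification():
    gateway = Mock()
    checkout(gateway)
    # Loose: assert only on the call we care about; authorize() is ignored.
    gateway.charge.assert_called_once_with(amount=100)


def test_strict_verification():
    gateway = Mock()
    checkout(gateway)
    # Strict: the complete, ordered list of calls must match exactly.
    assert gateway.mock_calls == [
        call.authorize(amount=100),
        call.charge(amount=100),
    ]
```

The strict test fails if checkout later adds, removes, or reorders any call on the gateway; the loose test fails only if the charge itself changes.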

Test Maintenance

Tests with heavy mocking are the most expensive to maintain because they break whenever the implementation changes.

Signals of mock-heaviness:

Tests fail when you refactor without changing behavior
Test setup is longer than the assertions
Tests have detailed knowledge of internal method names

When you see this, consider:

Moving the test to a higher level (integration)
Restructuring the code to need less mocking
Accepting the maintenance cost if the test is high-value
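
One common way to restructure code so it needs less mocking is to pull the decision logic into a pure function and leave a thin shell around the collaborator. A sketch under that assumption; build_welcome_email and send_welcome are hypothetical, and the send signature mirrors the FakeEmailService shown earlier.

```python
def build_welcome_email(name, plan):
    """Pure decision logic: no I/O, so tests need no doubles at all."""
    subject = f"Welcome, {name}!"
    if plan == "trial":
        body = "Your trial has started."
    else:
        body = f"Thanks for choosing {plan}."
    return {"subject": subject, "body": body}


def send_welcome(email_service, address, name, plan):
    """Thin shell: the only place that touches the collaborator."""
    message = build_welcome_email(name, plan)
    email_service.send(address, message["subject"], message["body"])


def test_trial_users_get_trial_copy():
    message = build_welcome_email("Sam", "trial")
    assert message == {
        "subject": "Welcome, Sam!",
        "body": "Your trial has started.",
    }


def test_paid_users_get_plan_name():
    message = build_welcome_email("Sam", "Pro")
    assert message["body"] == "Thanks for choosing Pro."
```

Most of the test suite can then target the pure function, leaving one or two tests with a fake for the shell.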

Frameworks

Mocking frameworks abstract the creation of test doubles. Popular ones:

Python: unittest.mock, pytest-mock
JavaScript: Jest, Sinon
Java: Mockito
Go: gomock (but Go's interface-friendliness often makes manual fakes easier)
.NET: Moq, NSubstitute

Framework choice matters less than how you use it. A mock built with Mockito is no better than a hand-rolled stub if a mock was the wrong kind of double for the test.

Verifying Behavior vs. Verifying Calls

A core distinction:

Behavior verification: what does the system produce?
Interaction verification: how does the system communicate with collaborators?

Behavior verification (stubs) is more durable. Interaction verification (mocks) is more brittle.

Use interaction verification when the interaction is the contract: calling the payment gateway, sending the webhook, publishing the event. Use behavior verification for everything else.

Anti-Patterns

Mocking everything by reflex. Tests pass with mocks; production fails. Mock at boundaries only.

Mocking what you don't own. Wrap third-party libraries in interfaces you control; mock those.

Asserting on internal calls. Tests fail on legitimate refactoring.

Mocks that drift from reality. Real Stripe changed; your mock didn't. Use contract testing.

Verbose mock setup. Mocks should be quick to read. Long setup signals over-mocking.
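
The advice on not mocking what you don't own can be sketched as a thin adapter that the application controls. Everything here is hypothetical, including the create_charge call on the vendor client, which stands in for whatever the real SDK exposes.

```python
from unittest.mock import Mock


class PaymentGateway:
    """Interface the application owns; wraps a vendor SDK client."""

    def __init__(self, sdk_client):
        self._sdk = sdk_client

    def charge(self, amount_cents, token):
        # Assumed SDK call shape; the real vendor API will differ.
        response = self._sdk.create_charge(amount=amount_cents, source=token)
        return response["status"] == "succeeded"


class OrderService:
    """Hypothetical application code that depends only on PaymentGateway."""

    def __init__(self, gateway):
        self.gateway = gateway

    def pay(self, amount_cents, token):
        if not self.gateway.charge(amount_cents, token):
            raise RuntimeError("payment failed")
        return "paid"


def test_order_is_paid_when_gateway_succeeds():
    # Double of OUR interface, not of the vendor SDK.
    gateway = Mock(spec=PaymentGateway)
    gateway.charge.return_value = True

    assert OrderService(gateway).pay(500, "tok_test") == "paid"
    gateway.charge.assert_called_once_with(500, "tok_test")
```

If the vendor changes its SDK, only PaymentGateway and its own tests need to change; the order tests keep working against the interface the team controls.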

Key Takeaway

Stubs return canned values; mocks record interactions; fakes are working alternatives. Default order: real implementation, fake, stub, mock, preferring less-coupled doubles. Mock at system boundaries, not inside your own code. Avoid mocking what you're testing. Watch for mocks that lie because they don't track reality. Tests with heavy mocking are expensive to maintain, and they're often a signal to test at a higher level or restructure the code.

ShiftQuality