Behavior-Driven Development (BDD) Without Cargo Cult
- Contributor
- Mar 26
Behavior-Driven Development is one of the most misunderstood approaches in testing. Many teams adopt BDD by installing Cucumber, writing Gherkin features, and calling it a day. They've adopted the syntax without the practice. The result is verbose tests that take longer to write and don't deliver the value BDD was meant to provide.
This guide covers what BDD actually is, what it's for, and when it's worth adopting.
What BDD Actually Is
BDD is a practice for using examples to drive understanding between business stakeholders and developers. The core idea: instead of arguing about abstract requirements, you build concrete examples of behavior together. The examples become the specification, and (in tool-supported BDD) they become executable tests.
The tools (Cucumber, SpecFlow, etc.) come after the practice. The practice is the collaboration around examples.
The Three Amigos
BDD often happens in "three amigos" sessions: a business representative (PM), a developer, and a tester (or QA representative). They work through examples together.
The session:
Pick a feature or user story
Identify the rules that govern it
For each rule, build concrete examples (happy path, edge case, negative)
Capture the examples in a format that can become tests
The conversation is the point. The examples are the artifact. The tests are downstream.
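One lightweight way to capture what a session produces is as plain data: each rule with the examples agreed for it. The sketch below is only an illustration. The `Rule` and `Example` classes, the `render` helper, and the workspace rule itself are hypothetical names made up for this example, not part of any BDD tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Example:
    """One concrete example agreed in the session."""
    kind: str   # "happy path", "edge case", or "negative"
    given: str
    when: str
    then: str


@dataclass
class Rule:
    """A business rule plus the examples that illustrate it."""
    statement: str
    examples: List[Example] = field(default_factory=list)


def render(rule: Rule) -> str:
    """Format a rule and its examples as text anyone in the session can review."""
    lines = [f"Rule: {rule.statement}"]
    for ex in rule.examples:
        lines.append(f"  [{ex.kind}] Given {ex.given}, when {ex.when}, then {ex.then}")
    return "\n".join(lines)


# Hypothetical rule and examples for a workspace-switching story.
switching = Rule(
    statement="A user can only switch to a workspace they have access to",
    examples=[
        Example("happy path", "access to Marketing and Engineering",
                "they switch to Engineering", "they are in Engineering"),
        Example("negative", "access to Marketing only",
                "they switch to Engineering", "the switch is refused"),
    ],
)

print(render(switching))
```

A shared doc works just as well. The point of the structure is that every rule carries its examples, so a rule with no examples, or an example with no rule, is easy to spot.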
Gherkin Syntax
The format most associated with BDD is Gherkin: Given-When-Then.
Feature: Workspace Switching
  Scenario: User switches workspaces
    Given the user is signed in
    And they have access to workspaces "Marketing" and "Engineering"
    And they are currently in "Marketing"
    When they switch to "Engineering"
    Then they should be in "Engineering"
    And the URL should reflect the new workspace
Gherkin is meant to be readable by non-developers. The hidden value: forcing the team to express behavior in language all participants understand.
Why Most BDD Adoption Fails
Common failure pattern:
Team adopts Cucumber because BDD is a thing
Developers write Gherkin features for the same things they'd test anyway
PM and QA don't actually engage with the features
Tests become harder to maintain because they're in two languages (Gherkin + glue code)
Team concludes BDD doesn't work
The failure isn't BDD. The failure is adopting the syntax without the practice.
If business stakeholders aren't reading the Gherkin, you have no reason to use Gherkin. Just write tests in your normal language.
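As a rough sketch of what "tests in your normal language" can look like, here is the workspace-switching scenario as a plain Python test, with the Given/When/Then structure kept as comments. `FakeSession` and its URL scheme are assumptions invented for this example. A real test would exercise your application instead.

```python
class FakeSession:
    """Hypothetical in-memory stand-in for the application under test."""

    def __init__(self, workspaces, current):
        self.workspaces = set(workspaces)
        self.current = current

    def switch_to(self, name):
        if name not in self.workspaces:
            raise PermissionError(f"no access to workspace {name!r}")
        self.current = name

    @property
    def url(self):
        # Assumed URL scheme, for illustration only.
        return f"/workspaces/{self.current.lower()}"


def test_user_switches_workspaces():
    # Given a signed-in user with access to Marketing and Engineering,
    # currently in Marketing
    session = FakeSession(["Marketing", "Engineering"], current="Marketing")

    # When they switch to Engineering
    session.switch_to("Engineering")

    # Then they are in Engineering and the URL reflects it
    assert session.current == "Engineering"
    assert session.url == "/workspaces/engineering"


test_user_switches_workspaces()
```

The behavior is still stated in domain terms, but there is only one language to maintain and no glue layer between the scenario and the code.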
When BDD Pays Off
BDD has real value when:
Business rules are complex and contested. Examples surface disagreements that abstract specs don't.
Stakeholders genuinely participate. A PM or business analyst reads, edits, and owns the feature files.
Regulatory or contractual specifications exist. Examples map directly to requirements.
Cross-team coordination is high. Shared examples create shared understanding.
It doesn't pay off when:
The team works on internal tools without external stakeholders
The same developers are writing specs and tests
The product is technical with no business-rule complexity
Adoption is for adoption's sake
The Real Practice
Whether or not you use Cucumber-style tools, the underlying practice is valuable:
Talk through examples before coding. Concrete examples reveal ambiguity that abstract discussion doesn't.
Write the examples down. Capture them in code, docs, or both.
Verify the examples. Tests assert the example outcomes.
Maintain the examples. They stay current with the product.
The first step is the biggest. Many teams skip it and dive straight to coding, then test what they coded. BDD inverts that order: discuss the behavior first, then build it.
Specification by Example
A closely related practice: Specification by Example (Gojko Adzic's framing). Same core idea — use examples to drive specification.
Differences:
BDD is often associated with specific tools (Cucumber, SpecFlow)
Specification by Example is tool-agnostic
In practice, the two are similar enough that the terms get used interchangeably. The core discipline is the same.
When to Use Gherkin (vs. Plain Tests)
Use Gherkin when:
Multiple non-developer stakeholders read and edit specifications
You need an executable specification that audits well
The product has substantial business rule logic
Cross-team or vendor coordination depends on shared specs
Stick with plain tests when:
Tests are written and read by developers
The product is mostly technical
The overhead of Gherkin slows the team without adding clarity
Honest assessment: most teams don't need Gherkin. They benefit from the underlying BDD practice, not the syntax.
Step Definitions
In Cucumber-style BDD, Gherkin steps map to code (step definitions). This is where teams hit maintenance pain.
from behave import given

@given("the user is signed in")
def step_user_signed_in(context):
    # create_user and sign_in stand for the project's own test helpers
    context.user = create_user()
    context.session = sign_in(context.user)
Issues:
Step definitions and Gherkin can drift
Generic steps lead to overly-flexible glue code
Specific steps lead to step-per-test proliferation
Maintenance is a known issue for teams using BDD tools at scale
If you adopt Gherkin, budget for the step-definition maintenance. It's not free.
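To make the drift problem concrete, here is a toy step registry in plain Python. It is a deliberately simplified illustration of the idea that step text is matched against patterns and dispatched to functions. It is not how Cucumber or behave are implemented, and `step`, `run_step`, and the three step functions are hypothetical names.

```python
import re

STEPS = []  # list of (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the handler for step text matching `pattern`."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


def run_step(context, text):
    """Find the first registered step that matches `text` and call it."""
    for pattern, func in STEPS:
        match = pattern.fullmatch(text)
        if match:
            return func(context, *match.groups())
    raise LookupError(f"no step definition matches: {text!r}")


@step(r'they are currently in "(.+)"')
def set_current(context, name):
    context["current"] = name


@step(r'they switch to "(.+)"')
def switch(context, name):
    context["current"] = name


@step(r'they should be in "(.+)"')
def check_current(context, name):
    assert context["current"] == name


context = {}
for line in ['they are currently in "Marketing"',
             'they switch to "Engineering"',
             'they should be in "Engineering"']:
    run_step(context, line)

# Drift: rewording the scenario without updating the pattern breaks the match.
try:
    run_step(context, 'they move to "Engineering"')
except LookupError as err:
    print(err)
```

Even in this toy version, the scenario text and the patterns have to be kept in sync by hand. That coupling is the maintenance cost to budget for.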
Examples vs. Edge Cases
A common confusion: examples are not exhaustive cases.
Examples are a few concrete scenarios that anchor understanding. They're not the same as comprehensive test coverage.
A feature might have:
3-5 example scenarios (the BDD artifact)
30-50 detailed tests covering edge cases (the test suite)
The examples drive design and create shared understanding. The detailed tests catch regressions. Both have a place.
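The split might look something like this in a plain test file. This is a sketch under stated assumptions: `can_switch` and its matching rules (case-insensitive, whitespace ignored) are hypothetical, chosen only to give the edge-case table something to exercise.

```python
def can_switch(accessible, target):
    """Hypothetical rule: a user may switch only to a workspace they can access.

    Names are compared case-insensitively, ignoring surrounding whitespace.
    """
    wanted = target.strip().lower()
    return bool(wanted) and wanted in {name.strip().lower() for name in accessible}


# Example scenarios: the few cases agreed with stakeholders.
def test_switch_to_accessible_workspace():
    assert can_switch(["Marketing", "Engineering"], "Engineering")


def test_switch_to_inaccessible_workspace_is_refused():
    assert not can_switch(["Marketing"], "Engineering")


# Detailed tests: a table of edge cases maintained by the engineers.
EDGE_CASES = [
    (["Marketing"], "marketing", True),      # case differs
    (["Marketing"], "  Marketing ", True),   # stray whitespace
    (["Marketing"], "", False),              # empty target
    ([], "Marketing", False),                # no workspaces at all
    (["Marketing"], "Market", False),        # a prefix is not a match
]


def test_edge_cases():
    for accessible, target, expected in EDGE_CASES:
        assert can_switch(accessible, target) is expected, (accessible, target)


test_switch_to_accessible_workspace()
test_switch_to_inaccessible_workspace_is_refused()
test_edge_cases()
```

The two named tests are the ones worth reviewing with a PM. The table can grow as bugs are found without anyone needing to re-read it in a three-amigos session.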
BDD and Test-Driven Development
BDD and TDD are different things that are often conflated.
TDD: write a test, see it fail, write code, see it pass. A development cycle.
BDD: collaborate on examples that drive understanding. A specification practice.
They overlap (BDD examples often become tests; TDD tests can drive design discussions) but the focus differs. TDD is for engineers; BDD is across roles.
A Pragmatic Approach
For teams that want the BDD value without the tool overhead:
Run three-amigos sessions for important features. PM, developer, tester collaborate on examples.
Capture examples in a shared doc (or PR comments) initially.
Convert valuable examples into normal tests in the team's preferred framework.
Skip Gherkin unless stakeholders are reading and writing it.
This captures the practice value without the tooling overhead.
When to Try Gherkin Anyway
Some contexts where Gherkin pays off:
Heavily regulated products where specs need to be in domain language
Vendor-customer relationships where shared specifications matter
Domains with complex business rules and active business stakeholder involvement
In those contexts, the upfront tooling investment can pay back. Outside them, it usually doesn't.
Anti-Patterns
Gherkin without collaboration. Developers write specs alone; no stakeholders read them. Just write tests.
Imperative Gherkin. Steps describe button clicks instead of behavior. Defeats the purpose.
Step definition explosion. Each test has its own steps. Maintenance burden grows linearly.
Specs as ceremony. Specs written after the code, to comply with a process. No value, all cost.
Feature flood. 500 feature files; no one knows which are current.
Key Takeaway
BDD is a practice of using examples to drive shared understanding, not a tool or syntax. The tools (Cucumber, Gherkin) are valuable when non-developers actually read and write specifications; they're overhead otherwise. The underlying practice — three-amigos sessions, concrete examples before coding — is broadly useful. For most teams, capture examples in normal tests and skip Gherkin. Adopt Gherkin only when the stakeholder collaboration justifies the tool investment.
Related reading
Keep learning. This article is part of the Software Testing Foundations path in the ShiftQuality Learning Center. Learn to design tests that catch real bugs.


