Internal Developer Platforms: What to Build First

ShiftQuality Contributor
Feb 19

The previous posts in this path covered platform engineering as a discipline and developer experience measurement. This post covers the practical question that every platform team faces on day one: what do we actually build?

The temptation is to start with the shiny thing: a developer portal with a service catalog, self-service provisioning, environment management, and a dashboard that shows everything. A project of that scope can take a year to build, tries to serve everyone at once, and tends to be abandoned because developers keep using the tools they already know.

The teams that build successful internal developer platforms start differently. They find the single biggest friction point in the development workflow and remove it. Then they find the next one. The platform emerges from a series of solved problems, not from a grand design.

Finding the First Problem

The first problem is the one that wastes the most developer time across the most people. It is rarely the problem that platform engineers find most interesting.

Talk to developers. Not in a survey with checkboxes, but in conversations where you watch them work. What do they complain about? What takes them 30 minutes that should take 30 seconds? Where do they wait? Where do they copy-paste? Where do they ask in Slack for help with something that should be self-service?

Common first problems: environment provisioning takes days instead of minutes. CI builds take 40 minutes. Deploying to staging requires filing a ticket. Setting up a new service means copying an existing one and manually editing 15 files. Database migrations require a DBA to run them. Each of these is a specific, solvable problem that affects real developers every day.

Resist the urge to solve all of them simultaneously. Pick one. Solve it well. Ship it. Measure the impact. Use that impact to justify the investment in solving the next one.
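One way to make "the most developer time across the most people" concrete is a rough weekly estimate per friction point. The sketch below is illustrative only: the `FrictionPoint` fields and every number in `candidates` are assumptions, to be replaced with what you actually observe in those conversations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrictionPoint:
    """One observed friction point; all fields are rough estimates."""
    name: str
    minutes_lost_per_occurrence: float
    occurrences_per_dev_per_week: float
    developers_affected: int

    def weekly_hours_lost(self) -> float:
        """Total developer hours lost per week across everyone affected."""
        minutes = (
            self.minutes_lost_per_occurrence
            * self.occurrences_per_dev_per_week
            * self.developers_affected
        )
        return minutes / 60


def rank_by_time_lost(points: list[FrictionPoint]) -> list[FrictionPoint]:
    """Order friction points from most to least weekly time lost."""
    return sorted(points, key=lambda p: p.weekly_hours_lost(), reverse=True)


# Made-up numbers for illustration, not measurements.
candidates = [
    FrictionPoint("staging deploy ticket", 120, 1, 30),
    FrictionPoint("new service setup", 480, 0.05, 50),
    FrictionPoint("slow CI builds", 40, 10, 50),
]

for point in rank_by_time_lost(candidates):
    print(f"{point.name}: {point.weekly_hours_lost():.1f} h/week")
```

The ranking is only as good as the estimates behind it, but it keeps the choice anchored to time lost rather than to which problem is most interesting to solve.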

Golden Paths, Not Golden Cages

A golden path is the recommended, well-supported way to accomplish a common task — creating a new service, setting up a CI pipeline, deploying to production. The golden path is paved, lit, and maintained. Developers who follow it get fast results with minimal friction.

The golden path is not a mandate. Developers who need to deviate — because their use case is genuinely different — are free to do so. They just do not get the paved road. They are responsible for their own tooling and support.

This distinction is critical for adoption. Developers who are forced to use a platform resent it and find workarounds. Developers who choose the platform because it is genuinely easier adopt it enthusiastically and become advocates. The platform succeeds by being better than the alternative, not by being the only option.

The implementation: start with a service template, a single command that creates a new service with the standard project structure, CI pipeline, deployment configuration, monitoring setup, and documentation skeleton. This template embodies the golden path: it encodes the team's best practices into a starting point that is easy to use and easy to maintain.
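A minimal sketch of such a template, using only the Python standard library, might look like this. The file layout in `SERVICE_TEMPLATE`, the file contents, and the `create_service` function are all hypothetical; a real template would carry your organization's actual pipeline and deployment files.

```python
from pathlib import Path
from string import Template

# Hypothetical golden-path layout: relative path -> file body template.
SERVICE_TEMPLATE = {
    "README.md": "# $service\n\nOwner: $team\n",
    "src/main.py": 'print("hello from $service")\n',
    "ci/pipeline.yml": "service: $service\nstages: [test, build, deploy]\n",
    "deploy/config.yml": "service: $service\nteam: $team\nreplicas: 2\n",
}


def create_service(name: str, team: str, root: Path) -> list[Path]:
    """Render every template file under root/name and return the paths written."""
    target = root / name
    if target.exists():
        # Refuse to overwrite an existing service directory.
        raise FileExistsError(f"{target} already exists")
    written = []
    for relative_path, body in SERVICE_TEMPLATE.items():
        path = target / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(Template(body).substitute(service=name, team=team))
        written.append(path)
    return written
```

Calling `create_service("payments-api", "payments", Path("."))` would write the four files under `./payments-api`. Keeping the template as data rather than logic is a deliberate choice here: updating the golden path means editing file bodies, not code.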

Self-Service as a Design Principle

The value of an internal developer platform is self-service — developers can accomplish tasks without filing tickets, waiting for other teams, or asking for help. Every interaction that requires a human intermediary is a friction point that the platform should eventually eliminate.

Self-service for environment provisioning: a developer requests a staging environment through a CLI or UI, and it is provisioned within minutes, configured with the right dependencies and connected to the right services.

Self-service for secrets management: a developer adds a new secret through a self-service interface, the secret is stored securely, and the application can access it without any manual configuration by an operations team.

Self-service for database creation: a developer requests a new database instance for their service, it is provisioned with the standard configuration, backup policy, and monitoring, and the connection string is automatically injected into the service's configuration.

Each self-service capability replaces a ticket, a wait, and a human intermediary. The cumulative effect is that developers spend time building features instead of waiting for infrastructure.

The safety requirement: self-service does not mean unrestricted. Guardrails such as cost limits, security policies, and compliance requirements are built into the self-service workflows. A developer can provision a staging database but cannot provision a production database without approval. The checks themselves are automatic, not manual: the platform enforces them on every request, and a human is involved only where a policy explicitly calls for approval.
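As a rough illustration, a guardrail of this kind can be a small policy check that runs before any provisioning happens. The cost limits, environment names, and the `DatabaseRequest` shape below are assumptions made for the sketch, not values from any particular platform.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy values; real limits would come from your organization.
MAX_MONTHLY_COST_USD = {"staging": 200.0, "production": 2000.0}
ENVIRONMENTS_REQUIRING_APPROVAL = {"production"}


@dataclass(frozen=True)
class DatabaseRequest:
    service: str
    environment: str
    estimated_monthly_cost_usd: float
    approved_by: Optional[str] = None


def evaluate(request: DatabaseRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for a self-service database request."""
    limit = MAX_MONTHLY_COST_USD.get(request.environment)
    if limit is None:
        return False, f"unknown environment: {request.environment}"
    if request.estimated_monthly_cost_usd > limit:
        return False, f"estimated cost exceeds the {request.environment} limit of ${limit:.0f}"
    if request.environment in ENVIRONMENTS_REQUIRING_APPROVAL and not request.approved_by:
        return False, f"{request.environment} requests need a recorded approver"
    return True, "ok"


print(evaluate(DatabaseRequest("payments-api", "staging", 50.0)))
print(evaluate(DatabaseRequest("payments-api", "production", 500.0)))
print(evaluate(DatabaseRequest("payments-api", "production", 500.0, approved_by="dba-oncall")))
```

The developer gets an immediate answer with a reason, rather than a ticket that sits in a queue, and the approval step appears only where the policy calls for it.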

Build vs. Integrate

The internal developer platform does not need to be built from scratch. Many components already exist as tools, services, and open-source projects. The platform team's job is often integration — connecting existing tools into a coherent experience — rather than building new capabilities.

CI/CD: use an existing system (GitHub Actions, GitLab CI, Azure DevOps) and standardize the pipeline templates.

Infrastructure provisioning: use Terraform or Pulumi with pre-built modules.

Monitoring: use existing observability platforms (Datadog, Grafana, New Relic) with standardized dashboards and alert configurations.

Secret management: use the cloud provider's secret store with a thin abstraction layer.

The platform team builds the glue — the templates, configurations, abstractions, and self-service interfaces that connect these tools into a coherent developer experience. This approach delivers value faster than building from scratch and leverages the maturity and community support of established tools.

Build custom only when no existing tool meets the need. A custom deployment orchestrator is rarely justified. A custom abstraction that simplifies Kubernetes deployment for your specific use case often is — because the abstraction encodes your organization's specific patterns and eliminates the complexity that is irrelevant to your developers.
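To make the Kubernetes case concrete, here is one possible shape for such an abstraction: developers fill in a handful of fields, and the platform expands them into a full Deployment manifest. The `ServiceSpec` fields, their defaults, and the example image name are hypothetical; the manifest keys follow the standard `apps/v1` Deployment schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """The few fields a developer fills in; names and defaults are hypothetical."""
    name: str
    image: str
    port: int = 8080
    replicas: int = 2


def to_deployment(spec: ServiceSpec) -> dict:
    """Expand a ServiceSpec into a Kubernetes apps/v1 Deployment manifest."""
    labels = {"app": spec.name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec.name, "labels": labels},
        "spec": {
            "replicas": spec.replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": spec.name,
                            "image": spec.image,
                            "ports": [{"containerPort": spec.port}],
                        }
                    ]
                },
            },
        },
    }


spec = ServiceSpec(name="payments-api", image="registry.example.com/payments-api:1.4.2")
# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(to_deployment(spec), indent=2))
```

The developer sees four fields; the selector wiring, label conventions, and anything else the organization standardizes live in one place the platform team maintains.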

Measuring Platform Success

A platform that the platform team thinks is great but developers do not use is a failed platform. Adoption is the primary success metric — what percentage of teams use the platform's capabilities? What percentage of new services are created through the platform's templates?

Beyond adoption: developer satisfaction (do developers find the platform helpful?), lead time (has the time from code commit to production decreased?), self-service rate (what percentage of infrastructure requests are handled through self-service vs. tickets?), and onboarding time (how quickly can a new developer become productive?).

Track these metrics over time. A platform that improves lead time by 40% in its first year has demonstrated clear value. A platform with only 30% adoption after a year likely has a product problem: either the platform does not solve the right problems, or it solves them in a way that developers find harder than the alternative.
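The arithmetic behind these figures is simple enough to sketch. The function names and all input numbers below are illustrative assumptions, chosen so that the results line up with the 30% and 40% figures used above.

```python
def percentage(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


def lead_time_improvement(before_hours: float, after_hours: float) -> float:
    """Percent reduction in commit-to-production lead time; positive means faster."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return 100.0 * (before_hours - after_hours) / before_hours


# Made-up inputs for illustration, not real measurements.
adoption = percentage(part=12, whole=40)          # teams using the platform
self_service = percentage(part=180, whole=240)    # requests handled without a ticket
lead_time = lead_time_improvement(before_hours=50, after_hours=30)

print(f"adoption: {adoption:.0f}%")
print(f"self-service rate: {self_service:.0f}%")
print(f"lead time improvement: {lead_time:.0f}%")
```

Recomputing the same numbers at a regular interval matters more than any single reading, since the trend is what shows whether the platform is earning adoption.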

The Takeaway

Internal developer platforms succeed when they start with real friction, deliver self-service capabilities that are easier than the alternatives, and grow incrementally based on developer feedback. They fail when they start with a grand vision, build for months without shipping, and mandate adoption instead of earning it.

Build the first thing that removes the most friction. Ship it. Measure the impact. Build the next thing. The platform emerges from solved problems, not from architecture diagrams.

Next in the "Platform Engineering" learning path: we'll cover platform as a product, applying product management principles to internal developer platforms to ensure they serve their users effectively.