
Platform Engineering: Stop Making Every Team Reinvent the Wheel

February 19, 2026

Most growing engineering organizations reach a point where teams spend more time fighting infrastructure than building features. Platform engineering addresses this by building an internal developer platform: a paved road that makes the right way the easy way.

The Problem Platform Engineering Solves

Without a platform team, every product team independently figures out:

Team A: "How do we deploy to Kubernetes?"
Team B: "How do we deploy to Kubernetes?"
Team C: "How do we deploy to Kubernetes?"
Team D: "How do we deploy to Kubernetes?"

Result: 4 different deployment pipelines, 4 different patterns,
        4 times the maintenance burden

Each team builds their own:

  • CI/CD pipelines
  • Monitoring setup
  • Secret management
  • Database provisioning
  • Service mesh configuration
  • Logging infrastructure

This is cognitive load — the mental overhead of managing infrastructure that has nothing to do with your product.

What Is an Internal Developer Platform?

An internal developer platform (IDP) is a self-service layer on top of your infrastructure. Developers interact with the platform, not directly with Kubernetes, Terraform, or AWS.

┌─────────────────────────────────────────┐
│           Developer Experience          │
│  (CLI, Portal, API, Templates)          │
├─────────────────────────────────────────┤
│          Platform Services              │
│  (Deploy, Monitor, Scale, Secure)       │
├─────────────────────────────────────────┤
│          Infrastructure                 │
│  (Kubernetes, AWS, Terraform)           │
└─────────────────────────────────────────┘

Developers see the top layer. The platform team manages everything below.

Before vs After

Task                   Without Platform                                          With Platform
Deploy a service       Write Dockerfile, Helm chart, CI pipeline, ingress rules  platform deploy
Create a database      File a ticket, wait 3 days, get credentials               platform db create --type postgres
Add monitoring         Learn Prometheus, write dashboards, configure alerts      Automatic (included with every service)
Rotate secrets         Manual process, SSH into servers                          platform secrets rotate
Spin up a new service  2-3 days of boilerplate                                   platform create service my-api

The Golden Path

A golden path is an opinionated, supported way to do something. It's not the only way — it's the recommended way.

# golden-path/service-template/platform.yaml
kind: Service
metadata:
  name: payment-api
  team: payments
  tier: critical
spec:
  language: typescript
  framework: fastify
  runtime:
    replicas: 3
    cpu: "500m"
    memory: "512Mi"
  database:
    type: postgres
    size: small
  monitoring:
    alerts: true
    dashboard: true
    slo: 99.9
  deployment:
    strategy: rolling
    canary: true
    rollback: automatic

From this single file, the platform provisions:

  • A Kubernetes deployment with 3 replicas
  • A PostgreSQL database with backups
  • Prometheus metrics and Grafana dashboards
  • Alert rules based on SLOs
  • A CI/CD pipeline with canary deploys
  • TLS certificates and ingress
  • Structured logging shipped to your log aggregator

The developer writes one file. The platform handles the rest.
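One way to picture that expansion is as a pure function from the parsed spec to a list of things to provision. The sketch below is illustrative only: `expand_service` and the resource labels are hypothetical names, not part of any real platform, and the spec is passed in as an already-parsed dictionary rather than read from platform.yaml.

```python
from typing import Any


def expand_service(spec: dict[str, Any]) -> list[str]:
    """Return the resources a hypothetical platform would provision for a spec."""
    name = spec["metadata"]["name"]
    body = spec["spec"]

    # Every service gets these, whatever else it asks for.
    resources = [
        f"deployment/{name} (replicas={body['runtime']['replicas']})",
        f"ingress/{name} (tls=true)",
        f"pipeline/{name}",
        f"logging/{name}",
    ]

    # Optional sections only add resources when the spec requests them.
    database = body.get("database")
    if database:
        resources.append(f"database/{name}-db (type={database['type']}, backups=true)")

    monitoring = body.get("monitoring", {})
    if monitoring.get("dashboard"):
        resources.append(f"dashboard/{name}")
    if monitoring.get("alerts") and "slo" in monitoring:
        resources.append(f"alerts/{name} (slo={monitoring['slo']})")
    return resources


payment_api = {
    "metadata": {"name": "payment-api", "team": "payments", "tier": "critical"},
    "spec": {
        "runtime": {"replicas": 3, "cpu": "500m", "memory": "512Mi"},
        "database": {"type": "postgres", "size": "small"},
        "monitoring": {"alerts": True, "dashboard": True, "slo": 99.9},
    },
}

for resource in expand_service(payment_api):
    print(resource)
```

Keeping this step free of side effects means the platform can show a developer what a change to the file would create before anything is actually provisioned.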

Building Blocks

1. Service Catalog (Backstage)

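Backstage (created by Spotify, now a CNCF project) is a widely used open-source framework for internal developer portals. Each service describes itself in a catalog-info.yaml file: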

# catalog-info.yaml (registered in Backstage)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-api
  description: Handles payment processing
  annotations:
    github.com/project-slug: org/payment-api
    pagerduty.com/service-id: P12345
    grafana/dashboard-selector: payment-api
  tags:
    - typescript
    - fastify
    - payments
spec:
  type: service
  lifecycle: production
  owner: team-payments
  dependsOn:
    - resource:postgres-payments
    - component:user-api

Backstage gives you:

  • A searchable catalog of all services, APIs, and infrastructure
  • Ownership tracking (who owns what?)
  • Tech docs alongside the service
  • Software templates for creating new services
  • Plugin ecosystem (PagerDuty, GitHub, Kubernetes, etc.)
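To make the ownership and dependency tracking concrete, here is a small sketch of the kind of question a catalog answers: "if user-api changes, which services are affected and who owns them?" It works on plain dictionaries shaped like the catalog-info.yaml above and does not use the Backstage API. `dependents_of`, `checkout-web`, `team-identity`, and `team-storefront` are made up for the example.

```python
def dependents_of(entities: list[dict], target_ref: str) -> list[tuple[str, str]]:
    """Return (component name, owner) for every entity that depends on target_ref."""
    hits = []
    for entity in entities:
        if target_ref in entity["spec"].get("dependsOn", []):
            hits.append((entity["metadata"]["name"], entity["spec"]["owner"]))
    return sorted(hits)


# Hypothetical catalog contents, reduced to the fields the query needs.
catalog = [
    {"metadata": {"name": "payment-api"},
     "spec": {"owner": "team-payments",
              "dependsOn": ["resource:postgres-payments", "component:user-api"]}},
    {"metadata": {"name": "user-api"},
     "spec": {"owner": "team-identity", "dependsOn": []}},
    {"metadata": {"name": "checkout-web"},
     "spec": {"owner": "team-storefront", "dependsOn": ["component:user-api"]}},
]

print(dependents_of(catalog, "component:user-api"))
# [('checkout-web', 'team-storefront'), ('payment-api', 'team-payments')]
```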

2. Software Templates

Instead of copying boilerplate from an old project, use templates:

# backstage-template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: typescript-service
  title: TypeScript Microservice
  description: Create a new TypeScript service with all platform integrations
spec:
  parameters:
    - title: Service Details
      properties:
        name:
          type: string
          description: Service name
        team:
          type: string
          description: Owning team
        database:
          type: string
          enum: [none, postgres, redis]
          default: none

  steps:
    - id: scaffold
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          team: ${{ parameters.team }}
          database: ${{ parameters.database }}

    - id: create-repo
      action: publish:github
      input:
        repoUrl: github.com?owner=org&repo=${{ parameters.name }}

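    - id: register
      action: catalog:register
      input:
        # Bracket notation is needed because the step id contains a hyphen
        repoContentsUrl: ${{ steps['create-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml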

A developer clicks "Create Service" in Backstage, fills in 3 fields, and gets:

  • A GitHub repo with boilerplate code
  • CI/CD pipeline configured
  • Kubernetes manifests generated
  • Monitoring pre-configured
  • Service registered in the catalog

Time to first deploy: 10 minutes instead of 2 days.
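Under the hood, the scaffold step is placeholder substitution: files in the skeleton directory contain `${{ values.name }}` markers that get filled in from the form. Backstage does this with a full template engine. The sketch below is a deliberately simplified stand-in that handles only plain `values.<key>` lookups, and `render` is a name chosen for this example.

```python
import re

# Matches ${{ values.key }} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\$\{\{\s*values\.(\w+)\s*\}\}")


def render(text: str, values: dict[str, str]) -> str:
    """Replace ${{ values.key }} placeholders; fail loudly on unknown keys."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"template references unknown value: {key}")
        return values[key]

    return PLACEHOLDER.sub(substitute, text)


skeleton = "name: ${{ values.name }}\nowner: ${{values.team}}\n"
print(render(skeleton, {"name": "my-api", "team": "payments"}))
```

Failing on an unknown key, rather than leaving the marker in place, catches a mismatch between the form fields and the skeleton before a half-rendered repo is published.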

3. Infrastructure as Code

The platform team abstracts infrastructure behind simple interfaces:

# What the platform team manages (Terraform)
module "service" {
  source = "./modules/platform-service"

  name        = var.service_name
  team        = var.team
  tier        = var.tier
  replicas    = var.replicas
  database    = var.database_config
  monitoring  = var.monitoring_config
}

# This module internally handles:
# - EKS namespace
# - IAM roles
# - RDS instance
# - Security groups
# - DNS records
# - Certificate
# - Prometheus ServiceMonitor
# - Grafana dashboard
# - PagerDuty integration

Developers never see Terraform. They interact with the platform abstraction.
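How a service spec turns into inputs for that module is left open here. One possibility, offered as an assumption rather than a description of any particular platform, is to generate a terraform.tfvars.json file from the spec. The variable names below mirror the `var.*` references in the module call above; `to_tfvars` is a hypothetical helper.

```python
import json


def to_tfvars(spec: dict) -> str:
    """Map a service spec onto the variable names the module call reads."""
    body = spec["spec"]
    variables = {
        "service_name": spec["metadata"]["name"],
        "team": spec["metadata"]["team"],
        "tier": spec["metadata"]["tier"],
        "replicas": body["runtime"]["replicas"],
        # A service without a database section yields null here.
        "database_config": body.get("database"),
        "monitoring_config": body.get("monitoring", {}),
    }
    return json.dumps(variables, indent=2, sort_keys=True)


spec = {
    "metadata": {"name": "payment-api", "team": "payments", "tier": "critical"},
    "spec": {
        "runtime": {"replicas": 3},
        "database": {"type": "postgres", "size": "small"},
        "monitoring": {"alerts": True, "dashboard": True, "slo": 99.9},
    },
}

print(to_tfvars(spec))
```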

4. Developer CLI

A CLI that wraps platform operations:

# Create a new service
platform create service my-api --lang typescript --db postgres

# Deploy
platform deploy --env staging
platform deploy --env production --canary 10%

# Promote canary to full rollout
platform deploy promote

# Rollback
platform deploy rollback

# Manage databases
platform db create --type postgres --size medium
platform db connect my-api-db
platform db backup my-api-db

# Manage secrets
platform secrets set API_KEY=sk-abc123
platform secrets list
platform secrets rotate --all

# View logs
platform logs my-api --env production --since 1h

# Check service health
platform status my-api

Every command is a wrapper around Kubernetes, AWS, Terraform, etc. The developer never needs to learn those tools directly.
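A CLI like this is mostly argument parsing plus calls into the underlying tools. The sketch below covers only the parsing half for two of the commands shown, using Python's standard argparse module. The default values and the `parse_canary` helper are assumptions made for the example, and the calls that would actually talk to Kubernetes or AWS are left out.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="platform")
    commands = parser.add_subparsers(dest="command", required=True)

    deploy = commands.add_parser("deploy", help="deploy the current service")
    deploy.add_argument("--env", default="staging", choices=["staging", "production"])
    deploy.add_argument("--canary", default=None, help="traffic share, e.g. 10%%")

    logs = commands.add_parser("logs", help="show service logs")
    logs.add_argument("service")
    logs.add_argument("--env", default="staging")
    logs.add_argument("--since", default="15m")
    return parser


def parse_canary(value: Optional[str]) -> Optional[int]:
    """Turn '10%' into 10; None means a full rollout."""
    if value is None:
        return None
    percent = int(value.rstrip("%"))
    if not 1 <= percent <= 100:
        raise ValueError(f"canary share must be 1-100, got {percent}")
    return percent


args = build_parser().parse_args(["deploy", "--env", "production", "--canary", "10%"])
print(args.env, parse_canary(args.canary))  # production 10
```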

5. CI/CD Pipeline

A standardized pipeline that every service uses:

# .github/workflows/platform.yml
# Generated automatically by the platform template
name: Platform CI/CD

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    uses: org/platform-workflows/.github/workflows/build.yml@v2
    with:
      language: typescript
      node-version: 22

  test:
    needs: build
    uses: org/platform-workflows/.github/workflows/test.yml@v2
    with:
      language: typescript

  security:
    needs: build
    uses: org/platform-workflows/.github/workflows/security.yml@v2

  deploy-staging:
    needs: [test, security]
    if: github.ref == 'refs/heads/main'
    uses: org/platform-workflows/.github/workflows/deploy.yml@v2
    with:
      environment: staging

  deploy-production:
    needs: deploy-staging
    uses: org/platform-workflows/.github/workflows/deploy.yml@v2
    with:
      environment: production
      strategy: canary

Teams don't write CI/CD pipelines; they inherit them from the platform. Because each repo calls the shared workflows by a major-version tag (@v2 above), an improvement the platform team ships under that tag, such as added security scanning or faster builds, reaches every team on its next run.

Measuring Platform Success

DORA Metrics

Metric                   Before Platform   After Platform
Deployment frequency     Weekly            Multiple times daily
Lead time for changes    2 weeks           < 1 day
Change failure rate      15%               < 5%
Mean time to recovery    4 hours           < 30 minutes
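These four numbers can be computed from deployment records you probably already have. The sketch below assumes a hypothetical `Deployment` record with commit, deploy, and restore timestamps. Real data would come from your CI/CD and incident tooling, and the sample values are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deployments: list[Deployment], days: int) -> dict[str, float]:
    """Summarize the four DORA metrics over a window of `days` days."""
    if not deployments or days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_hours = [(d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments]
    failures = [d for d in deployments if d.failed]
    recovery_minutes = [
        (d.restored_at - d.deployed_at).total_seconds() / 60
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deployments) / days,
        "median_lead_time_hours": median(lead_hours),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_recovery_minutes": (
            sum(recovery_minutes) / len(recovery_minutes) if recovery_minutes else 0.0
        ),
    }


start = datetime(2026, 2, 2, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=4)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=6)),
    Deployment(start + timedelta(days=2), start + timedelta(days=2, hours=2),
               failed=True, restored_at=start + timedelta(days=2, hours=2, minutes=20)),
    Deployment(start + timedelta(days=3), start + timedelta(days=3, hours=8)),
]
print(dora_summary(sample, days=4))
```

The median is used for lead time so that one slow change does not dominate the figure; a mean would be an equally defensible choice.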

Developer Satisfaction

Survey your developers regularly:

"How easy is it to deploy a new service?"
"How much time do you spend on infrastructure tasks?"
"Do you feel productive?"
"What's your biggest pain point?"

If developers are still fighting infrastructure, the platform isn't doing its job.

Adoption Rate

Track how many teams use platform features:

Service template usage:    85% of new services
Standard CI/CD pipeline:   92% of repos
Platform CLI daily users:  73% of developers
Self-service database:     68% of new databases

Low adoption means the platform is too complex or doesn't solve real problems. The best platform is one developers choose to use, not one they're forced to use.
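Adoption figures like these can often be derived from what is already in your repos. As a rough sketch, the code below counts a repo as using the standard pipeline if its workflow file calls a reusable workflow under the shared `org/platform-workflows/` prefix from the CI/CD example. The function names and the sample repos are hypothetical, and fetching the workflow files is out of scope.

```python
def uses_platform_pipeline(workflow_text: str, prefix: str = "org/platform-workflows/") -> bool:
    """True if any job-level `uses:` line points at the shared workflows."""
    for line in workflow_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("uses:"):
            target = stripped.split(":", 1)[1].strip()
            if target.startswith(prefix):
                return True
    return False


def adoption_rate(workflows_by_repo: dict[str, str]) -> float:
    """Percentage of repos whose workflow calls the platform pipeline."""
    if not workflows_by_repo:
        return 0.0
    adopted = sum(uses_platform_pipeline(text) for text in workflows_by_repo.values())
    return round(100 * adopted / len(workflows_by_repo), 1)


repos = {
    "payment-api": "jobs:\n  build:\n    uses: org/platform-workflows/.github/workflows/build.yml@v2\n",
    "user-api": "jobs:\n  build:\n    uses: org/platform-workflows/.github/workflows/build.yml@v2\n",
    "legacy-batch": "jobs:\n  build:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n",
}
print(adoption_rate(repos))  # 66.7
```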

Common Mistakes

1. Building Too Much, Too Early

Bad:  Build a complete platform for 18 months, launch it all at once
Good: Start with the biggest pain point, ship in 2 weeks, iterate

Start with what hurts most. Usually it's one of:

  • Deploying a service takes too long
  • Creating a new service is painful
  • Monitoring is inconsistent

Fix that first. Then expand.

2. Not Treating It as a Product

Your platform is an internal product. Your developers are your users. This means:

  • Talk to your users (developers) regularly
  • Prioritize based on their pain points
  • Write documentation
  • Provide support
  • Measure satisfaction

3. Forcing Adoption

Bad:  "All teams must migrate to the platform by Q3"
Good: "The platform makes deployment 10x faster — teams are
       migrating because they want to"

If you have to force adoption, your platform doesn't solve real problems.

4. Ignoring the Developer Experience

Bad:  platform deploy --cluster prod-us-east-1 --namespace payments \
      --image registry.internal/payment-api:sha-abc123 \
      --replicas 3 --strategy rolling --max-surge 1

Good: platform deploy

Sane defaults. Minimal configuration. Progressive disclosure — simple by default, powerful when needed.
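Progressive disclosure usually comes down to layered configuration: built-in defaults, then whatever the service declares, then explicit flags for the rare case that needs them. A minimal sketch, assuming hypothetical option names and default values:

```python
from typing import Any, Optional

# Hypothetical platform-wide defaults; a bare `platform deploy` would use these.
DEFAULTS: dict[str, Any] = {
    "env": "staging",
    "replicas": 3,
    "strategy": "rolling",
    "max_surge": 1,
}


def resolve_deploy_options(
    service_config: Optional[dict[str, Any]] = None,
    cli_flags: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Layer settings: defaults, then service config, then explicit CLI flags."""
    resolved = dict(DEFAULTS)
    for layer in (service_config or {}, cli_flags or {}):
        unknown = set(layer) - set(DEFAULTS)
        if unknown:
            raise ValueError(f"unknown options: {sorted(unknown)}")
        # A value of None means "not specified", so it never overrides a lower layer.
        resolved.update({key: value for key, value in layer.items() if value is not None})
    return resolved


print(resolve_deploy_options())
print(resolve_deploy_options({"replicas": 5}, {"env": "production", "replicas": None}))
```

With no arguments the result is just the defaults, which is what makes the short form of the command possible. Every option is still reachable when a team needs it.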

Platform Team Structure

A typical platform team:

Platform Team (4-8 engineers)
├── Infrastructure (Kubernetes, cloud, networking)
├── Developer Experience (CLI, portal, templates)
├── CI/CD (pipelines, build systems)
└── Observability (monitoring, logging, alerting)

What the Platform Team Does NOT Do

  • Build product features
  • Own application code
  • Make product decisions
  • Deploy services for other teams (self-service!)

The platform team builds the tools. Product teams use the tools.

Getting Started

Week 1-2: Understand the Pain

# Interview 5-10 developers
# Ask:
# - What's your biggest infrastructure pain point?
# - How long does it take to deploy?
# - What do you wish was easier?
# - What do you spend time on that feels wasteful?

Week 3-4: Build the First Thing

Pick the highest-impact, lowest-effort improvement. Common starting points:

Option A: Standardized CI/CD pipeline (reusable workflows)
Option B: Service creation template (Backstage or Cookiecutter)
Option C: Developer CLI for common operations

Month 2-3: Expand

- Add monitoring to the golden path
- Create a service catalog
- Automate database provisioning
- Add security scanning to CI

Month 4+: Iterate

- Measure adoption and satisfaction
- Fix what's not working
- Add what's most requested
- Remove what nobody uses

Quick Reference

Component          Tool Options
Developer Portal   Backstage, Port, Cortex
CI/CD              GitHub Actions, GitLab CI, Dagger
Infrastructure     Terraform, Pulumi, Crossplane
Kubernetes         Argo CD, Flux, Helm
Monitoring         Prometheus + Grafana, Datadog
Logging            Loki, ELK, Datadog
Secrets            Vault, AWS Secrets Manager, SOPS
Service Mesh       Istio, Linkerd, Cilium

Summary

Platform engineering is about removing friction:

  1. Golden paths — Opinionated, supported ways to build and deploy
  2. Self-service — Developers provision what they need without tickets
  3. Abstractions — Hide infrastructure complexity behind simple interfaces
  4. Consistency — Every service gets monitoring, security, and CI/CD by default
  5. Measurement — Track DORA metrics, developer satisfaction, and adoption

The goal isn't to build the most sophisticated platform. It's to build the platform that makes your developers most productive. Start small, solve real pain, and iterate.
