Module 2.3: Internal Developer Platforms (IDPs)
Discipline Module | Complexity: [COMPLEX] | Time: 50-60 min
Prerequisites
Before starting this module:
- Required: Module 2.1: What is Platform Engineering? — Platform foundations
- Required: Module 2.2: Developer Experience (DevEx) — Understanding DevEx
- Recommended: Experience with Kubernetes, CI/CD, or cloud platforms
What You’ll Be Able to Do
After completing this module, you will be able to:
- Design an Internal Developer Platform architecture with clear abstraction layers
- Evaluate IDP tools like Backstage, Port, and Kratix against your organization’s requirements
- Implement a service catalog that gives developers self-service access to infrastructure capabilities
- Build platform APIs that encapsulate infrastructure complexity behind simple developer interfaces
Why This Module Matters
You understand Platform Engineering. You can measure developer experience. Now you need to build something.
But what exactly? The term “Internal Developer Platform” is thrown around freely, but what does one actually contain? Which components are required, and which are optional? Should you build or buy?
Without understanding IDP architecture:
- You’ll build a random collection of tools, not a platform
- You’ll reinvent what exists
- You’ll buy expensive tools that don’t integrate
- You’ll create more complexity, not less
This module teaches you the components, architecture, and decision frameworks for building effective IDPs.
What is an Internal Developer Platform?
Definition
An Internal Developer Platform (IDP) is a layer of tools, workflows, and self-service capabilities that sits between developers and underlying infrastructure, reducing cognitive load and enabling self-service.
```
┌─────────────────────────────────────────────────────────────┐
│                         Developers                          │
│                                                             │
│  "I need to deploy"  "Give me a database"  "Show me logs"   │
└───────────────────────────┬─────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                 Internal Developer Platform                 │
│                                                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐     │
│  │  Portal  │  │  Deploy  │  │  Infra   │  │ Observe  │     │
│  │ (UI/API) │  │ Platform │  │ Platform │  │ Platform │     │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘     │
│                                                             │
│  Golden Paths | Templates | Self-Service | Guardrails       │
└───────────────────────────┬─────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                    Infrastructure Layer                     │
│                                                             │
│  Kubernetes | Cloud | Databases | Networking | Security     │
└─────────────────────────────────────────────────────────────┘
```
IDP vs Point Solutions
Without IDP (Point Solutions):
```
Developer needs to:
├── Use GitHub for code
├── Configure CircleCI for builds
├── Write Terraform for infra
├── Manage Kubernetes YAML
├── Configure Datadog for monitoring
├── Use PagerDuty for alerts
├── Navigate Vault for secrets
└── Understand how they all connect
```
With IDP:
```
Developer needs to:
├── Use the portal
└── Everything else happens automatically
```
The IDP doesn’t replace these tools; it integrates and abstracts them into a cohesive experience.
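The "integrate and abstract" idea can be illustrated with a small facade. The Python below is only a sketch under stated assumptions: the `Platform` class, the `Tool` interface, and the step descriptions are hypothetical names invented for this example, and each adapter just returns a description of the call it would make instead of talking to a real tool.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Tool(Protocol):
    """Hypothetical adapter interface for one underlying tool."""

    name: str

    def run(self, service: str) -> str: ...


@dataclass
class RecordingTool:
    """Stand-in adapter: describes the action instead of calling a real API."""

    name: str
    action: str

    def run(self, service: str) -> str:
        return f"{self.name}: {self.action} {service}"


@dataclass
class Platform:
    """Single entry point that hides the individual tools from developers."""

    tools: list[Tool] = field(default_factory=list)

    def deploy(self, service: str) -> list[str]:
        # One developer-facing call fans out to every integrated tool, in order.
        return [tool.run(service) for tool in self.tools]


platform = Platform(
    tools=[
        RecordingTool("ci", "build and test"),
        RecordingTool("gitops", "sync manifests for"),
        RecordingTool("monitoring", "create dashboard for"),
    ]
)
steps = platform.deploy("order-service")
```

The developer calls `deploy` once; the existing tools stay in place behind the adapters, which is the sense in which the platform integrates them without replacing them.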
IDP Components
The Five Pillars
Every IDP has five core components:
```
┌─────────────────────────────────────────────────────────────┐
│                      IDP Architecture                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. DEVELOPER PORTAL                                        │
│     Service catalog, docs, templates, search                │
│                                                             │
│  2. INFRASTRUCTURE ORCHESTRATION                            │
│     Compute, storage, databases, networking                 │
│                                                             │
│  3. APPLICATION DELIVERY                                    │
│     CI/CD, GitOps, deployments, environments                │
│                                                             │
│  4. SECURITY & GOVERNANCE                                   │
│     Secrets, policies, compliance, access control           │
│                                                             │
│  5. OBSERVABILITY                                           │
│     Metrics, logs, traces, alerting, dashboards             │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
1. Developer Portal
The front door to your platform. Where developers discover, interact, and get help.
Core Capabilities:
```yaml
developer_portal:
  service_catalog:
    - List all services
    - Ownership information
    - Dependencies
    - Health status
    - Documentation links

  templates:
    - New service scaffolding
    - Approved technology stacks
    - Pre-configured best practices

  documentation:
    - Searchable docs
    - API references
    - Tutorials and guides

  self_service:
    - Create new projects
    - Request resources
    - Manage environments

  search:
    - Find services
    - Find documentation
    - Find owners
```
Example: Backstage Service Catalog
```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: order-service
  description: Handles order processing
  annotations:
    backstage.io/techdocs-ref: dir:.
    github.com/project-slug: org/order-service
spec:
  type: service
  lifecycle: production
  owner: team-orders
  providesApis:
    - orders-api
  dependsOn:
    - component:inventory-service
    - resource:orders-database
```
2. Infrastructure Orchestration
Provision infrastructure without tickets or waiting.
Core Capabilities:
```yaml
infrastructure_orchestration:
  compute:
    - Kubernetes namespaces
    - Serverless functions
    - VM provisioning

  databases:
    - PostgreSQL
    - Redis
    - MongoDB
    - Self-service provisioning

  storage:
    - Object storage
    - Persistent volumes
    - Backups

  networking:
    - Load balancers
    - DNS entries
    - Service mesh config
```
Example: Crossplane Claim (fulfilled by a Composition)
```yaml
apiVersion: database.example.org/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: orders-db
spec:
  parameters:
    storageGB: 50
    version: "15"
    environment: production
  # Crossplane provisions the actual RDS instance
  # Developer doesn't need AWS knowledge
```
3. Application Delivery
Deploy code from commit to production.
Core Capabilities:
```yaml
application_delivery:
  ci_pipeline:
    - Build automation
    - Test execution
    - Security scanning
    - Artifact creation

  cd_pipeline:
    - Environment management
    - Deployment strategies
    - Rollback capabilities

  gitops:
    - Declarative deployments
    - Drift detection
    - Environment promotion

  environments:
    - Development
    - Staging
    - Production
    - Preview environments
```
Example: Deployment Self-Service
```yaml
# Developer creates this
apiVersion: platform.example.com/v1
kind: Application
metadata:
  name: order-service
spec:
  image: order-service
  replicas: auto
  environments:
    - name: staging
      promote: auto
    - name: production
      promote: manual

# Platform creates:
# - Kubernetes Deployment
# - Service
# - Ingress
# - HPA
# - PDB
# - NetworkPolicy
# - ArgoCD Application
# - Monitoring dashboards
```
4. Security & Governance
Secure by default with guardrails, not gates.
Core Capabilities:
```yaml
security_governance:
  secrets_management:
    - Secret storage
    - Rotation
    - Injection

  access_control:
    - RBAC
    - SSO integration
    - Audit logging

  policy_enforcement:
    - Security policies
    - Compliance requirements
    - Cost guardrails

  scanning:
    - Vulnerability scanning
    - License compliance
    - Static analysis
```
Example: Policy as Code
```yaml
# OPA/Gatekeeper policy
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-owner-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["owner", "cost-center"]
```
5. Observability
Visibility into everything running on the platform.
Core Capabilities:
```yaml
observability:
  metrics:
    - Application metrics
    - Infrastructure metrics
    - Business metrics
    - Dashboards

  logging:
    - Centralized logs
    - Log search
    - Log correlation

  tracing:
    - Distributed tracing
    - Request flow
    - Latency analysis

  alerting:
    - Alert rules
    - Routing
    - Escalation
    - On-call management
```
Example: Auto-Generated Dashboard
```yaml
# Developer adds annotation
metadata:
  annotations:
    platform.example.com/dashboard: "standard"

# Platform generates Grafana dashboard with:
# - Request rate
# - Error rate
# - Latency (p50, p95, p99)
# - Resource usage
# - Dependencies health
```
Try This: Component Inventory
Map your organization’s current tooling to IDP components:
```markdown
## IDP Component Inventory

### Developer Portal
Current tools: _________________
Gaps: _________________
Satisfaction (1-5): ___

### Infrastructure Orchestration
Current tools: _________________
Gaps: _________________
Self-service level (1-5): ___

### Application Delivery
Current tools: _________________
Gaps: _________________
Satisfaction (1-5): ___

### Security & Governance
Current tools: _________________
Gaps: _________________
Confidence level (1-5): ___

### Observability
Current tools: _________________
Gaps: _________________
Satisfaction (1-5): ___

### Integration Score
How well do these tools work together? (1-5): ___
```
IDP Reference Architectures
Small Team IDP (< 20 developers)
```
┌─────────────────────────────────────────────────────────────┐
│                       Small Team IDP                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Portal:        README + Wiki + Slack                       │
│                                                             │
│  Infra:         Terraform modules + Kubernetes              │
│                                                             │
│  Delivery:      GitHub Actions + ArgoCD                     │
│                                                             │
│  Security:      GitHub Dependabot + Sealed Secrets          │
│                                                             │
│  Observability: Prometheus + Grafana + Loki                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Investment: 0.5-1 FTE
Timeline: 1-3 months
```
Medium Team IDP (20-100 developers)
```
┌─────────────────────────────────────────────────────────────┐
│                       Medium Team IDP                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Portal:        Backstage (basic)                           │
│                 - Service catalog                           │
│                 - Docs                                      │
│                 - Basic templates                           │
│                                                             │
│  Infra:         Crossplane or Terraform Cloud               │
│                 - Database self-service                     │
│                 - Storage self-service                      │
│                                                             │
│  Delivery:      GitHub Actions + ArgoCD                     │
│                 - Environment promotion                     │
│                 - Preview environments                      │
│                                                             │
│  Security:      Vault + OPA + Trivy                         │
│                 - Secrets management                        │
│                 - Policy enforcement                        │
│                 - Container scanning                        │
│                                                             │
│  Observability: Prometheus + Grafana + Jaeger               │
│                 - Standard dashboards                       │
│                 - Distributed tracing                       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Investment: 2-4 FTE
Timeline: 3-6 months
```
Large Team IDP (100+ developers)
```
┌─────────────────────────────────────────────────────────────┐
│                       Large Team IDP                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Portal:        Backstage (full)                            │
│                 - Service catalog + Dependencies            │
│                 - Software templates                        │
│                 - TechDocs                                  │
│                 - Search                                    │
│                 - Plugins ecosystem                         │
│                                                             │
│  Infra:         Crossplane + Custom controllers             │
│                 - Full self-service                         │
│                 - Multi-cloud abstraction                   │
│                 - Custom resources                          │
│                                                             │
│  Delivery:      Multi-tenant CI + ArgoCD + Flagger          │
│                 - Progressive delivery                      │
│                 - Automated promotion                       │
│                 - Feature flags                             │
│                                                             │
│  Security:      Vault + OPA/Kyverno + Falco                 │
│                 - Dynamic secrets                           │
│                 - Runtime security                          │
│                 - Compliance automation                     │
│                                                             │
│  Observability: OpenTelemetry + Vendor (Datadog/NR)         │
│                 - Full observability stack                  │
│                 - Cost tracking                             │
│                 - SLO management                            │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Investment: 5-15 FTE (dedicated platform team)
Timeline: 6-12 months initial, continuous evolution
```
Did You Know?
- Spotify built Backstage because they had 2000+ services and developers couldn’t find anything. The developer portal solved “what services exist?” before solving “how do I deploy?”.
- Platform Engineering is not new at Google. Borg (Kubernetes’ predecessor) had internal tooling that inspired many IDP concepts. The difference now is these patterns are accessible to all organizations.
- The most successful IDPs often start with the portal, not the infrastructure. Discoverability and documentation solve immediate pain; infrastructure abstraction can come later.
- Humanitec’s 2024 IDP benchmarking report found that enterprises with mature IDPs deploy 4x more frequently than peers, while reducing change failure rates by 50%.
War Story: The Million-Dollar Mistake
A company decided to build their IDP. They had budget, ambition, and a two-year timeline.
The Plan (Year 1):
- Build custom Kubernetes controllers
- Create bespoke deployment system
- Design proprietary service mesh
- Implement custom observability
The Reality (Month 6):
```
Progress:
- Custom controllers: 40% complete
- Deployment system: 30% complete
- Service mesh: Research phase
- Observability: Not started
- Developer adoption: 0%

Developers: "When can we use it?"
Platform team: "It's not ready yet"
```
Month 12:
```
Budget: 150% of plan
Timeline: Slipping
Team morale: Low
Developer frustration: High

Meanwhile, competitors shipped features.
```
The Pivot:
New leadership came in and asked: “What if we bought instead of built?”
The New Approach (3 months):
- Portal: Backstage (open source)
- Infra: Crossplane (open source)
- Delivery: ArgoCD (already using)
- Security: Vault (existing)
- Observability: Datadog (paid, but works immediately)
Results:
```
Month 1: Backstage deployed, service catalog live
Month 2: Crossplane database self-service
Month 3: First team fully on platform

Total cost: 70% less than custom build
Time to value: 3 months vs 18+ months
Developer satisfaction: Up 40%
```
What They Learned:
- Build where you differentiate, buy where you don’t: Custom deployment doesn’t give competitive advantage
- Start with open source: Reduce risk, avoid lock-in
- Time to value matters: 80% solution today beats 100% solution never
- Integration > invention: Wiring together good tools is often better than building perfect tools
The Lesson: The best IDP is the one developers actually use. Shipping something imperfect beats building something perfect that never ships.
Build vs Buy Decision Framework
When to Build
```yaml
build_when:
  - "Core differentiator for your business"
  - "No existing solution fits your needs"
  - "You have unique constraints (air-gapped, regulations)"
  - "Long-term strategic investment"
  - "You have the team to maintain it forever"

build_examples:
  - "Custom workflow specific to our domain"
  - "Integration layer between proprietary systems"
  - "Specialized security requirements"
```
When to Buy/Use Open Source
```yaml
buy_when:
  - "Standard capability (CI, monitoring, secrets)"
  - "Time to value is critical"
  - "You lack expertise to build/maintain"
  - "Community momentum is valuable"
  - "Focus engineering on product, not infra"

buy_examples:
  - "Kubernetes (don't build your own orchestrator)"
  - "Observability (Prometheus, Datadog, etc.)"
  - "CI/CD (GitHub Actions, CircleCI)"
  - "Secrets (Vault)"
```
Decision Matrix
| Component | Build? | Buy/OSS? | Rationale |
|---|---|---|---|
| Container orchestration | ❌ | ✅ | K8s is standard |
| Service mesh | ❌ | ✅ | Istio/Linkerd exist |
| Developer portal | Maybe | ✅ | Backstage is solid |
| CI/CD | ❌ | ✅ | Many good options |
| Custom abstractions | ✅ | ❌ | Your specific workflow |
| Observability | ❌ | ✅ | Don’t reinvent metrics |
| Integration layer | ✅ | ❌ | Your systems, your glue |
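The build and buy criteria above can be turned into a first-pass check. The following Python is a simplified sketch, not a complete decision procedure: the `Capability` fields and the `recommend` function are hypothetical names chosen for this example, and collapsing the criteria into four yes/no questions is an assumption that ignores the "Maybe" cases in the matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    core_differentiator: bool      # does it set the business apart?
    existing_solution_fits: bool   # does an OSS or commercial tool cover the need?
    unique_constraints: bool       # e.g. air-gapped or regulated environment
    can_maintain_long_term: bool   # is there a team to own it indefinitely?


def recommend(cap: Capability) -> str:
    """Return "build" or "buy" based on the criteria listed in this section."""
    has_reason_to_build = (
        cap.core_differentiator
        or cap.unique_constraints
        or not cap.existing_solution_fits
    )
    # A reason to build is not enough without a team to maintain the result.
    if has_reason_to_build and cap.can_maintain_long_term:
        return "build"
    return "buy"


ci = Capability("CI/CD", False, True, False, True)
glue = Capability("Integration layer", True, False, False, True)
```

Here `recommend(ci)` returns `"buy"` and `recommend(glue)` returns `"build"`, matching the corresponding rows of the matrix. Treat the result as a conversation starter; a real decision also weighs cost, timing, and team skills.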
The Integration Reality
Even “buying” requires building:
```
What you buy:
├── Backstage
├── ArgoCD
├── Crossplane
├── Vault
└── Prometheus

What you still build:
├── Backstage plugins for your tools
├── ArgoCD ApplicationSets for your workflow
├── Crossplane Compositions for your infra
├── Vault integration with your identity
├── Prometheus rules for your services
└── The glue connecting everything
```
Budget accordingly. As a rule of thumb, acquiring the tools is about 30% of the total cost; integrating them is the other 70%.
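The 30/70 split above is easy to apply as arithmetic. The sketch below assumes the split is only a planning heuristic; the function name `estimate_total_cost` and the 60,000 figure are hypothetical and used purely for illustration.

```python
def estimate_total_cost(tool_cost: float, tool_share_pct: int = 30) -> dict[str, float]:
    """Extrapolate a total budget from tool cost, given the tools' share of it."""
    if not 0 < tool_share_pct < 100:
        raise ValueError("tool_share_pct must be between 1 and 99")
    # If tools are 30% of the total, the total is tool_cost * 100 / 30.
    total = tool_cost * 100 / tool_share_pct
    return {
        "tools": float(tool_cost),
        "integration": total - tool_cost,
        "total": total,
    }


budget = estimate_total_cost(60_000)
```

With 60,000 budgeted for tools, the heuristic implies a total of 200,000, of which 140,000 goes to integration work. If a plan only funds the tool line, this is the gap to raise early.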
IDP Adoption Strategies
Strategy 1: Big Bang (High Risk)
```
Month 0: Build entire IDP
Month 6: Mandatory migration
Month 7: Chaos

Risk: High
Disruption: Maximum
Success rate: Low
When it might work: Startup greenfield, no legacy
```
Strategy 2: Greenfield First (Recommended)
```
Month 1: Build IDP MVP
Month 2: New projects use IDP
Month 3-6: Expand capabilities
Month 6+: Voluntary migration

Risk: Low
Disruption: Minimal
Success rate: High
Why it works: New projects have no legacy, prove value early
```
Strategy 3: Pain Point First
```
Month 1: Solve biggest pain point (e.g., slow CI)
Month 2: Demonstrate value
Month 3: Add next capability
Month 4+: Expand organically

Risk: Low
Disruption: Minimal
Success rate: High
Why it works: Solve real problems, build trust
```
Strategy 4: Team by Team
```
Month 1: Partner with one team
Month 2: Build for their needs
Month 3: Expand to similar team
Month 4+: Word-of-mouth growth

Risk: Low
Disruption: Minimal
Success rate: High
Why it works: Deep understanding, advocates emerge
```
Anti-Patterns
```
❌ Build for 6 months, then mandate adoption
❌ Try to support every use case on day one
❌ Force migration with deadlines
❌ Ignore feedback from early adopters
❌ Compete with teams' existing solutions
```
Common Mistakes
| Mistake | Problem | Solution |
|---|---|---|
| Building everything custom | Slow, expensive, unmaintainable | Buy commodity, build differentiators |
| No integration strategy | Collection of tools, not platform | Plan integration from day one |
| Too much too fast | Overwhelms developers | Start small, iterate |
| Mandatory adoption | Resentment, workarounds | Make it compelling, not required |
| Ignoring existing tools | Waste existing investment | Integrate before replacing |
| Under-investing in portal | Capability exists but hidden | Portal is discovery layer |
Quiz: Check Your Understanding
Question 1
A team wants to build their own CI system. What questions should you ask?
Show Answer
Questions to challenge build decision:
- What’s unique about your needs?
  - “Standard CI” is not unique
  - “Integrate with our proprietary build system” might be
- What exists already?
  - GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.
  - Have you evaluated all options?
- What’s the true cost?
  - Build time + maintenance forever
  - Opportunity cost of not working on product
- What happens when the team changes?
  - Tribal knowledge
  - Bus factor
- Is this a competitive advantage?
  - Probably not
  - Your customers don’t care about your CI
Likely recommendation: Use existing CI, build plugins/integrations for your specific needs.
When custom CI makes sense:
- Extreme scale (Google, Meta)
- Unique security requirements (defense, air-gapped)
- Already exhausted existing options
Question 2
You’re starting an IDP for a 50-person engineering team. What’s your first component?
Show Answer
Recommended first component: Developer Portal (service catalog + docs)
Why portal first:
- Immediate value: Developers can find things
- Low risk: Doesn’t change production systems
- Foundation for everything: Other capabilities surface here
- Quick win: Can ship in weeks
- Builds trust: Team sees platform team delivering value
What to include in MVP portal:
- Service catalog (even if manual entry)
- Documentation aggregation
- Link to existing tools (CI, monitoring)
- Basic search
Common mistake: Starting with infrastructure abstraction
- Higher risk (production impact)
- Longer time to value
- Developers don’t see benefit until complete
Sequence recommendation:
- Portal (visibility)
- CI/CD improvements (existing pain)
- Observability standards (help debugging)
- Infrastructure abstraction (self-service)
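The MVP portal described in this answer (a manually maintained service catalog with basic search) needs very little machinery to get started. The Python below is a minimal in-memory sketch, not how Backstage or any other portal stores its data: `Service`, `Catalog`, and the example URLs are hypothetical, and the service names are borrowed from the catalog example earlier in the module.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    name: str
    owner: str
    docs_url: str
    depends_on: tuple[str, ...] = ()


@dataclass
class Catalog:
    """In-memory catalog; entries are registered by hand."""

    services: dict[str, Service] = field(default_factory=dict)

    def register(self, service: Service) -> None:
        self.services[service.name] = service

    def search(self, term: str) -> list[Service]:
        # Case-insensitive substring match on service name or owning team.
        needle = term.lower()
        return [
            s for s in self.services.values()
            if needle in s.name.lower() or needle in s.owner.lower()
        ]

    def dependents_of(self, name: str) -> list[str]:
        # Which services declare a dependency on `name`?
        return sorted(s.name for s in self.services.values() if name in s.depends_on)


catalog = Catalog()
catalog.register(Service("order-service", "team-orders",
                         "https://docs.example.com/orders",
                         depends_on=("inventory-service",)))
catalog.register(Service("inventory-service", "team-inventory",
                         "https://docs.example.com/inventory"))
```

Even this much answers "who owns this?" and "what breaks if this goes down?", which is the visibility the portal-first sequence is meant to deliver before any infrastructure is touched.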
Question 3
How do you measure IDP success?
Show Answer
IDP Success Metrics (by category):
Adoption
```yaml
adoption_metrics:
  - Percent of teams using platform
  - New projects on platform vs off
  - Voluntary migration rate
  - Portal daily active users
```
Developer Experience
```yaml
devex_metrics:
  - Developer satisfaction (NPS/CSAT)
  - Time to first deployment
  - Onboarding time
  - Support tickets per developer
```
Efficiency
```yaml
efficiency_metrics:
  - Deployment frequency
  - Lead time for changes
  - Self-service rate (vs tickets)
  - Time to provision resources
```
Quality
```yaml
quality_metrics:
  - Change failure rate
  - Mean time to recovery
  - Security incidents
  - Compliance audit results
```
Business Impact
```yaml
business_metrics:
  - Time to market for features
  - Engineering velocity
  - Reduced duplicated effort
  - Cost per deployment
```
Key principle: Track multiple dimensions. High adoption with low satisfaction is a warning sign.
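A few of these metrics can be computed from simple counts, and the key principle can be encoded as a check. This Python sketch rests on assumptions: the `PlatformSnapshot` fields are hypothetical, and the 70% and 50% thresholds are illustrative defaults, not values from this module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformSnapshot:
    teams_total: int
    teams_on_platform: int
    requests_self_service: int   # resource requests fulfilled without a ticket
    requests_via_ticket: int
    satisfaction_pct: float      # share of surveyed developers who are satisfied


def adoption_pct(s: PlatformSnapshot) -> float:
    return 100 * s.teams_on_platform / s.teams_total if s.teams_total else 0.0


def self_service_pct(s: PlatformSnapshot) -> float:
    total = s.requests_self_service + s.requests_via_ticket
    return 100 * s.requests_self_service / total if total else 0.0


def warnings(s: PlatformSnapshot,
             high_adoption: float = 70,
             low_satisfaction: float = 50) -> list[str]:
    """Flag metric combinations that deserve a closer look."""
    found = []
    if adoption_pct(s) >= high_adoption and s.satisfaction_pct < low_satisfaction:
        found.append("high adoption with low satisfaction: check for forced adoption")
    return found


snapshot = PlatformSnapshot(teams_total=10, teams_on_platform=8,
                            requests_self_service=30, requests_via_ticket=10,
                            satisfaction_pct=40)
```

For this snapshot, adoption is 80% and the self-service rate is 75%, yet `warnings(snapshot)` still returns one entry because satisfaction is 40%. Looking at any single number here would hide the problem.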
Question 4
Your IDP has 80% adoption but 40% satisfaction. What’s wrong?
Show Answer
This indicates forced adoption, not genuine value.
Likely causes:
- Mandated migration
  - Teams forced to use platform
  - Not because it’s better
- Missing capabilities
  - Platform lacks what teams need
  - Workarounds required
- Poor developer experience
  - Hard to use
  - Poor documentation
  - Slow response
- One-size-doesn’t-fit-all
  - Works for some teams, not others
  - Insufficient customization
Investigation steps:
- Survey dissatisfied users
  - “What’s frustrating?”
  - “What would you do differently?”
  - “What’s missing?”
- Watch developers use it
  - Observe friction points
  - Note workarounds
- Compare satisfied vs dissatisfied
  - Different team types?
  - Different use cases?
Fixes:
- Address top pain points
- Add escape hatches / flexibility
- Improve documentation
- Consider making parts optional
- Invest in onboarding
Target state: High adoption AND high satisfaction. If you can’t have both, focus on satisfaction—adoption follows value.
Hands-On Exercise: IDP Architecture Design
Design an IDP for your organization.
Part 1: Current State
```markdown
## Current State Assessment

### Team Size & Structure
- Total developers: ___
- Teams: ___
- Services/apps: ___
- Deployment frequency: ___

### Current Tooling
| Component | Current Tool(s) | Satisfaction |
|-----------|-----------------|--------------|
| Code | | /5 |
| CI/CD | | /5 |
| Infrastructure | | /5 |
| Monitoring | | /5 |
| Secrets | | /5 |
| Documentation | | /5 |

### Top Pain Points
1. _________________
2. _________________
3. _________________
```
Part 2: IDP Design
```markdown
## IDP Architecture

### Component Selection
| Component | Tool Choice | Build/Buy | Rationale |
|-----------|-------------|-----------|-----------|
| Portal | | | |
| Infra | | | |
| Delivery | | | |
| Security | | | |
| Observability | | | |

### Integration Points
How will components connect?

[Portal] ←→ [CI/CD] ←→ [GitOps]
    ↓          ↓           ↓
 [Docs]    [Secrets]    [Infra]
    ↓          ↓           ↓
         [Observability]

### Abstraction Layer
What will developers interact with?
```
```yaml
# Developer-facing resource
apiVersion: platform.yourcompany.com/v1
kind: Application
spec:
  # What fields do developers set?
  # What does platform handle?
```
Part 3: Roadmap
```markdown
## IDP Roadmap

### Phase 1: Foundation (Months 1-3)
Goal: _________________

Deliverables:
- [ ] _________________
- [ ] _________________
- [ ] _________________

Success criteria:
- _________________

### Phase 2: Core Capabilities (Months 4-6)
Goal: _________________

Deliverables:
- [ ] _________________
- [ ] _________________
- [ ] _________________

Success criteria:
- _________________

### Phase 3: Scale (Months 7-12)
Goal: _________________

Deliverables:
- [ ] _________________
- [ ] _________________
- [ ] _________________

Success criteria:
- _________________
```
Success Criteria
- Assessed current state with tooling inventory
- Selected components with build/buy rationale
- Designed integration approach
- Created phased roadmap
- Defined success metrics
Key Takeaways
- Five pillars: Portal, Infrastructure, Delivery, Security, Observability
- Buy commodity, build differentiators: Don’t reinvent CI/CD
- Start small: Portal + one pain point, not everything at once
- Integration > invention: Connecting tools is the real work
- Adoption follows value: Make it compelling, not mandatory
Further Reading
Books:
- “Platform Engineering on Kubernetes” — Mauricio Salatino
- “Cloud Native Infrastructure” — Justin Garrison, Kris Nova
Architecture References:
- CNCF Platforms White Paper — CNCF Platforms Working Group
- Backstage.io — Architecture and plugins
- Crossplane — Compositions and providers
Case Studies:
- Spotify’s Backstage — Why they built it
- Zalando’s Platform — Scale lessons
- Airbnb’s Platform — Evolution story
Summary
Internal Developer Platforms consist of five pillars:
- Developer Portal: Front door, discovery, templates
- Infrastructure Orchestration: Self-service compute, storage, databases
- Application Delivery: CI/CD, GitOps, environments
- Security & Governance: Secrets, policies, compliance
- Observability: Metrics, logs, traces, alerts
Key decisions:
- Build vs Buy: Buy commodity, build differentiators
- Start small: Portal and one pain point
- Integrate existing tools: Don’t replace unnecessarily
- Adopt incrementally: Voluntary adoption over mandates
The best IDP is invisible—developers just get their work done.
Next Module
Continue to Module 2.4: Golden Paths to learn how to design opinionated workflows that guide developers toward success.
“The platform should be a product so good that developers choose it, not a mandate so strict they can’t avoid it.” — IDP Wisdom