What stage are you at, and what should you do next?
A self-assessment across five engineering-specific dimensions. Takes 3 minutes. Produces a stage-appropriate investment recommendation.
Most AI adoption maturity models were designed for strategy consultants, not engineering leaders making quarterly investment decisions. They tell you what "mature" looks like but not what to do on Monday.
This assessment adapts the SEI/CMU AI Adoption Maturity Model to an engineering-specific context. Five dimensions. Five levels. Each dimension maps directly to an infrastructure investment, which in turn maps to a concrete action item for your platform team.
The goal is not to get a good score. The goal is to find your real level on each dimension and understand which one is your current bottleneck.
Self-assessment
Score your org across 5 dimensions
AI Engineering Maturity Assessment
Maturity model
Five stages, engineering specific
These aren't just descriptions. Each stage has a characteristic failure mode and a high-leverage next investment.
Level 1: Tool Adoption
Engineers use AI coding tools individually. No shared workflow, no consistent context, no coordination infrastructure.
"We bought seats for everyone."
Level 2: Team Workflows
Some teams have structured AI workflows. Inconsistent across the org. Early context files, informal governance.
"Some teams are doing well. Others not so much."
Level 3: Governed Execution
Org-wide policies exist. Governance roles are defined. Verification gates in CI. Context is managed at the team level.
"We have a policy. We enforce it in CI."
Level 4: Orchestrated Delivery
Multi-agent workflows are coordinated. Shared context layer. Org-wide observability. Can trace any change from request to production.
"We can trace any AI-generated change end to end."
Level 5: Adaptive Autonomy
Risk-calibrated automation. System learns from outcomes. Humans involved only at high-judgment decision points.
"The system improves itself from production data."
Stage-appropriate investments
What to build next, by level
Most orgs try to build Level 4 infrastructure when they're at Level 2. That's backwards. Match the investment to the actual bottleneck.
From Tool Adoption to Team Workflows
2–4 weeks
High-leverage investments
Create a shared CLAUDE.md or AI context file, one for the whole eng team, not per engineer
Define your first explicit AI workflow: what gets AI assistance, what doesn't, how it's tagged
Add basic PR labeling for AI-generated code so you can start measuring
Run one team retro specifically about AI coordination friction
Don't yet
Don't invest in governance tooling yet. You don't have enough signal to know what to govern.
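A first explicit workflow can be as small as a lookup table. The sketch below shows one possible shape: which task types may use AI assistance and which label the PR gets when they do. The task types, the `ai-assisted` label, and the names `WorkflowRule`, `WORKFLOW`, and `labels_for` are all hypothetical placeholders, not part of any existing tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowRule:
    ai_allowed: bool        # may engineers use AI assistance for this task type?
    pr_label: str | None    # label applied to the PR when AI assistance was used


# Hypothetical task types and labels; replace with your team's own.
WORKFLOW = {
    "unit_tests": WorkflowRule(ai_allowed=True, pr_label="ai-assisted"),
    "refactor": WorkflowRule(ai_allowed=True, pr_label="ai-assisted"),
    "auth_changes": WorkflowRule(ai_allowed=False, pr_label=None),
}


def labels_for(task_type: str, used_ai: bool) -> list[str]:
    """Return the PR labels for a task, or raise if the workflow forbids AI here."""
    rule = WORKFLOW.get(task_type)
    if rule is None:
        raise KeyError(f"no workflow rule defined for task type {task_type!r}")
    if used_ai and not rule.ai_allowed:
        raise ValueError(f"AI assistance is not allowed for {task_type!r}")
    return [rule.pr_label] if used_ai and rule.pr_label else []
```

Keeping the table in the repository next to the shared context file means the label that drives measurement and the rule that engineers read stay in one place.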
From Team Workflows to Governed Execution
4–8 weeks
High-leverage investments
Write your first governance policy: which change types require review, which can auto-merge
Implement risk classification in CI: route AI PRs touching critical services to senior reviewers
Build an eval harness: deterministic checks (types, lint, tests) on AI-generated PRs before they hit the review queue
Define your org's decision tiers: what decisions stay human-owned
Don't yet
Don't try to automate review decisions you can't consistently make manually. Codify what you already know, not what you wish you knew.
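The first two investments above can start as one small function that CI calls for each PR. This is a minimal sketch under stated assumptions: the path patterns, the auto-merge change types, and the three route names are hypothetical, and a real policy would come from what your reviewers already decide by hand.

```python
from fnmatch import fnmatch

# Hypothetical path patterns that mark a service as critical.
CRITICAL_PATTERNS = ["services/payments/*", "services/auth/*", "infra/*"]

# Hypothetical change types the written policy allows to merge without review.
AUTO_MERGE_TYPES = {"docs", "dependency_bump"}


def classify(changed_files, change_type, ai_generated):
    """Return a review route: 'senior_review', 'standard_review', or 'auto_merge'."""
    touches_critical = any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in CRITICAL_PATTERNS
    )
    if touches_critical:
        # AI-generated PRs touching critical services go to senior reviewers.
        return "senior_review" if ai_generated else "standard_review"
    if change_type in AUTO_MERGE_TYPES:
        return "auto_merge"
    return "standard_review"
```

For example, `classify(["services/payments/api/handler.py"], "feature", ai_generated=True)` returns `"senior_review"`, while a docs-only change returns `"auto_merge"`.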
From Governed Execution to Orchestrated Delivery
8–16 weeks
High-leverage investments
Build or deploy a shared organizational context layer, one that all agent sessions can access
Define structured multi-agent workflows for your highest-volume task types
Implement end-to-end observability: trace from ticket to production with AI attribution
Add LLM-as-judge scoring on top of deterministic eval gates
Don't yet
Don't try to coordinate all agent work simultaneously. Pick one workflow, get it right, then replicate.
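Layering LLM-as-judge scoring on top of deterministic gates could look roughly like the sketch below. The `judge` argument is a stand-in for whatever model call you use and is assumed to return a score between 0 and 1. The `0.7` threshold and the three outcome names are illustrative assumptions.

```python
from typing import Callable


def layered_gate(
    deterministic_passed: bool,
    diff: str,
    judge: Callable[[str], float],
    min_score: float = 0.7,  # hypothetical threshold; tune against your own data
) -> str:
    """Run the judge only after the deterministic checks have passed."""
    if not deterministic_passed:
        # Failing types, lint, or tests is a hard stop; no judge call is made.
        return "reject"
    score = judge(diff)
    return "ready_for_review" if score >= min_score else "needs_rework"
```

Putting the deterministic result first keeps the judge advisory: it can hold a PR back for rework, but it can never override a failed check.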
From Orchestrated Delivery to Adaptive Autonomy
6–12 months
High-leverage investments
Build feedback loops: production outcomes feed back into context layer and governance policy
Implement risk-calibrated autonomy: set the autonomy level dynamically from each change's risk score
Add anomaly detection on governance health metrics
Define your human authority boundary precisely: what decisions stay human-owned at every stage gate
Don't yet
Level 5 without strong Level 4 foundations is theater. Make sure you can trace every change before you automate more of them.
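As a rough illustration of risk-calibrated autonomy, the sketch below maps a change's risk score to an autonomy level. It assumes a risk score normalized to the range 0 to 1. The thresholds and level names are hypothetical, and the thresholds are parameters so that a feedback loop could tighten or relax them over time.

```python
def autonomy_level(
    risk_score: float,
    auto_below: float = 0.3,  # hypothetical: below this, the agent may merge
    human_at: float = 0.7,    # hypothetical: at or above this, a human owns the change
) -> str:
    """Map a change risk score in [0, 1] to an autonomy level."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score < auto_below:
        return "agent_merges"
    if risk_score < human_at:
        return "agent_proposes_human_approves"
    return "human_owned"
```

Lowering `auto_below` after a run of incidents is one simple way production outcomes could feed back into policy.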
Leader artifact
Questions to hand to your platform team
These are the questions that will surface the real state of your AI engineering infrastructure. Not the state you think you're in, but the state that shows up in the data. Bring the answers to your next platform review.
Context
Context architecture
Can you show me one place where all agents get the same organizational context? If I asked our frontend and platform agents to both create an API endpoint, would they follow the same conventions?
Governance
Policy infrastructure
Is our AI governance policy written down? Is it enforced programmatically, or does it depend on every engineer remembering to apply it? What's the failure mode if someone doesn't follow it?
Verification
Eval harness
What validation happens on an AI-generated PR before a human sees it? Is there anything that would catch a subtle architectural mistake before it enters the review queue?
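One concrete answer to the first question is a deterministic gate that runs before a PR enters the review queue. The sketch below is a minimal version that runs a set of commands and records pass or fail for each. The function names are hypothetical, and the example tools in the comment (mypy, ruff, pytest) are an assumption about your stack.

```python
import subprocess


def run_checks(checks):
    """Run each named command; a check passes when its exit code is 0."""
    results = {}
    for name, command in checks.items():
        try:
            completed = subprocess.run(command, capture_output=True, text=True)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # A missing tool counts as a failed check rather than a silent skip.
            results[name] = False
    return results


def gate(results):
    """The PR reaches the review queue only if every check passed."""
    return all(results.values())


# Example configuration, assuming these tools are installed in CI:
# checks = {
#     "types": ["mypy", "."],
#     "lint": ["ruff", "check", "."],
#     "tests": ["pytest", "-q"],
# }
```

Checks like these catch mechanical errors. They will not catch a subtle architectural mistake, which is the gap the second question is probing.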
Coordination
Workflow structure
If two engineers' agents both started working on overlapping parts of the codebase right now, when would we find out? At merge time? At code generation time? Or before the agents start?
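Finding out "before the agents start" implies some record of what each active session intends to touch. The sketch below assumes such a record exists as a mapping from session name to a set of file paths. The session names, the mapping, and `find_conflicts` are hypothetical.

```python
def find_conflicts(active_claims, new_paths):
    """Return, per active session, the paths it shares with a proposed task."""
    conflicts = {}
    for session, paths in active_claims.items():
        shared = paths & new_paths
        if shared:
            conflicts[session] = shared
    return conflicts


# Hypothetical claims registered by two running agent sessions.
active = {
    "alice-agent": {"api/users.py", "api/schema.py"},
    "bob-agent": {"web/profile.tsx"},
}
proposed = {"api/schema.py", "api/orders.py"}
overlap = find_conflicts(active, proposed)  # only alice-agent shares a path
```

An empty result means the new task can start; a non-empty one surfaces the overlap at planning time rather than at merge time.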
Observability
Attribution and tracing
Can you show me the last 10 changes that went to production and tell me which ones were AI-generated, who reviewed them, and whether any of them caused incidents?
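Answering this question requires one record per production change that carries attribution, reviewers, and linked incidents. The sketch below shows one possible record shape and the query over it. The `Change` fields and `recent_change_report` are hypothetical, not an existing schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Change:
    change_id: str
    deployed_at: datetime
    ai_generated: bool
    reviewers: list[str]
    incident_ids: list[str] = field(default_factory=list)


def recent_change_report(changes, limit=10):
    """Summarize the most recently deployed changes, newest first."""
    latest = sorted(changes, key=lambda c: c.deployed_at, reverse=True)[:limit]
    return [
        {
            "change_id": c.change_id,
            "ai_generated": c.ai_generated,
            "reviewers": c.reviewers,
            "caused_incident": bool(c.incident_ids),
        }
        for c in latest
    ]
```

If any of these fields cannot be filled in from existing systems today, that missing field is the observability gap to close first.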
Metrics
Baseline data
What's our AI-generated PR acceptance rate? What's our review queue depth trend over the last 6 months? Can we correlate AI adoption level with delivery frequency by team?
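The acceptance-rate question needs a definition before it can be answered. The sketch below assumes one: among AI-generated PRs that have been resolved, the fraction that were merged rather than closed unmerged. The PR record fields (`ai_generated`, `state`) and the state names are hypothetical.

```python
def acceptance_rate(prs):
    """Merged share of resolved AI-generated PRs; None if none are resolved."""
    resolved = [
        pr for pr in prs
        if pr["ai_generated"] and pr["state"] in {"merged", "closed"}
    ]
    if not resolved:
        return None  # no baseline yet; avoid reporting a misleading 0%
    merged = sum(1 for pr in resolved if pr["state"] == "merged")
    return merged / len(resolved)
```

Open PRs are excluded so the rate does not drop every time a batch of new PRs is opened.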
A note on this model
Why levels aren't linear targets
Maturity models create a pull toward Level 5 as the obvious goal. That's not the right frame. A 30-engineer startup doesn't need Level 4 orchestration infrastructure. A heavily regulated enterprise can't operate at Level 5 without significant governance investment first.
The right question is: "What's the highest-leverage bottleneck at my current level?" If you're at Level 1 on governance and Level 3 on verification, adding more verification infrastructure won't move your delivery metrics. Governance is the constraint.
Find your lowest-scoring dimension. That's where to invest next. The module index on the hub page maps each dimension to the relevant playbook modules.
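The "lowest-scoring dimension" rule is simple enough to sketch. The dimension names below are assumptions borrowed from the question categories earlier on this page, and the scores are made-up examples on the 1 to 5 scale.

```python
def bottleneck(scores):
    """Return the dimension(s) with the lowest level; ties are all returned."""
    for dimension, level in scores.items():
        if not 1 <= level <= 5:
            raise ValueError(f"{dimension}: level must be between 1 and 5")
    lowest = min(scores.values())
    return [dimension for dimension, level in scores.items() if level == lowest]


# Hypothetical scores: Level 1 on governance, Level 3 on verification.
scores = {
    "context": 2,
    "governance": 1,
    "verification": 3,
    "coordination": 2,
    "observability": 2,
}
next_investment = bottleneck(scores)  # governance is the constraint here
```

Returning every tied dimension keeps the choice between equally weak areas with the people who know the organization.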
Go deeper
Related reading
Blog article
Adaptive Autonomy: Why Fully Autonomous AI Is the Wrong Goal
CO-BUILD PROGRAM
From playbook to production
We work directly with engineering leaders who are making this transition now. You bring the real constraints; we help you build the coordination layer around them.