MODULE 01 · 15 min read · June 2026

What stage are you at, and what should you do next?

A self-assessment across five engineering-specific dimensions. Takes 3 minutes. Produces a stage-appropriate investment recommendation.

Most AI adoption maturity models were designed for strategy consultants, not engineering leaders making quarterly investment decisions. They tell you what "mature" looks like but not what to do on Monday.

This assessment adapts the SEI/CMU AI Adoption Maturity Model to an engineering-specific context. Five dimensions. Five levels. Each dimension maps directly to an infrastructure investment, which maps to a concrete action item for your platform team.

The goal is not to get a good score. The goal is to find your real level on each dimension and understand which one is your current bottleneck.

LEADER TAKEAWAY

Do this exercise twice: once yourself, and once with your platform lead. The gaps between those two scores are the most informative data points.
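One way to make that comparison concrete is to record both sets of scores and sort the dimensions by disagreement. This is a minimal sketch, not part of the assessment itself: the dimension names are assumptions borrowed from the question categories later in this module, and the scores are made-up sample values.

```python
# Hypothetical dimension names and sample scores (1-5); replace with your own.
leader_scores = {"context": 3, "governance": 2, "verification": 3,
                 "coordination": 2, "observability": 4}
platform_lead_scores = {"context": 2, "governance": 2, "verification": 3,
                        "coordination": 1, "observability": 2}


def score_gaps(a: dict[str, int], b: dict[str, int]) -> list[tuple[str, int]]:
    """Return (dimension, a - b) pairs, largest absolute disagreement first."""
    gaps = [(dim, a[dim] - b[dim]) for dim in a]
    return sorted(gaps, key=lambda pair: abs(pair[1]), reverse=True)


for dimension, gap in score_gaps(leader_scores, platform_lead_scores):
    print(f"{dimension:<14} gap {gap:+d}")
```

A positive gap means the first scorer rated the dimension higher than the second. The dimensions at the top of the list are the ones worth talking through first.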

Self-assessment

Score your org across 5 dimensions

AI Engineering Maturity Assessment

Score yourself on 5 dimensions. Takes about 3 minutes.

How do your agents get organizational knowledge?

Maturity model

Five stages, engineering-specific

These aren't just descriptions. Each stage has a characteristic failure mode and a high-leverage next investment.

| Level | What it looks like | Leader signal |
| --- | --- | --- |
| 1. Tool Adoption | Engineers use AI coding tools individually. No shared workflow, no consistent context, no coordination infrastructure. | "We bought seats for everyone." |
| 2. Team Workflows | Some teams have structured AI workflows. Inconsistent across the org. Early context files, informal governance. | "Some teams are doing well. Others not so much." |
| 3. Governed Execution | Org-wide policies exist. Governance roles are defined. Verification gates in CI. Context is managed at the team level. | "We have a policy. We enforce it in CI." |
| 4. Orchestrated Delivery | Multi-agent workflows are coordinated. Shared context layer. Org-wide observability. Can trace any change from request to production. | "We can trace any AI-generated change end to end." |
| 5. Adaptive Autonomy | Risk-calibrated automation. System learns from outcomes. Humans involved only at high-judgment decision points. | "The system improves itself from production data." |

Stage-appropriate investments

What to build next, by level

Most orgs try to build Level 4 infrastructure when they're at Level 2. That's backwards. Match the investment to the actual bottleneck.

Level 1 → 2

From Tool Adoption to Team Workflows

2–4 weeks

High-leverage investments

Create a shared CLAUDE.md or AI context file, one for the whole eng team, not per engineer
Define your first explicit AI workflow: what gets AI assistance, what doesn't, how it's tagged
Add basic PR labeling for AI-generated code so you can start measuring
Run one team retro specifically about AI coordination friction

Don't yet

Don't invest in governance tooling yet. You don't have enough signal to know what to govern.
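Once PRs carry a label, measurement can start as simple counting. The sketch below assumes PR records have already been exported as plain dictionaries. The label name `ai-generated`, the field names, and the sample data are all hypothetical.

```python
from collections import Counter

AI_LABEL = "ai-generated"  # hypothetical label name; use whatever your team agrees on

# Sample records standing in for an export from your code host (made-up data).
pull_requests = [
    {"id": 101, "team": "payments", "labels": ["ai-generated"]},
    {"id": 102, "team": "payments", "labels": []},
    {"id": 103, "team": "search", "labels": ["ai-generated", "bugfix"]},
    {"id": 104, "team": "search", "labels": ["ai-generated"]},
]


def ai_share_by_team(prs):
    """Fraction of each team's PRs that carry the AI label."""
    total, ai = Counter(), Counter()
    for pr in prs:
        total[pr["team"]] += 1
        if AI_LABEL in pr["labels"]:
            ai[pr["team"]] += 1
    return {team: ai[team] / total[team] for team in total}


print(ai_share_by_team(pull_requests))
```

At this stage the number itself matters less than having a consistent label to count.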

Level 2 → 3

From Team Workflows to Governed Execution

4–8 weeks

High-leverage investments

Write your first governance policy: which change types require review, which can auto-merge
Implement risk classification in CI: route AI PRs touching critical services to senior reviewers
Build an eval harness: deterministic checks (types, lint, tests) on AI-generated PRs before they hit the review queue
Define your org's decision tiers: what decisions stay human-owned

Don't yet

Don't try to automate review decisions you can't consistently make manually. Codify what you already know, not what you wish you knew.
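Risk classification in CI can start as a small routing function that encodes rules your reviewers already apply by hand. The following is a sketch under stated assumptions: the path patterns, the route names, and the docs-only auto-merge rule are hypothetical placeholders for your own policy.

```python
from fnmatch import fnmatch

# Hypothetical path patterns for critical services; adjust to your repo layout.
CRITICAL_PATTERNS = ["services/payments/*", "services/auth/*", "infra/*"]


def route_pr(changed_files, ai_generated):
    """Return a review route for a PR based on what it touches and who wrote it."""
    touches_critical = any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in CRITICAL_PATTERNS
    )
    # AI-generated changes to critical services go to senior reviewers.
    if ai_generated and touches_critical:
        return "senior-review"
    # Example policy only: documentation-only changes may auto-merge.
    if changed_files and all(path.endswith(".md") for path in changed_files):
        return "auto-merge-eligible"
    return "standard-review"


print(route_pr(["services/payments/api/handler.py"], ai_generated=True))
print(route_pr(["docs/setup.md"], ai_generated=True))
```

Each branch should correspond to a decision your team can already make consistently by hand. If a rule can't be stated this plainly, it isn't ready to automate.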

Level 3 → 4

From Governed Execution to Orchestrated Delivery

8–16 weeks

High-leverage investments

Build or deploy a shared organizational context layer, one that all agent sessions can access
Define structured multi-agent workflows for your highest-volume task types
Implement end-to-end observability: trace from ticket to production with AI attribution
Add LLM-as-judge scoring on top of deterministic eval gates

Don't yet

Don't try to coordinate all agent work simultaneously. Pick one workflow, get it right, then replicate.
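Layering LLM-as-judge scoring on top of deterministic gates usually means the judge runs only after the cheap checks pass. Here is a minimal sketch of that ordering. The `judge` callable stands in for whatever model call you use, and the 0.7 threshold and result strings are arbitrary assumptions.

```python
from typing import Callable


def evaluate_pr(
    deterministic_checks: dict[str, bool],
    judge: Callable[[str], float],
    diff: str,
    min_judge_score: float = 0.7,  # arbitrary example threshold
) -> str:
    """Run deterministic gates first; consult the judge only if they all pass."""
    failed = [name for name, passed in deterministic_checks.items() if not passed]
    if failed:
        return "rejected: " + ", ".join(failed)
    score = judge(diff)
    if score < min_judge_score:
        return f"needs-human-review: judge score {score:.2f}"
    return "ready-for-review"


def stub_judge(diff: str) -> float:
    """Placeholder for a real model call; always returns the same score."""
    return 0.9


checks = {"types": True, "lint": True, "tests": True}
print(evaluate_pr(checks, stub_judge, diff="example diff"))
```

Putting the deterministic gates first keeps judge cost down and means a failed type check is never overridden by a favorable model score.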

Level 4 → 5

From Orchestrated Delivery to Adaptive Autonomy

6–12 months

High-leverage investments

Build feedback loops: production outcomes feed back into context layer and governance policy
Implement risk-calibrated autonomy: set the autonomy level dynamically from each change's risk score
Add anomaly detection on governance health metrics
Define your human authority boundary precisely: what decisions stay human-owned at every stage gate

Don't yet

Level 5 without strong Level 4 foundations is theater. Make sure you can trace every change before you automate more of them.
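Risk-calibrated autonomy can be pictured as a mapping from a change's risk score to how much the agent may do on its own, with traceability as a precondition for full automation. The thresholds, the level names, and the 0-to-1 score range below are illustrative assumptions, not recommended values.

```python
def autonomy_level(risk_score: float, traceable: bool) -> str:
    """Map a risk score in [0, 1] to an autonomy level (example thresholds only)."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= 0.6:
        return "human-owned"
    # Without end-to-end tracing, never auto-merge, however low the risk.
    if risk_score < 0.2 and traceable:
        return "auto-merge"
    return "agent-proposes-human-approves"


for score in (0.1, 0.4, 0.9):
    print(score, autonomy_level(score, traceable=True))
```

The `traceable` flag encodes the warning above: a low-risk change that can't be traced still gets a human approval step.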

Leader artifact

Questions to hand to your platform team

These are the questions that will surface the real state of your AI engineering infrastructure. Not the state you think you're in, but the state that shows up in the data. Bring the answers to your next platform review.

Context

Context architecture

Can you show me one place where all agents get the same organizational context? If I asked our frontend and platform agents to both create an API endpoint, would they follow the same conventions?

Governance

Policy infrastructure

Is our AI governance policy written down? Is it enforced programmatically, or does it depend on every engineer remembering to apply it? What's the failure mode if someone doesn't follow it?

Verification

Eval harness

What validation happens on an AI-generated PR before a human sees it? Is there anything that would catch a subtle architectural mistake before it enters the review queue?

Coordination

Workflow structure

If two engineers' agents both started working on overlapping parts of the codebase right now, when would we find out? At merge time? At code generation time? Or before the agents start?
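Finding out "before the agents start" implies that each agent session declares the paths it intends to touch, and something checks those declarations for overlap. A rough sketch of that check follows. The session names, paths, and the idea of an up-front claim registry are assumptions made for illustration.

```python
def overlapping_claims(claims: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each path claimed by more than one session to the sessions claiming it."""
    owners: dict[str, list[str]] = {}
    for session, paths in claims.items():
        for path in paths:
            owners.setdefault(path, []).append(session)
    return {
        path: sorted(sessions)
        for path, sessions in owners.items()
        if len(sessions) > 1
    }


# Hypothetical declarations made by two agent sessions before they begin work.
claims = {
    "session-a": {"api/users.py", "api/auth.py"},
    "session-b": {"api/auth.py", "web/login.tsx"},
}
print(overlapping_claims(claims))
```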

Observability

Attribution and tracing

Can you show me the last 10 changes that went to production and tell me which ones were AI-generated, who reviewed them, and whether any of them caused incidents?
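Answering that question requires a change record that carries attribution, reviewer, and incident linkage in one place. The sketch below shows one possible shape for such a record and a report over it. The field names and sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Change:
    change_id: str
    ticket: str
    ai_generated: bool
    reviewer: str
    caused_incident: bool


def attribution_report(changes, limit=10):
    """Summarize the most recent `limit` changes (list assumed oldest to newest)."""
    recent = changes[-limit:]
    ai = [c for c in recent if c.ai_generated]
    return {
        "total": len(recent),
        "ai_generated": [c.change_id for c in ai],
        "ai_with_incident": [c.change_id for c in ai if c.caused_incident],
        "reviewers": {c.change_id: c.reviewer for c in ai},
    }


# Made-up sample data.
changes = [
    Change("c1", "TICKET-1", False, "dana", False),
    Change("c2", "TICKET-2", True, "lee", True),
    Change("c3", "TICKET-3", True, "dana", False),
]
print(attribution_report(changes))
```

If any of these fields can't be filled in from existing systems, that gap is the finding.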

Metrics

Baseline data

What's our AI-generated PR acceptance rate? What's our review queue depth trend over the last 6 months? Can we correlate AI adoption level with delivery frequency by team?
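"Acceptance rate" needs a definition before it can be a baseline. The sketch below assumes one: merged AI-labeled PRs divided by all closed AI-labeled PRs, ignoring PRs that are still open. That definition, the field names, and the sample data are assumptions; your team may define it differently.

```python
def acceptance_rate(prs):
    """Merged AI PRs / closed AI PRs; returns None when no AI PR has closed yet."""
    closed_ai = [pr for pr in prs if pr["ai_generated"] and pr["state"] != "open"]
    if not closed_ai:
        return None
    merged = sum(1 for pr in closed_ai if pr["state"] == "merged")
    return merged / len(closed_ai)


# Made-up sample data; "state" is one of "open", "merged", "closed".
sample_prs = [
    {"id": 1, "ai_generated": True, "state": "merged"},
    {"id": 2, "ai_generated": True, "state": "closed"},
    {"id": 3, "ai_generated": True, "state": "merged"},
    {"id": 4, "ai_generated": True, "state": "open"},
    {"id": 5, "ai_generated": False, "state": "merged"},
]
print(acceptance_rate(sample_prs))
```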

WATCH OUT

If your platform team can't answer most of these, that's your most important finding. The inability to answer is the evidence that you're at a lower maturity level than you assumed.

A note on this model

Why levels aren't linear targets

Maturity models create a pull toward Level 5 as the obvious goal. That's not the right frame. A 30-engineer startup doesn't need Level 4 orchestration infrastructure. A heavily regulated enterprise can't operate at Level 5 without significant governance investment first.

The right question is: "What's the highest-leverage bottleneck at my current level?" If you're at Level 1 on governance and Level 3 on verification, adding more verification infrastructure won't move your delivery metrics. Governance is the constraint.

Find your lowest-scoring dimension. That's where to invest next. The module index on the hub page maps each dimension to the relevant playbook modules.
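As a small illustration of that rule, here is a sketch that picks the lowest-scoring dimension or dimensions from a set of scores. The governance and verification values mirror the example above. The other dimension names and scores are hypothetical.

```python
def bottlenecks(scores: dict[str, int]) -> list[str]:
    """Return every dimension tied for the lowest score."""
    lowest = min(scores.values())
    return [dimension for dimension, score in scores.items() if score == lowest]


# Governance at Level 1 and verification at Level 3, as in the example above;
# the remaining entries are made-up.
scores = {"context": 2, "governance": 1, "verification": 3,
          "coordination": 2, "observability": 2}
print(bottlenecks(scores))
```

Ties are returned together rather than broken arbitrarily, since choosing between two equally weak dimensions is a judgment call for you and your platform lead.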

Go deeper

Where to go from here

Spec-Driven Development

How intent definition anchors everything agents build.

Harness Engineering

Building the verification and governance infrastructure.

The 90-Day Plan

Stage-appropriate action plan for your maturity level.


CO-BUILD PROGRAM

From playbook to production

We work directly with engineering leaders who are making this transition now. You bring the real constraints; we help you build the coordination layer around them.
