How do you coordinate multiple agents and teams without the overhead killing the gains?
Agentic engineering is not vibe coding at scale. It's a coordination problem requiring deliberate operating model design: team topology, decision authority, and agent fleet structure.
Andrej Karpathy's "agentic engineering" framing from Sequoia's AI Ascent captured something real: software engineering is shifting from engineers writing code to engineers orchestrating agents that write code. The implication most organizations have drawn is "deploy more agents."
That's the wrong implication. Deploying more agents without redesigning the operating model around them produces exactly what Module 00 describes: individual velocity up, organizational delivery flat. The problem isn't the agents. It's the absence of a coordination model designed for agent-speed execution.
Agentic engineering is the discipline of designing that operating model. It covers how you structure work so agents can execute it, how your team roles shift, what decisions stay human-owned, and how you prevent coordination overhead, which grows roughly quadratically as agents and teams multiply, from consuming the gains.
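One common way to read the quadratic overhead mentioned above is to count pairwise coordination channels. The sketch below assumes the worst case, where every participant (human or agent) may need to coordinate with every other one; the function name is hypothetical and real organizations will have fewer live channels than this upper bound.

```python
def coordination_channels(n: int) -> int:
    """Distinct pairwise channels among n participants: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Doubling the number of participants roughly quadruples the channels.
for n in (2, 5, 10, 20):
    print(n, coordination_channels(n))
```

Going from 10 to 20 participants moves the upper bound from 45 to 190 channels, which is why adding agents without a coordination model can erase the gains.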
of software engineering tasks are soon within reach of AI agents
Anthropic, 2025
Multi-agent collaborative task success rate (vs 50% solo)
CooperBench, Stanford/SAP 2026
AI-written PRs merged per week at Stripe
Stripe Engineering Blog
Review load increase on senior engineers in AI-native orgs
LinearB 2026
Definition
What agentic engineering actually means
The term has been colonized by product marketing. For this playbook: agentic engineering is the practice of structuring engineering work so that AI agents can handle implementation, verification, and iteration with minimal human coordination overhead for each individual change, while maintaining human authority over architectural decisions, risk classification, and organizational direction.
The critical distinction: this is not about maximizing agent autonomy. It's about right-sizing human involvement. Humans remain in the loop on the things that require human judgment. The operating model shifts which things those are.
Karpathy's framing: "The mental model shifts from 'coder' to 'orchestrator,' defining tasks, reviewing outputs, managing context, and maintaining quality." That's the leader-level summary. The implementation involves specific choices about team structure, decision tiers, and agent fleet design.
“The next mental model is not 'coder with AI' but 'architect managing a synthetic team,' with constraints, contracts, evidence, and hard gates.”
Hacker News, discussion on 'Code Is Cheap. Coherence Is the New Bottleneck'
Team topology
How engineer roles shift
The roles don't disappear. They evolve. Understanding this shift prevents two failure modes: engineers resisting agent work, and engineers delegating too much.
Before: Senior engineers write the most complex code. Their time is the bottleneck for hard problems.
After: Senior engineers define the architectural boundaries and decision criteria that agents operate within. Their time is the bottleneck for context and governance quality.
Before: Mid-level engineers implement features from tickets. Code review is their main quality gate.
After: Mid-level engineers orchestrate agent work: decompose tasks, manage context, verify outputs against specs, escalate when agents deviate.
Before: Junior engineers work on well-scoped bugs and small features while building context.
After: Junior engineers still work on well-scoped work, but they learn by reviewing agent output, modifying specs, and understanding why agents made the choices they made.
Before: Platform teams build developer tools and CI/CD pipelines.
After: Platform teams build the agent coordination layer: shared context infrastructure, eval harnesses, governance policy, orchestration workflows. This is now core infrastructure, not a side project.
Decision authority
The decision tier framework
This is the most important artifact in agentic engineering. Without explicit decision tiers, every change requires human review, which is exactly the review debt problem from Module 00.
Autonomous
Agent acts without human review. Verified by eval harness only.
Examples
- Test coverage improvements
- Documentation updates
- Dependency version bumps (passing all tests)
- Linting and formatting fixes
- Refactors with no behavior change (verified by test suite)
Required gates
- All tests pass
- No new dependencies
- No changes to public interfaces
- Diff size below threshold
Supervised
Agent acts; human reviews before merge. Risk-routed to appropriate reviewer.
Examples
- New features within existing service boundaries
- Bug fixes with behavior change
- API additions (non-breaking)
- Database migrations (non-destructive)
Required gates
- All tests pass
- Risk score below threshold
- Assigned to appropriate reviewer based on domain
Escalated
Human decision required before agent starts. Architectural sign-off or security review needed.
Examples
- New services or major architectural changes
- Breaking API changes
- Security-sensitive code (auth, payments, crypto)
- Schema changes affecting multiple services
Required gates
- Architecture review completed
- Security sign-off if applicable
- Rollback plan defined
Human-owned
Humans only. No agent involvement in decision or implementation.
Examples
- Production access and deployment controls
- Org-wide architectural decisions
- Incident response and production debugging
- Hiring and team structure decisions
Required gates
- N/A, agent involvement is explicitly blocked
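The four tiers above can be sketched as a small classifier that routes a proposed change to the most restrictive tier it matches. This is a minimal illustration, not a prescribed implementation: the `Change` fields, the `classify` function, and the 200-line diff threshold are all assumptions introduced here, and a real pipeline would derive these signals from the diff, the test run, and repository metadata.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = 1
    SUPERVISED = 2
    ESCALATED = 3
    HUMAN_OWNED = 4


@dataclass(frozen=True)
class Change:
    # Hypothetical signals describing a proposed change.
    tests_pass: bool
    diff_lines: int
    adds_dependency: bool = False
    changes_public_interface: bool = False
    changes_behavior: bool = False
    breaking_api_change: bool = False
    security_sensitive: bool = False
    new_service: bool = False
    cross_service_schema_change: bool = False
    touches_prod_controls: bool = False


MAX_AUTONOMOUS_DIFF_LINES = 200  # assumed threshold; tune per organization


def classify(change: Change) -> Tier:
    """Return the most restrictive tier whose criteria the change matches."""
    if change.touches_prod_controls:
        return Tier.HUMAN_OWNED
    if (change.breaking_api_change or change.security_sensitive
            or change.new_service or change.cross_service_schema_change):
        return Tier.ESCALATED
    # Autonomous only if every autonomous gate holds; anything else
    # falls through to a human reviewer.
    autonomous_ok = (
        change.tests_pass
        and not change.adds_dependency
        and not change.changes_public_interface
        and not change.changes_behavior
        and change.diff_lines < MAX_AUTONOMOUS_DIFF_LINES
    )
    return Tier.AUTONOMOUS if autonomous_ok else Tier.SUPERVISED


print(classify(Change(tests_pass=True, diff_lines=40)))
print(classify(Change(tests_pass=True, diff_lines=40, security_sensitive=True)))
```

Checking the most restrictive tiers first is a deliberate choice in this sketch: a small, well-tested diff that touches auth code should still be escalated rather than merged autonomously.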
Case study
What Stripe figured out
Stripe ships over 1,300 AI-written PRs per week. That number is cited constantly. Less discussed is how: it required years of platform investment to make the agent fleet functional at that scale.
The core insight from Stripe's engineering blog: "Whether it's documentation, developer environments, or CI, we've found time and time again that our investments in human developer productivity pay dividends in the world of agents." The platform they built for human engineers (fast devboxes, 3 million automated tests, 400+ internal tools, autofixing CI) became the coordination infrastructure for their agent fleet.
The organizational structure behind this: a dedicated developer productivity team with the mandate to build infrastructure that makes both human and agent engineering faster. Not an afterthought, not a side project. Core infrastructure investment, sustained over years.
Most organizations can't replicate Stripe's 10-year investment in 90 days. But the architecture is clear: fast feedback loops, comprehensive automated verification, rich organizational context, and governance enforced in infrastructure. That's the direction. The question is how fast you move toward it.
Leader artifact
Operating model design questions
These are the questions your operating model needs to answer before you scale agent deployment. Skipping them produces coordination failures.
Structure
Who owns agent coordination?
Is there a named team or person responsible for the coordination layer (shared context, eval harnesses, governance policy)? Or is each team building independently? Decentralized agent adoption without a platform function produces fragmentation.
Authority
What stays human-owned?
Have you written down your Tier 4 (human-owned) list, the decisions that stay with humans regardless of agent capability? If this isn't explicit, individual engineers and agents will fill the gap inconsistently.
Verification
What are your eval gates?
Before any agent work is considered complete, what must pass automatically? If the answer is just 'existing test suite,' your Tier 1 (autonomous) execution is only as trustworthy as your test coverage. What additional verification do you need before you can expand autonomous execution?
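One way to make eval gates explicit is to name each gate and report which ones a given agent run failed. The sketch below mirrors the required gates listed for the Autonomous tier; the result keys (`tests_failed`, `new_dependencies`, and so on), the `failed_gates` helper, and the 200-line threshold are hypothetical and stand in for whatever your CI actually reports.

```python
from typing import Callable, Dict, List

# Hypothetical shape of a CI result for one agent-produced change.
Result = Dict[str, object]
Gate = Callable[[Result], bool]

AUTONOMOUS_GATES: Dict[str, Gate] = {
    "all_tests_pass": lambda r: r["tests_failed"] == 0,
    "no_new_dependencies": lambda r: not r["new_dependencies"],
    "no_public_interface_changes": lambda r: not r["public_interface_changed"],
    "diff_below_threshold": lambda r: r["diff_lines"] < 200,  # assumed limit
}


def failed_gates(result: Result, gates: Dict[str, Gate]) -> List[str]:
    """Return the names of gates that did not pass, in declaration order."""
    return [name for name, check in gates.items() if not check(result)]


run = {
    "tests_failed": 0,
    "new_dependencies": ["example-package"],  # made-up dependency name
    "public_interface_changed": False,
    "diff_lines": 35,
}
print(failed_gates(run, AUTONOMOUS_GATES))
```

An empty list means the change may proceed without review; any named failure is a reason to route it to a human instead. Adding a new verification then becomes one more named entry in the gate table.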
Context
What do agents know about your org?
If you started a new agent session right now with a task from your backlog, what organizational context would it have? Is it the same across all teams? Is it current? This is your context architecture, and it's probably not what you think it is.