How do you make each AI-built feature make the next one easier?
Most codebases get harder over time. AI accelerates this if you don't build the organizational memory infrastructure that makes knowledge compound instead of decay.
Dan Shipper's Every.to guide on compound engineering describes a loop: ideate, brainstorm, plan, work, review, polish, compound. The last step is the one most teams skip. They ship the feature, start the next ticket, and the learning from the last cycle stays in an engineer's head, where it decays, or leaves when they do.
In a human-only engineering org, this is an annoying inefficiency. In an AI-native org, it's a structural failure. AI agents have no memory between sessions. If you don't codify what you learned, the next agent session starts from the same place as the last one, making the same mistakes, asking the same implicit questions, producing the same architecture violations the last session introduced.
Compound engineering is the practice of closing that loop deliberately: capturing what you learned from each cycle and making it available as context to future agents.
Why it gets harder
The compounding problem
Traditional codebases accumulate debt slowly. Engineers learn from incidents, update their mental models, and pass knowledge through onboarding, code review, and conversation. The knowledge transfer is lossy but continuous.
AI-native codebases can accumulate debt at agent speed. Every session that lacks organizational context potentially introduces architectural drift, naming inconsistencies, or pattern violations. Code churn more than doubled, from 3.1% to 7%, between 2020 and 2024. A significant fraction of that is likely agents regenerating code without knowing what conventions already exist.
The fix is not to slow down agents. It's to make organizational knowledge available to agents before they generate code. That requires treating organizational memory as infrastructure: not a collection of documents, but a queryable system that evolves with the codebase.
“The question isn't just 'can the AI write this function?' It can. The pressing problem now is, 'does the AI know how this function should work in the context of our system?'”
Built In, 2025
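As a minimal sketch of what "queryable" could mean here, the example below stores each piece of knowledge with the repo paths it applies to and returns only the notes relevant to the files an agent is about to touch. The `MemoryEntry` schema, the `context_for` function, and the example paths are assumptions made for illustration, not part of any particular tool.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class MemoryEntry:
    """One piece of organizational knowledge (hypothetical schema)."""
    scope: str  # glob over repo paths this entry applies to
    note: str   # the constraint or convention, in plain language


def context_for(paths: list[str], memory: list[MemoryEntry]) -> list[str]:
    """Return the notes whose scope matches any file the agent will touch."""
    return [
        entry.note
        for entry in memory
        if any(fnmatch(path, entry.scope) for path in paths)
    ]


# Illustrative entries; the paths are made up for this sketch.
MEMORY = [
    MemoryEntry("services/checkout/*",
                "Never use eventually-consistent reads in the checkout flow."),
    MemoryEntry("services/webhooks/*",
                "Don't use the default retry wrapper for webhook handlers."),
    MemoryEntry("*",
                "Record why an approach was chosen, not just what was chosen."),
]

# Query before the agent generates code for a checkout change.
notes = context_for(["services/checkout/cart.py"], MEMORY)
for note in notes:
    print("-", note)
```

The point of the design is that the agent receives two relevant notes rather than the whole document set. A real system would likely match on more than file paths.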
The loop
What 'compounding' looks like in practice
1. Ideate
2. Plan
3. Work
4. Review
5. Ship
6. Compound
Most teams do steps 1–5. Compound is step 6, the one that determines whether the system gets better with use.
The compound step takes 5–15 minutes after a feature ships. It asks one question: "What did we learn from this cycle that future agents should know?"
That learning takes several forms. Sometimes it's a new coding convention ("we decided to always use X pattern for database transactions after the incident on the 14th"). Sometimes it's an architectural decision ("we chose approach A over B because of our multi-region requirements"). Sometimes it's a failure mode to avoid ("don't use the default retry wrapper for webhook handlers, it doesn't handle idempotency correctly").
All of these are context that should be available to the next agent session. Without the compound step, they stay in an engineer's head or a PR description that nobody reads.
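One way to keep these learnings out of someone's head is to append them to a small log that lives in the repo. The sketch below assumes a JSON Lines file and a hypothetical `Learning` record with the three kinds described above; the field names and file name are illustrative, not a standard.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path

# The three forms of learning described above.
KINDS = {"convention", "decision", "failure_mode"}


@dataclass
class Learning:
    kind: str       # one of KINDS
    summary: str    # what future agents should know
    rationale: str  # why it matters


def record_learning(learning: Learning, log_path: Path) -> None:
    """Append one learning to a JSON Lines log."""
    if learning.kind not in KINDS:
        raise ValueError(f"unknown kind: {learning.kind!r}")
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(learning)) + "\n")


def load_learnings(log_path: Path) -> list[Learning]:
    """Read every recorded learning back, oldest first."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [Learning(**json.loads(line)) for line in f if line.strip()]


# Usage: a temporary directory stands in for the repo here.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "learnings.jsonl"
    record_learning(
        Learning(
            kind="failure_mode",
            summary="Don't use the default retry wrapper for webhook handlers.",
            rationale="It doesn't handle idempotency correctly.",
        ),
        log,
    )
    loaded = load_learnings(log)

print(loaded[0].summary)
```

An append-only file is deliberately simple: it can be reviewed in a PR like any other change, and it can be loaded into an agent's context without extra tooling.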
The 80/20
What to codify vs. what to let decay
Not everything learned from a cycle is worth codifying. Focus on what agents will need most often and what would cause the most damage if they didn't know it.
Architectural decisions (always codify)
Every time you choose one architectural approach over another, write one paragraph explaining why. Not what you chose (the code shows that). Why you chose it, and what constraints made the alternatives worse. This is ADR (Architecture Decision Record) practice, but applied at the pattern level, not just the service level.
Coding conventions from incidents (almost always codify)
When an incident reveals that a common agent pattern is unsafe in your specific system, codify the constraint immediately. 'Never use eventually-consistent reads in the checkout flow' is the kind of context that prevents the same mistake from being made in 10 future sessions.
Failed approaches (codify as negative examples)
Agents don't know what didn't work. If your team tried approach X and it caused problems, documenting that as a negative example saves future agents from rediscovering the same failure. GitHub's Spec Kit has a section specifically for 'approaches we tried and why they didn't work.'
Ambient context (let decay after irrelevance)
Not everything needs to be maintained forever. Context about a specific library migration, a temporary architectural constraint, or a one-off decision for a specific feature can be time-bound. Include an expiry signal: 'This constraint applies until we complete migration X.' Old context is worse than no context. Agents will act on outdated constraints.
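The expiry signal can be made machine-readable so that stale constraints are dropped before they reach an agent. This is a sketch under assumptions: the `ContextNote` type, the `expires_when` field, and the milestone names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextNote:
    text: str
    # Name of the milestone that retires this note; None means no expiry.
    expires_when: Optional[str] = None


def active_notes(notes: list[ContextNote], completed: set[str]) -> list[ContextNote]:
    """Keep notes that have no expiry or whose milestone is not yet complete."""
    return [
        note for note in notes
        if note.expires_when is None or note.expires_when not in completed
    ]


NOTES = [
    ContextNote("Never use eventually-consistent reads in the checkout flow."),
    ContextNote("Write to both the old and new tables.",
                expires_when="migration-x"),
]

# Before the migration finishes, both notes are served to agents.
before = active_notes(NOTES, completed=set())

# Once "migration-x" is marked complete, the temporary note disappears.
after = active_notes(NOTES, completed={"migration-x"})
print([note.text for note in after])
```

Tying expiry to a named milestone rather than a calendar date mirrors the phrasing "until we complete migration X"; a date field would work just as well where the end point is known in advance.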
Memory taxonomy
From CLAUDE.md to organizational memory graph
Level 1 · Individual
Engineer context files
CLAUDE.md, Cursor rules, personal conventions. Exists at nearly every org that uses AI tools. Problem: it's per-engineer, not per-org. The knowledge is siloed. Value: good starting point for what matters.
Level 2 · Team
Shared team context
One CLAUDE.md per team or service. Shared conventions, service-specific patterns, team-level architectural decisions. Better than per-engineer. Problem: teams still have different versions, and nobody maintains consistency across teams.
Level 3 · Organization
Org-wide context layer
One source of truth for org-wide standards: coding conventions, cross-service contracts, regulatory constraints, architectural principles. All teams pull from it. Maintained by the platform team. This is the level most orgs should target.
Level 4 · Living graph
Organizational memory graph
A continuously updated knowledge graph ingested from GitHub, Jira, Slack, and documentation. Includes not just current state but history, rationale, and relationships between decisions. Agents query it before generating code. This is the infrastructure investment that enables Level 5 maturity.
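To make "history, rationale, and relationships between decisions" concrete, here is a toy in-memory version of such a graph. It is a sketch only: the `Decision` node, the single `supersedes` edge type, and the example decisions are assumptions for illustration, and a real graph would be populated by ingestion rather than written by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    id: str
    summary: str
    rationale: str
    supersedes: Optional[str] = None  # id of the older decision this replaces


def history(decision_id: str, graph: dict[str, Decision]) -> list[Decision]:
    """Walk 'supersedes' edges from a decision back to the original one."""
    chain: list[Decision] = []
    seen: set[str] = set()
    current: Optional[str] = decision_id
    while current is not None and current not in seen:  # guard against cycles
        seen.add(current)
        node = graph[current]
        chain.append(node)
        current = node.supersedes
    return chain


# Illustrative decisions, newest last.
GRAPH = {
    "D1": Decision("D1", "Use approach B.",
                   "Simplest option while we ran in a single region."),
    "D2": Decision("D2", "Use approach A instead of B.",
                   "Multi-region requirements made B unworkable.",
                   supersedes="D1"),
}

# An agent asks not only what the current decision is, but how it got there.
for decision in history("D2", GRAPH):
    print(f"{decision.id}: {decision.summary} ({decision.rationale})")
```

Returning the whole chain, not just the latest node, is what lets an agent see why the older approach was abandoned before it proposes it again.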
Leader artifact
The 5-minute post-feature reflection
Copy this into your PR template or post-ship checklist. It takes about 5 minutes, and it is the compound step.
Post-feature reflection checklist
Add to PR description or post-ship retrospective.
What architectural decision did we make that future agents should know about?
Write one sentence: 'We chose X because Y. The alternative Z was worse because W.'
Did we discover a constraint that isn't documented anywhere?
If yes, add it to the relevant service CLAUDE.md or org context file.
Did this change reveal a pattern that should become a team convention?
If yes, create a PR updating the coding conventions doc.
Did we try an approach that didn't work? Why?
Document it as a negative example. Otherwise, other agents (and engineers) will make the same mistake.
Is there anything about this area of the codebase that took us more than one iteration to understand?
That's context worth adding. It means agents will face the same friction.
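If the checklist lives in a PR template, a small check can flag reflections that were left blank. The sketch below assumes a hypothetical template in which each question is a `Label: answer` line; the labels and the function name are invented for illustration.

```python
# Hypothetical labels, one per checklist question above.
REFLECTION_FIELDS = [
    "Decision",
    "Undocumented constraint",
    "New convention",
    "Failed approach",
    "Hard-won context",
]


def missing_reflection_fields(pr_body: str) -> list[str]:
    """Return the checklist labels that are absent or have an empty answer."""
    answered = set()
    for line in pr_body.splitlines():
        label, sep, answer = line.partition(":")
        if sep and answer.strip():
            answered.add(label.strip().lower())
    return [f for f in REFLECTION_FIELDS if f.lower() not in answered]


PR_BODY = """\
Decision: We chose X because Y. The alternative Z was worse because W.
Undocumented constraint: none
New convention:
Failed approach: none
"""

# "New convention" is blank and "Hard-won context" is missing entirely.
missing = missing_reflection_fields(PR_BODY)
print(missing)
```

An explicit "none" counts as an answer in this sketch. The goal is to make sure the question was considered, not to force a finding out of every PR.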