Demonstrated with Claude Code.
Software development is shifting from writing instructions for machines to defining structured intent for autonomous systems operating inside persistent repositories.
AI coding produces fast prototypes. It fails in real systems because it lacks structure, memory, and long-term state.
The same prompt produces different outputs depending on context, history, and hidden system state.
Small requirement changes break assumptions across multiple generated components and files.
Prompts do not encode architecture, constraints, system state, or evolution history.
The bottleneck is context architecture, not model capability.
The field keeps confusing these. They are not the same problem.
Not autocomplete. An agent that does the work.
Plain instructions in CLAUDE.md. Custom skills. Sub-agents you compose.
Stops to ask. Surfaces tradeoffs. Refuses destructive moves without consent.
Hooks, MCP servers, the Agent SDK — everything is a building block.
Development shifts from stateless prompting to stateful system construction inside repositories.
Repositories are memory systems, not code storage.
Defines product intent, goals, and constraints.
# Vision
Build a regulatory reporting system for UCITS funds.

## Goals
- ingest fund data from multiple sources
- validate against regulatory schema
- generate board-ready reports
- maintain full audit trail

## Principles
- every output must be traceable
- no generation without spec approval
- human review before submission

Defines system components and data flow.
Fund Data → Ingestion → Validation → Report Generation → Output
Components:
- ingestion layer (CSV/API)
- regulatory schema validator
- report generator
- audit trail (decisions.md)

Storage:
- Postgres (fund data + audit)
- decisions.md (reasoning trace)

The execution layer.
Defines how the agent operates inside the repository.
# CLAUDE.md
## Rules
- follow architecture.md strictly
- never bypass spec files
- prefer incremental changes
- maintain explicit reasoning traces

## Output style
- structured reasoning
- no hidden assumptions

Creator of Claude Code · Anthropic · Execution, subagents, and self-improvement loops
### 1. Plan Mode Default
- Enter plan mode for ANY non-trivial task (3+ steps or architectural decisions)
- If something goes sideways, STOP and re-plan immediately
- Write detailed specs upfront to reduce ambiguity

### 2. Subagent Strategy
- Use subagents liberally to keep main context window clean
- Offload research, exploration, and parallel analysis to subagents
- One task per subagent for focused execution

### 3. Self-Improvement Loop
- After ANY correction from the user: update tasks/lessons.md with the pattern
- Write rules for yourself that prevent the same mistake
- Ruthlessly iterate on these lessons until mistake rate drops

Former Director of AI, Tesla · OpenAI founding team · Safety, simplicity, and surgical changes
## 1. Think Before Coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
- State your assumptions explicitly. If uncertain, ask.
- If a simpler approach exists, say so. Push back when warranted.

## 2. Simplicity First
**Minimum code that solves the problem. Nothing speculative.**
- No abstractions for single-use code.
- If you write 200 lines and it could be 50, rewrite it.

## 3. Surgical Changes
**Touch only what you must. Clean up only your own mess.**
- Don't "improve" adjacent code, comments, or formatting.
- The test: Every changed line should trace directly to the user's request.

Without enforced constraints:
CLAUDE.md turns generation into governed execution.
System-first thinking.
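def build_system(spec):
    # The spec must be clear and minimal before anything is built.
    assert spec.is_clear()
    assert spec.is_minimal()

    # init() and iterate() are conceptual placeholders, not a real API.
    system = init(spec)
    while not system.is_stable():
        system = iterate(system)
    return system

Enforcement-style system thinking.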
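class SystemGuard {
  // Reject any action that does not declare its intent.
  validate(action) {
    if (!action.intent) {
      throw new Error("Missing intent");
    }
    // normalize() is assumed to be defined elsewhere on this class.
    return this.normalize(action);
  }
}

Karpathy demands clarity before complexity.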
Cherny demands enforcement before trust.
CLAUDE.md is where both principles become executable.
The harness is governed by principles, not vibes.
Code is a derived artifact of structured specifications, not prompts.
Spec → Validate → Execute → Refine
## Constraint
Never generate a migration without a rollback in decisions.md.
Never add a dependency without updating architecture.md.

## Gate
Before any implementation step: confirm spec is unambiguous.
If not, stop and ask.

All systems must first be expressed in ASCII before implementation.
User → Upload → Process → Store → Retrieve → UI
The ASCII Validation Rule:
If ASCII structure is unclear, implementation is not allowed.
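One way to make the rule mechanical is to parse the one-line diagram and refuse to proceed when it is malformed. This is a minimal sketch, assuming the `A → B → C` notation used in this talk; `parse_ascii_flow` and its three checks (at least two stages, no empty stage names, no repeated stage names) are illustrative choices, not part of any tool.

```python
ARROW = "→"


def parse_ascii_flow(diagram: str) -> list[str]:
    """Split a one-line 'A → B → C' diagram into stage names.

    Raises ValueError when the structure is unclear: fewer than two
    stages, an empty stage name, or the same stage listed twice.
    """
    stages = [part.strip() for part in diagram.split(ARROW)]
    if len(stages) < 2:
        raise ValueError("a flow needs at least two stages")
    if any(not stage for stage in stages):
        raise ValueError("empty stage name in flow")
    if len(set(stages)) != len(stages):
        raise ValueError("duplicate stage name in flow")
    return stages


# Hypothetical usage: implementation may start only if this call succeeds.
stages = parse_ascii_flow("User → Upload → Process → Store → Retrieve → UI")
print(stages)
```

A diagram such as `User → → UI` raises `ValueError`, which is the point at which implementation would be blocked.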
Same system design. Five different experts. Five different prompts.
Each runs independently. Each finds different failures.
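The independence of the reviews can be sketched as one self-contained prompt per persona, each carrying the full design. This assumes nothing about how the prompts are sent to a model. `build_review_prompts` is a hypothetical helper, and the persona descriptions other than the backend engineer are placeholders for illustration.

```python
# Review instruction taken from the persona prompt in this section.
REVIEW_INSTRUCTION = (
    "Review this system design. Identify every flaw, scaling risk, "
    "and architectural contradiction. Be brutal. Do not soften findings."
)


def build_review_prompts(design: str, personas: dict[str, str]) -> dict[str, str]:
    """Return one self-contained review prompt per persona.

    Every prompt embeds the whole design, so each review can run on its
    own with no shared conversation state.
    """
    return {
        name: f"You are {role}.\n\n{REVIEW_INSTRUCTION}\n\nDesign:\n{design}"
        for name, role in personas.items()
    }


# Placeholder personas; a real review would define all five experts.
personas = {
    "backend": "a staff backend engineer who has built high-availability financial data pipelines",
    "compliance": "a compliance officer with UCITS expertise",
    "security": "a security reviewer",
}
design = "Fund Data → Ingestion → Validation → Report Generation → Output"

prompts = build_review_prompts(design, personas)
for name, prompt in prompts.items():
    print(f"--- {name} ---\n{prompt}\n")
```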
You are a staff backend engineer who has built high-availability financial data pipelines at $1B+ AUM scale.

Review this system design. Identify every flaw, scaling risk, and architectural contradiction. Be brutal. Do not soften findings.

A system is invalid if it is understood from only one perspective.
Every design must be actively attacked for failure modes before implementation.
"Destroy this system design. Identify all flaws, edge cases, contradictions, and scaling risks."
Security:
- No audit log for failed validations: UCITS requires a record of all exceptions, not just passes
- Report proceeds even if 0 positions validate; it must fail safe

Compliance:
- ISIN lookup uses local cache only: stale data risk for newly listed instruments
- Board report timestamp not in UTC: ambiguous for cross-border fund reporting

Infrastructure:
- No retry logic for CSV ingestion: a single network failure breaks the pipeline silently

No system executes until the builder approves the spec, regardless of who drafted the first version.
AI writes. You approve. Then it runs.
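The approval gate can be sketched as a check that sits between the spec and anything that runs it. This is an illustration under stated assumptions: `Spec`, `SpecNotApproved`, and `execute` are hypothetical names, and the `run` callable stands in for whatever actually performs the work.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class SpecNotApproved(RuntimeError):
    """Raised when execution is attempted on an unapproved spec."""


@dataclass
class Spec:
    text: str
    drafted_by: str                    # "ai" or "human"; the gate ignores this
    approved_by: Optional[str] = None  # set only through approve()

    def approve(self, reviewer: str) -> None:
        if not reviewer.strip():
            raise ValueError("approval requires a named reviewer")
        self.approved_by = reviewer


def execute(spec: Spec, run: Callable[[str], str]) -> str:
    """Run the spec only after someone has explicitly approved it."""
    if spec.approved_by is None:
        raise SpecNotApproved("spec has not been approved")
    return run(spec.text)


spec = Spec(text="generate Q1 board report", drafted_by="ai")
try:
    execute(spec, run=str.upper)  # str.upper is a stand-in for real work
except SpecNotApproved as exc:
    print(f"blocked: {exc}")

spec.approve("builder")
print(execute(spec, run=str.upper))
```

The gate depends only on `approved_by`, so an AI-drafted spec and a human-drafted spec are treated identically.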
Everything you just saw — the memory architecture, the spec contracts, the persona gates — applied to a problem this room recognises. Built spec-first. No guessing.
847 fund positions. UCITS Article 84 requires every one to be validated before a board report can be issued.
ISIN          Qty     NAV      CCY
GB0002634946  10,000  £15.23   GBP  ✓
IE00B4L5Y983   5,000  €112.45  EUR  ✓
US0378331005   2,500  $174.82  USD  ✓
GB00B0CTWC01     800  £42.10   GBP  ✗ ISIN mismatch
...
844 valid · 3 flagged

A board-ready compliance report with a full audit trail, with every decision traceable to source.
UCITS Compliance Report - Q1 2026
Fund: Carne UCITS Strategy Fund
NAV Date: 31 March 2026
─────────────────────────────────
Article 84 Status: PASS (warnings)
  ✓ 844 / 847 positions validated
  ⚠ 3 ISIN exceptions (see Annex A)
  ✓ Single-issuer limits: compliant
  ✓ Eligible assets: 100% compliant
─────────────────────────────────
Audit trail: decisions.md rev 14
Approved by: [awaiting sign-off]

A fintech team needs to ingest fund data, validate against UCITS Article 84, generate a board-ready report, and maintain a full audit trail.
One command. The harness reads the spec. It follows the rules.
› analyse fund data Q1 2026 and generate UCITS regulatory board report
◆ Reading CLAUDE.md, architecture.md...
◆ Loading fund_data_q1_2026.csv (847 positions)
◆ Running UCITS Article 84 validation...
  ✓ 844 positions validated
  ✗ 3 flagged: ISIN mismatch rows 124, 287, 401
◆ Generating board report...
  · Writing validator.py (+89 lines)
  · Writing report_generator.py (+134 lines)
  · Writing output/q1_board_report.pdf
◆ Audit trail updated in decisions.md
✓ Done. Review flagged ISINs before submission.

The builder writes the intent first. Claude expands it into full spec files. The builder approves before execution.
↳ Memory Architecture in action.
# Builder writes intent first: 5 lines, not 50
echo "# Vision: UCITS regulatory reporting, automated" > vision.md

# Claude drafts the full spec from your intent
claude "expand vision.md into architecture.md and CLAUDE.md"

# Review the output. You own the gate.
# Only after approval does execution proceed.

From 5 lines of intent comes a full constraint layer that the whole system runs under.
↳ The repo becomes the system's memory.
# CLAUDE.md — UCITS Regulatory Reporting
## Compliance Rules
- Never generate a report without validating all positions against UCITS Article 84 schema first
- Flag ISIN mismatches; do not suppress or round
- Every calculation must trace to a source data row
- Board report must include compliance sign-off section

## Gate
Before generating any output: confirm schema validation passed.
If validation fails: surface all errors. Stop. Do not proceed.
Human approval required before any external submission.

Before coding, we force structural alignment.
↳ ASCII-First Design. No code until structure is clear.
Fund Data
    ↓
Ingestion
    ↓
Validation
    ↓
Report Generation
    ↓
Audit Trail
    ↓
Output
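The flow can be sketched as composed stages with a gate between validation and report generation. This is a sketch under assumptions: the stage bodies are placeholders, every name (`ValidationOutcome`, `validate`, `generate_report`, `run_pipeline`) is hypothetical, and the gate implements only the fail-safe raised in the critique findings, namely that a report must not proceed when zero positions validate.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationOutcome:
    valid_rows: list[dict] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


def validate(rows: list[dict]) -> ValidationOutcome:
    """Placeholder check: a position needs an ISIN and a non-negative NAV."""
    outcome = ValidationOutcome()
    for idx, row in enumerate(rows):
        if not row.get("isin"):
            outcome.errors.append(f"Row {idx}: missing ISIN")
        elif row.get("nav", 0) < 0:
            outcome.errors.append(f"Row {idx}: negative NAV")
        else:
            outcome.valid_rows.append(row)
    return outcome


def generate_report(outcome: ValidationOutcome) -> str:
    """Gate: refuse to produce a report when nothing validated."""
    if not outcome.valid_rows:
        raise RuntimeError("no validated positions; report generation stopped")
    return f"{len(outcome.valid_rows)} valid, {len(outcome.errors)} flagged"


def run_pipeline(rows: list[dict], audit_log: list[str]) -> str:
    outcome = validate(rows)
    audit_log.extend(outcome.errors)  # failed validations are recorded too
    report = generate_report(outcome)
    audit_log.append(f"report generated: {report}")
    return report


rows = [{"isin": "US0378331005", "nav": 174.82}, {"isin": "", "nav": 42.10}]
audit: list[str] = []
print(run_pipeline(rows, audit))
print(audit)
```

Errors are written to the audit log before the gate runs, so a stopped pipeline still leaves a record of why it stopped.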
Five prompts. Five experts. Each runs independently.
You are a UK FCA-registered compliance officer with 10 years of UCITS IV and V expertise.

Review this system design for our regulatory reporting pipeline. Identify every compliance gap, audit failure, and regulatory risk. Be specific. Cite the relevant UCITS articles.

↳ Multi-Persona Analysis. The gate before implementation.
Security:
- No audit log for failed validations: UCITS requires a record of all exceptions, not just passes
- Report proceeds even if 0 positions validate; it must fail safe

Compliance:
- ISIN lookup uses local cache only: stale data risk for newly listed instruments
- Board report timestamp not in UTC: ambiguous for cross-border fund reporting

Infrastructure:
- No retry logic for CSV ingestion: a single network failure breaks the pipeline silently

The spec said: every calculation must trace to a source row.
The code does exactly that. The compliance rule became a traceable error.
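The validator in this section calls an `is_valid_isin` helper that the talk does not show. This is a minimal sketch of what such a helper could look like, assuming it checks only the ISO 6166 shape (two letters, nine alphanumerics, one digit) and the Luhn check digit. It does not look the ISIN up against reference data, which a real "ISIN mismatch" check would likely also need.

```python
import re

# Two-letter country code, nine alphanumeric characters, one check digit.
_ISIN_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{9}[0-9]")


def is_valid_isin(isin: object) -> bool:
    """Return True if the value has ISIN shape and a correct check digit."""
    if not isinstance(isin, str) or not _ISIN_PATTERN.fullmatch(isin):
        return False
    # Expand letters to numbers (A=10 ... Z=35); digits stay as they are.
    digits = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    # Luhn: walking from the right, double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


for candidate in ("US0378331005", "US0378331006", "not-an-isin"):
    print(candidate, is_valid_isin(candidate))
```

Of the three candidates, only `US0378331005` passes. The second fails the check digit and the third fails the format test.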
import pandas as pd

# ValidationResult and is_valid_isin are assumed to be defined elsewhere.
def validate_ucits_positions(
    df: pd.DataFrame,
) -> ValidationResult:
    """
    Validates fund positions against UCITS Article 84.
    Generated per architecture.md specification.
    Every error traces to source row, per CLAUDE.md rule 3.
    """
    errors = []
    source_rows = []  # index of each row that produced an error
    for idx, row in df.iterrows():
        if not is_valid_isin(row["isin"]):
            errors.append(f"Row {idx}: Invalid ISIN '{row['isin']}'")
            source_rows.append(idx)
        if row["nav"] < 0:
            errors.append(f"Row {idx}: Negative NAV not permitted")
            source_rows.append(idx)
    return ValidationResult(
        passed=len(errors) == 0,
        errors=errors,
        source_rows=source_rows,
    )

Spec → Implementation → Critique → Refinement → Updated Memory
Every iteration makes the system smarter. The repository accumulates structured intelligence over time.
# After critique, Claude updates memory
claude "update decisions.md with findings from persona review"

# decisions.md now holds the audit trail
# Next iteration starts from an informed state

All decisions are recorded back into markdown files to preserve system state.
Repositories accumulate structured intelligence over time through continuous system refinement.
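Recording a decision can be as simple as appending a dated section to a markdown file. This is a sketch under assumptions: `format_decision` and `append_decisions` are hypothetical helpers, the layout is modeled loosely on the decisions.md excerpt in this talk, and the demo writes to a temporary directory instead of a real repository.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def format_decision(heading: str, decision: str, reason: str,
                    approved_by: Optional[str] = None) -> str:
    """Render one decision as a small markdown block."""
    lines = [f"### {heading}", f"Decision: {decision}", f"Reason: {reason}"]
    if approved_by:
        lines.append(f"Approved by: {approved_by}")
    return "\n".join(lines) + "\n"


def append_decisions(path: Path, title: str, entries: list[str], on: date) -> None:
    """Append a dated section; existing history is never rewritten."""
    section = f"\n## {on.isoformat()}: {title}\n\n" + "\n".join(entries)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(section)


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "decisions.md"
    entry = format_decision(
        "Auth Layer",
        "Switched from stateless JWT to session tokens + Redis.",
        "Distributed queue compatibility (Security persona).",
    )
    append_decisions(log, "Post-Critique Update", [entry], date(2026, 5, 20))
    print(log.read_text(encoding="utf-8"))
```

Opening the file in append mode keeps earlier entries intact, which is what lets the file serve as an audit trail.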
## 2026-05-20 — Post-Critique Update
### Auth Layer
Decision: Switched from stateless JWT to session tokens + Redis.
Reason: Distributed queue compatibility (Security persona).

### Fund Transfer Step 3
Decision: Added explicit rollback to T-1 state.
Reason: Audit requirement; every mutation must be reversible.
Approved by: Builder review 2026-05-20.

The ideas in this talk stand on the shoulders of these.
Software is not written.
It is specified, constrained, validated, and executed through agent systems.