Six months. That is how long it took to build MetaScope, a native macOS metadata editor that is now on the Mac App Store. One founder, no team, and a professional-grade product shipped on a schedule most studios would laugh at.
This is not a story about how fast AI can write code. It is a story about how we stopped letting AI cost us speed by producing plausible-looking work that we then had to fix. The system we built to solve that is called Armature, and it is the same system we bring to every acceleration engagement today.
The problem with “capable but chaotic”
Early in the MetaScope project, a pattern emerged. Claude could write excellent Swift. It understood SwiftUI, it knew macOS conventions, and it could handle obscure ExifTool quirks. And yet:
- Decisions evaporated between sessions. We would agree on an architectural pattern on Tuesday. By Thursday, the same conversation was happening again.
- Quality was inconsistent. Some commits were clean. Others had force unwraps, missing error handling, and undocumented APIs.
- Documentation drifted. The code changed. The help content did not. Release notes missed updates.
- Scope crept. “While we are here, let’s also…” turned a two-day feature into a week.
These are not AI problems. They are engineering discipline problems. They happen on human teams too. The difference is that AI compounds them, because it can produce volume faster than a single reviewer can catch drift.
The fix was not a smarter model. The fix was a framework around the model.
What Armature actually is
Armature is not a tool. It is a working system, a set of agreements, documents, agents, and gates that turn AI-assisted development into something closer to how a disciplined engineering team operates. Four pillars hold it up:
- A hub document (CLAUDE.md) that defines the project’s gates and rules
- Specialist agents with narrow, clear responsibilities and automatic triggers
- Persistent memory that survives across sessions
- A skill guide that helps any agent navigate the codebase quickly
The rest of this piece walks through each pillar, with the concrete shape it took on MetaScope.
Pillar 1: the hub
Every project needs a central operating document. Not a README (that is for people reading GitHub), but an operational playbook the AI reads first.
MetaScope’s hub defined its non-negotiables up front:
## Mandatory gates (blocking)
### Planning gate
Document a plan BEFORE any code changes.
| Scope | Document |
|---------------------|------------------------------|
| Major (>5 days) | RFC in docs/developer/ |
| Medium (2-5 days) | Implementation plan |
| Small (<2 days) | Commit message plan |
### Quality gate
- No force unwraps in production code
- Error cases handled with user feedback
- Public APIs documented
- No TODO without an issue reference
### Testing gate
Tests MUST pass before commits.
### PR review gate
All review feedback addressed before merge.
These are not suggestions. The AI is instructed to refuse commits that violate them, to escalate uncertainty, and to ask rather than assume.
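To make the quality gate concrete, here is a minimal sketch of how two of its rules could be checked mechanically. It is an illustration, not MetaScope’s actual validator: the function name check_quality_gate and both regular expressions are assumptions, and a regex is only a rough approximation of Swift syntax.

```python
import re

# Hypothetical patterns; a real validator would parse Swift instead of using regexes.
# A '!' directly after an identifier, ')' or ']', and not followed by '=' (skips '!=').
FORCE_UNWRAP = re.compile(r"[\w\)\]]!(?!=)")
# A TODO with no issue reference such as "#42" later on the same line.
BARE_TODO = re.compile(r"TODO(?!.*#\d+)")


def check_quality_gate(source: str) -> list[str]:
    """Return one blocking message per violation found in a Swift source string."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # ignore trailing comments for the unwrap check
        if FORCE_UNWRAP.search(code):
            violations.append(f"line {number}: force unwrap")
        if BARE_TODO.search(line):
            violations.append(f"line {number}: TODO without issue reference")
    return violations


# Made-up Swift snippet used only to exercise the checker.
sample = """let url = URL(string: path)!
// TODO: handle rotation
// TODO(#42): support HEIC
if a != b { return }
"""

for message in check_quality_gate(sample):
    print(message)
```

An empty list means the gate passes. Anything else is a reason to refuse the commit, which is the behavior the hub asks the AI to follow.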
Pillar 2: the team
Claude Code supports specialist subagents. Instead of asking one model to do everything, MetaScope uses ten:
| Agent | Role | When it runs |
|---|---|---|
| plan-architect | Produces implementation plans | Before any medium or larger feature |
| code-validator | Checks quality gates | Before every commit |
| github-commit-agent | Stages, commits, pushes | After validation passes |
| pr-review-agent | Senior code review | Before merge |
| documentation-maintainer | Keeps docs synchronized | After feature completion |
| milestone-tracker | Updates project plans | After feature completion |
| macos-swift-architect | Architecture guidance | When design questions arise |
| macos-swift-debugger | Debugging expertise | When stuck past one attempt |
| code-auditor | Pattern consistency | Before major refactors |
| release-notes-generator | Documents releases | After feature completion |
The critical detail is automatic invocation. The validator is not something we remember to call. It runs before every commit. The documentation maintainer triggers when a feature lands. Discipline happens without willpower.
The chain looks like this:
code-validator → [milestone-tracker | documentation-maintainer] → github-commit-agent → pr-review-agent → merge
It mirrors how a professional team operates, with none of the meetings.
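The ordering matters more than the tooling. As a rough sketch of the idea (not how Claude Code invokes subagents), the chain can be modeled as a list of stages where the first failure stops everything after it. The run_chain function, the stage lambdas, and the dictionary keys are all hypothetical stand-ins.

```python
from typing import Callable

# Each stage is a hypothetical stand-in for an agent: it receives a description
# of the change and returns True to let the chain continue.
Stage = Callable[[dict], bool]


def run_chain(change: dict, stages: list[tuple[str, Stage]]) -> list[str]:
    """Run stages in order and stop at the first one that blocks."""
    log = []
    for name, stage in stages:
        passed = stage(change)
        log.append(f"{name}: {'ok' if passed else 'blocked'}")
        if not passed:
            break
    return log


stages = [
    ("code-validator", lambda c: c["force_unwraps"] == 0),
    ("documentation-maintainer", lambda c: True),
    ("github-commit-agent", lambda c: True),
    ("pr-review-agent", lambda c: c["review_items_open"] == 0),
]

blocked = run_chain({"force_unwraps": 1, "review_items_open": 0}, stages)
clean = run_chain({"force_unwraps": 0, "review_items_open": 0}, stages)
print(blocked)
print(clean)
```

In the blocked run, nothing after the validator executes, so a failing change never reaches the commit agent.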
Pillar 3: memory
A conversation that forgets everything on reload is not a teammate. It is an intern you rehire every morning. Two mechanisms solved this for MetaScope.
A persistent knowledge graph (via an MCP memory server) captures decisions, patterns, and architectural choices. New sessions begin by pulling relevant context. Any decision worth keeping gets written to the graph, not just discussed.
Session checkpoints get written before context compaction, capturing the exact state of work in progress (branch, last commit, completed items, key decisions, next steps). The next session reads the checkpoint and picks up cleanly.
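A checkpoint does not need to be elaborate. The sketch below assumes a JSON file under .claude/checkpoints/ with the five fields listed above. The file name, the JSON format, the Checkpoint class, and the sample values are assumptions for illustration, not MetaScope’s actual checkpoint schema.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Checkpoint:
    """State of work in progress; field names are illustrative."""
    branch: str
    last_commit: str
    completed: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)


def save(checkpoint: Checkpoint, path: Path) -> None:
    """Write the checkpoint as JSON, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(checkpoint), indent=2))


def load(path: Path) -> Checkpoint:
    """Read a checkpoint back at the start of the next session."""
    return Checkpoint(**json.loads(path.read_text()))


# Placeholder values; a temporary directory stands in for the repo root.
original = Checkpoint(
    branch="feature/batch-watermark",
    last_commit="abc1234",
    completed=["watermark positioning"],
    decisions=["render watermarks off the main thread"],
    next_steps=["add batch progress reporting"],
)

with tempfile.TemporaryDirectory() as root:
    path = Path(root) / ".claude" / "checkpoints" / "session.json"
    save(original, path)
    restored = load(path)

print(restored.next_steps)
```

The point of the round trip is that the next session starts from restored, not from a blank conversation.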
No decision is truly lost. Nothing important has to be re-derived.
Pillar 4: the skill guide
AI navigates unfamiliar codebases by reading files. If the files are well-organized, it moves fast. If they are not, it wastes tool calls on archaeology.
MetaScope has a skill guide, a curated map of the repo: where code lives, what patterns to follow, how systems connect, what decisions were made and why. When a new capability enters the project (a subagent, an MCP server, a refactor), the guide updates.
This is the single highest-leverage document in the repo. Get it right and velocity compounds.
What this looks like in a real feature
A concrete walkthrough: adding batch watermarking to MetaScope.
Planning. The plan-architect agent produces a short, specific plan: goal, approach, phases with time estimates, dependencies, tests, acceptance criteria. The plan is reviewed and adjusted, and only then does implementation begin.
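Which document the plan lands in follows the planning gate table in the hub. As a small sketch, the lookup can be written as a function of the estimate. The name required_plan is hypothetical, and treating exactly 2 and exactly 5 days as “medium” is my reading of the table’s boundaries.

```python
def required_plan(estimated_days: float) -> str:
    """Map an effort estimate to the planning document the gate requires."""
    if estimated_days > 5:
        return "RFC in docs/developer/"
    if estimated_days >= 2:
        return "Implementation plan"
    return "Commit message plan"


for days in (1, 3, 8):
    print(days, "->", required_plan(days))
```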
Implementation. Each phase goes through the code-validator, which returns something like:
## Validation Report
Status: FAIL
Blocking
- Force unwrap at WatermarkEngine.swift:47, use guard let
- Missing tests for watermark positioning
Non-blocking
- WatermarkConfiguration missing /// documentation
Ready to commit: NO
The commit is blocked until the issues are resolved. No exceptions.
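The rule behind that report is simple: any blocking finding fails the whole validation. Here is a minimal sketch of that rule, assuming a hypothetical Finding type and render_report function. The PASS and YES strings are assumptions, since the report above only shows the failing case.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    message: str
    blocking: bool


def render_report(findings: list[Finding]) -> str:
    """Build a report in the shape shown above; one blocking finding fails it."""
    blocking = [f.message for f in findings if f.blocking]
    advisory = [f.message for f in findings if not f.blocking]
    lines = ["## Validation Report", f"Status: {'FAIL' if blocking else 'PASS'}"]
    if blocking:
        lines.append("Blocking")
        lines += [f"- {message}" for message in blocking]
    if advisory:
        lines.append("Non-blocking")
        lines += [f"- {message}" for message in advisory]
    lines.append(f"Ready to commit: {'NO' if blocking else 'YES'}")
    return "\n".join(lines)


findings = [
    Finding("Force unwrap at WatermarkEngine.swift:47, use guard let", blocking=True),
    Finding("Missing tests for watermark positioning", blocking=True),
    Finding("WatermarkConfiguration missing /// documentation", blocking=False),
]
report = render_report(findings)
print(report)
```

Non-blocking findings still appear in the report, but on their own they do not stop the commit.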
Documentation. Phase completion triggers the documentation-maintainer: release notes updated, help content added, feature matrix synced, milestone checked off. Nothing written by hand.
Review. Before merge, the pr-review-agent performs a senior review: critical issues, important issues (should fix), suggestions, positive observations, verdict. Important issues become tracked items and get fixed before merge.
What is missing from this loop is the thing most AI-assisted projects get stuck on: me, remembering to run things. The loop runs itself.
Results after six months
MetaScope shipped as a professional metadata editor with:
- Zero production crashes traced to force unwraps
- 100+ documents kept synchronized with the code
- 12 RFCs and 11 ADRs capturing major decisions
- A comprehensive test suite, enforced by the testing gate, that guards against regressions
- 100+ sessions of continuous context, with decisions preserved
The qualitative changes matter more:
Reduced cognitive load. I do not track what documentation needs updating. I do not remember every architectural choice. The framework handles it.
Consistent quality regardless of my state. Tired, rushed, or in flow, the gates still apply. Quality stopped depending on my moment-to-moment focus.
Faster onboarding. Every new feature starts with context. The skill guide navigates the codebase. Patterns come with examples.
Knowledge preservation. Six months later, I can still reconstruct why a particular pattern was chosen. It is in memory. It is in the RFCs. It is in the decision log.
Frequently asked
Is this framework specific to macOS or Swift? The content is platform-specific. The structure (hub document, specialist agents, quality gates, persistent memory, skill guide) works on any stack. We have used the same shape on React apps, backend services, and mobile projects.
Do I need all ten agents on day one? No. Start with three pieces: a hub document (CLAUDE.md), a code-validator, and a commit agent. Add specialists as needs emerge.
Is this just “prompt engineering” with extra steps? No. Prompt engineering is how you speak to a model in one turn. This is about the system around many turns over many sessions: what gets remembered, what gets enforced, what gets delegated, what gets documented. The prompt quality still matters. It is not where the compounding value is.
What is the relationship between Armature and MetaScope? MetaScope is a product built with Armature. Armature is the system. We sell Armature-powered engagements to other teams who need to ship serious AI-native software without accumulating technical debt at speed.
The shift this represents
This framework is not about AI writing code for us. It is about AI as a force multiplier for practices we already know work, practices most teams skip because they are tedious: planning before coding, reviewing every commit, documenting as we go, keeping quality gates that actually block.
The AI is not replacing developers. It is removing the friction that keeps us from doing what we know we should. Get the framework right, and everything else follows.
For engineers: a quick-start checklist
If you want to adopt this pattern on your own project:
- Create a CLAUDE.md with your non-negotiable gates
- Define a code-validator that runs before every commit
- Define a github-commit-agent for consistent commit formatting
- Set up a .claude/checkpoints/ directory and write one per session
- Build a skill guide that maps the repo
- Configure MCP servers for your platform (Swift, Python, your language of choice) and for persistent memory
- Document core architecture before the third feature lands
- Write your first RFC for a medium feature
- Run the loop for two weeks, then iterate based on friction
Start with structure. Add sophistication as patterns emerge.
MetaScope is a professional metadata editor for macOS, available on the Mac App Store. Want to ship serious software with the same framework? Explore Armature-powered engagements.