AIOS Architecture: Layers, Skills, and Memory in an AI Operating System

TL;DR: An AIOS separates concerns into five layers — Brain, Skills, Learnings, Context, and Services — so the AI loads only what it needs, remembers what worked, and picks up where it left off. This article covers how each layer works, what goes in it, and why the separation matters.
This is part of the AIOS guide, which covers what an AIOS is and how to build one. This article goes deeper on the architecture — the specific design decisions that make a folder of markdown files behave like an operating system.
Why Layers Matter
TL;DR: Without layers, every session dumps the entire system into the AI’s context window. That’s expensive, slow, and degrades output quality.
An AI model has a finite context window. Everything it reads — instructions, schema definitions, brand guidelines, conversation history — competes for space in that window. A system that loads everything on every session wastes context on irrelevant information and crowds out the actual work.
Layers solve this by separating concerns. The Brain layer loads every session because it defines how the system works. Skills load on demand because you only need the Airtable schema when you’re writing to Airtable. Learnings load per-command because /write doesn’t need to know what /signals learned last week.
The result: a system that stays lean in context but deep in capability. The AI always knows what it can do. It only loads the details when it needs them.
Layer 1: Brain — The Instruction Set
TL;DR: The Brain layer is a single file (CLAUDE.md) plus shared rules that the AI reads at the start of every session. It defines what the system does and how it behaves.
CLAUDE.md is the kernel. It contains:
- System purpose — one paragraph on what this AIOS does
- Architecture overview — folder structure, available commands, how layers connect
- Command reference — what each command does, in a table
- Content types — what content the system produces and where it publishes
- Status flow — how content moves from Draft to Published
- Table references — Airtable base and table IDs for quick access
- Conventions — date formats, timezone, model selection, naming patterns
Rules live in .claude/rules/ as separate files. They enforce specific behaviors across all commands:
.claude/rules/
├── airtable-writes.md # Record creation order, batch limits
└── content-creation.md # Brand loading sequence, quality requirements
Rules are the guardrails. “Always create Sources before Content Plans” is a rule. “Load brand profile before writing any content” is a rule. Without them, the AI makes reasonable but inconsistent decisions. With them, the system behaves predictably.
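A rule file is just a short imperative checklist. A sketch of what .claude/rules/airtable-writes.md might contain, based on the behaviors this article names (the exact wording and batch limit are illustrative):

```markdown
# Airtable Writes

- Create Sources before Content Plans, and Content Plans before Content records.
- Batch record writes in small groups; stay within the API's per-request record limit.
```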
The Brain layer should be concise. Every token in CLAUDE.md loads on every session. If a piece of information is only relevant to one command, it belongs in a skill, not the Brain. The entire system — Brain, skills, rules, and all — lives in a single project directory. For more on why that works, see App-in-a-Folder: The Simplest Way to Build an AI System.
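Put together, a skeletal CLAUDE.md might look like this (commands and sections are drawn from this article; the IDs and details are placeholders, not a real configuration):

```markdown
# AIOS — Content Operations

## Purpose
One paragraph: what this system produces, for whom, and on which platforms.

## Commands
| Command | What it does |
|---|---|
| /plan | Create a content plan in Airtable |
| /write | Draft content from an approved plan |
| /status | Report pipeline state |
| /publish | Push approved content to platforms |
| /wrap | Close the session and update learnings |

## Status Flow
Draft → In Review → Approved → Published

## Conventions
- Dates: YYYY-MM-DD; timezone: UTC (illustrative)
- Airtable base: appXXXXXXXXXXXXXX (placeholder)
```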
Layer 2: Skills — Context on Demand
TL;DR: Skills are context containers that load only when relevant. They use progressive disclosure — small frontmatter loads first, full content loads only when matched.
Skills hold specialized knowledge the system needs sometimes but not always. The Airtable schema. Content type specifications. Service connection details. Brand context loading instructions.
The key pattern is progressive disclosure. Each skill has two parts:
- Frontmatter (~100 tokens) — name, description, when to load
- Body (hundreds to thousands of tokens) — the actual knowledge
The AI reads all frontmatter at session start. When a task matches a skill’s description, the full body loads. A /write command triggers the content-types skill. A /plan command triggers the airtable-schema skill. A /status command triggers neither.
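In practice, the frontmatter is a small YAML block at the top of SKILL.md. A sketch (field values are illustrative):

```markdown
---
name: airtable-schema
description: Table IDs, field definitions, and relationships. Load when reading or writing Airtable records.
---

# Airtable Schema

The full table and field documentation follows here — hundreds of tokens
that load only when the task matches the description above.
```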
.claude/skills/
├── airtable-schema/ # Table IDs, field definitions, relationships
│ ├── SKILL.md # Frontmatter + full schema
│ └── references/ # Deep reference files
├── brand-context/ # Brand loading instructions
├── content-types/ # Specs for each content format
└── services-context/ # MCP server details, API parameters
Skills can also reference external files. The brand-context skill doesn’t contain the brand profile — it tells the AI to read brand/profile.md and the relevant platform voice guide. This keeps the skill itself small while pointing to rich context that lives elsewhere in the project.
The discipline is simple: if the information is needed every session, it goes in the Brain. If it’s needed for specific tasks, it goes in a skill. This distinction keeps context budgets manageable as the system grows.
Layer 3: Learnings — The Feedback Loop
TL;DR: Each command has its own learnings file. The command reads its learnings before running, so past feedback shapes future behavior without changing the model.
Learnings are the simplest layer and arguably the most important. Each command gets its own file in learnings/. The file has three sections: What Works, What Doesn’t Work, and Do Differently.
learnings/
├── plan.md # Content strategy learnings
├── write.md # Content production learnings
├── status.md # Pipeline status learnings
├── publish.md # Publishing learnings
└── signals.md # DailySignals learnings
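Each file follows the same three-section shape. A skeleton for learnings/write.md, with illustrative entries of the kind the real files accumulate:

```markdown
# /write Learnings

## What Works
- Full brand stack loaded before drafting: profile + platform voice guide.

## What Doesn't Work
- Horizontal rules (---) in content; frontend CSS already handles section spacing.

## Do Differently
- Review one finished piece against the brand profile before drafting the next.
```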
Here’s a real entry from learnings/write.md:
> First pillar page written — “AIOS: What an AI Operating System Is and How to Build One” (~3,800 words). Approved as-is on first review. Full brand stack loaded: profile + website voice guide + website playbook.
And another:
> Never include --- (horizontal rules) in content. Frontend CSS handles section spacing. Horizontal rules create visual noise.
The first entry reinforces a pattern that worked. The second prevents a mistake from repeating. Both load before the next /write session runs.
This isn’t fine-tuning. The model doesn’t change. What changes are the instructions the model reads before acting. It’s the difference between training someone and giving them better notes.
The feedback loop closes through /wrap — a session wrap-up command that combines system self-diagnosis with user feedback to update the relevant learnings files. Over time, the learnings accumulate into a practical operations manual written by the system about itself.
Layer 4: Context — Session Continuity
TL;DR: Two files track what’s happening across sessions — active-work.md for current state and session-log.md for history. This lets the system pick up where it left off.
Without context, every session starts from zero. The AI doesn’t know what was published yesterday, what’s in review, or what the user decided to skip. Context fixes this with two files:
context/active-work.md — the current state. Pipeline snapshot, published content, platform status, next priorities. Updated at the end of each session by /status or /wrap.
context/session-log.md — the history. A brief record of each session: date, what happened, what changed. Not a transcript — a summary.
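A compact active-work.md might look like this (sections and figures are illustrative):

```markdown
# Active Work

## Pipeline
- 73 pieces across 3 content plans
- 6 in review, 2 approved and ready to publish

## Next Priorities
- Publish the two approved pieces
- Draft the next cluster article
```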
When /status runs at the start of a session, it reads active-work.md and reports the current state. The user sees immediately: 73 pieces across 3 plans, 6 in review, 2 approved, ready to publish. No re-explaining. No “where were we?”
Context files should be factual and compact. They’re loaded frequently, so bloated context files waste the same budget that the layer system is designed to protect.
Layer 5: Services — External Connections
TL;DR: MCP servers connect the AIOS to external platforms. Each server exposes tools the AI calls by name — no custom API code required.
The Services layer is where the AIOS touches the outside world. Each external platform gets an MCP server that wraps its API and exposes named tools. For the full deep dive on how this works, see MCP Servers: How AIOS Connects to External Services.
| Service | What the AI Can Do |
|---|---|
| Airtable | Read/write records, query pipeline status, manage content metadata |
| WordPress | Create posts and pages, upload media, update published content |
| Blotato | Schedule and publish to Twitter, LinkedIn, TikTok, YouTube |
| DataForSEO | Pull keyword data, SERP analysis, backlink profiles, AI visibility checks |
| HeyGen | Generate avatar videos from scripts, check render status, download files |
The AI doesn’t manage HTTP requests or parse JSON responses. It calls mcp__airtable__list_records with a filter formula and gets structured data back. It calls mcp__blotato__blotato_create_post with content and a platform account ID and the post gets scheduled.
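A tool call is just a name plus structured arguments. Conceptually, the Airtable query above looks something like this (the IDs are placeholders, and the exact argument shape depends on the server implementation):

```json
{
  "tool": "mcp__airtable__list_records",
  "arguments": {
    "baseId": "appXXXXXXXXXXXXXX",
    "tableId": "tblContent",
    "filterByFormula": "{Status} = 'In Review'"
  }
}
```

The MCP server turns this into the HTTP request, parses the response, and hands structured records back to the model.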
MCP servers are interchangeable. Swap WordPress for Ghost — as long as the new MCP server exposes similar tools, the commands adapt. The AI references tool names, not implementation details.
This matters for resilience too. If a service is down or not configured, the rest of the system still works. The /status command checks service health and reports what’s available. Missing a service reduces capability but doesn’t break the system. This is a key difference from traditional automation, where a broken integration breaks the whole pipeline.
How the Layers Interact
TL;DR: Layers flow downward — Brain governs everything, Skills add depth, Learnings refine behavior, Context provides continuity, Services execute.
A concrete example: the user runs /write to produce a cluster article.
- Brain — the AI reads CLAUDE.md, knows the command exists, loads the content creation rule
- Skills — content-types and brand-context skills load, providing article specs and brand voice
- Learnings — learnings/write.md loads, telling the AI to load the full brand stack and avoid horizontal rules
- Context — active-work.md confirms the article’s status and its parent pillar
- Services — Airtable MCP fetches the article shell, then saves the finished draft
Each layer adds information without duplicating what the others provide. The Brain says “load brand before writing.” The Skills say how. The Learnings say what worked last time. The Context says what’s in the pipeline. The Services execute the read and write.
Claude Code is the orchestrator that ties this together — reading each layer, reasoning about the task, and calling the right tools in the right order.
That’s the architecture. Five layers, clear boundaries, minimal overlap. The system scales by adding files, not by adding complexity.
For the complete guide to AIOS — what it is, when you need one, and how to build yours — see AIOS: What an AI Operating System Is and How to Build One.
Written by Wayne Ergle