Design Principles

How I do AI

I keep watching smart people build elaborate agent systems that collapse the second they touch real work. A multi-step planner. A vector store. A retrieval-augmented synthesizer. An orchestrator. A memory layer. Eight microservices behind a custom protocol. By month three they're back to pasting things into ChatGPT.

The bug is not the LLM. The bug is the architecture they wrapped around it.

This repository is the architecture I actually use. The pieces are boring on purpose. But they work on a Tuesday, they work on a flight, and they work six months from now when I've forgotten everything about how I built them. That is the bar.

The Thesis

Small, opinionated, composable pieces beat one big agent. Always. Every time.

To keep operations light, I run three always-on agents (such as the Beast agent running on the Hermes VPS). These agents do not run monolithic planning loops; they use OpenClaw or Hermes strictly as a thin harness and message-transport layer. For writing code, modifying files, and refactoring scripts, I pair-program directly with Codex Gemini. This keeps the always-on monitoring loop separate from active code authoring.

A big agent — any single program that tries to plan, retrieve, decide, write, and ship — is a load-bearing wall in a building you are still designing. You cannot move it without bringing the roof down. You cannot see inside it. You cannot reason about why it did what it did three weeks ago.

A bunch of small scripts that each do one thing, log what they did, and hand off via plain files? You can move those. You can read them. You can rebuild them in an afternoon if you have to.
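As a rough illustration of that shape, here is a minimal sketch of two steps that each do one thing, append a line to a log, and hand off through a plain file. The file names (raw.json, kept.json, run.log), the function names, and the sample items are all hypothetical; they are not taken from this repository.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log(log_path: Path, step: str, message: str) -> None:
    """Append one tab-separated line so the run can be grepped later."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}\t{step}\t{message}\n")


def fetch(workdir: Path) -> Path:
    """Step 1: write raw items to a file. The items are placeholder data."""
    out = workdir / "raw.json"
    items = [{"title": "alpha", "score": 0.9}, {"title": "beta", "score": 0.2}]
    out.write_text(json.dumps(items, indent=2), encoding="utf-8")
    log(workdir / "run.log", "fetch", f"wrote {len(items)} items to {out.name}")
    return out


def keep_high_scores(raw: Path, threshold: float = 0.5) -> Path:
    """Step 2: read the previous step's file and write a new one next to it."""
    items = json.loads(raw.read_text(encoding="utf-8"))
    kept = [item for item in items if item["score"] >= threshold]
    out = raw.with_name("kept.json")
    out.write_text(json.dumps(kept, indent=2), encoding="utf-8")
    log(raw.with_name("run.log"), "filter", f"kept {len(kept)} of {len(items)}")
    return out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        kept_path = keep_high_scores(fetch(Path(tmp)))
        print(kept_path.read_text(encoding="utf-8"))
```

Because the only contract between the steps is a file on disk, either step can be rerun, replaced, or inspected on its own.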

Principles I actually follow

Principle 1

Memory belongs in plain text, not in your context window

Real memory lives in files you can grep or query. For me, that is events.db (one SQLite events table) and ~/.claude/projects/<project>/memory/MEMORY.md. If I delete my context window, the brain loses nothing.

Context windows are not memory. They are a scratchpad. They evaporate when the session ends, they get compacted, and they lie to you.
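A minimal sketch of what a single events table could look like, using Python's standard sqlite3 module. The column names (ts, source, body) and the helper names are assumptions made for illustration; the actual schema of events.db is not described here.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one append-only table, nothing else.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id     INTEGER PRIMARY KEY,
    ts     TEXT NOT NULL,
    source TEXT NOT NULL,
    body   TEXT NOT NULL
)
"""


def open_events(path: str = "events.db") -> sqlite3.Connection:
    """Open the database file, creating the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, source: str, body: str) -> int:
    """Append one event and return its row id."""
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    cur = conn.execute(
        "INSERT INTO events (ts, source, body) VALUES (?, ?, ?)",
        (ts, source, body),
    )
    conn.commit()
    return cur.lastrowid


def recall(conn: sqlite3.Connection, needle: str, limit: int = 20) -> list[tuple]:
    """Return the newest events whose body contains the substring."""
    rows = conn.execute(
        "SELECT ts, source, body FROM events "
        "WHERE body LIKE ? ORDER BY id DESC LIMIT ?",
        (f"%{needle}%", limit),
    )
    return rows.fetchall()
```

A new session can call recall() at startup to pick up where the last one stopped, which is the point: nothing important depends on what was in the previous context window.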

Principle 2

One repository per fully-featured thing

Keep tools modular. If a stranger can clone it, run it, and get value in under an hour without the rest of my setup, it earns its own repository. Hub and spokes linked by references.

Monolithic conglomerates get messy fast. When the marketing harness needs to ship, the brain refactor blocks it.

Principle 3

Skills, hooks, and slash commands beat massive system prompts

Use the Claude Code primitives instead: skills (markdown loaded on demand), hooks (deterministic pre/post commands), and slash commands (named entry points). Each one is modular and debuggable.

Long system prompts are monolithic load-bearing walls. Every new behavior fights every other behavior, which makes it nearly impossible to debug the one that failed.
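To make "loaded on demand" concrete, here is a small sketch of the idea: each behavior lives in its own markdown file, and only the ones a task asks for are read. This illustrates the principle only; it is not how Claude Code itself discovers or loads skills, and the flat directory of .md files and the function names are hypothetical.

```python
import tempfile
from pathlib import Path


def list_skills(skills_dir: Path) -> list[str]:
    """Names of the available skills, one per markdown file."""
    return sorted(path.stem for path in skills_dir.glob("*.md"))


def load_skill(skills_dir: Path, name: str) -> str:
    """Read a single skill file, failing loudly if it does not exist."""
    path = skills_dir / f"{name}.md"
    if not path.is_file():
        raise KeyError(f"no skill named {name!r}; available: {list_skills(skills_dir)}")
    return path.read_text(encoding="utf-8")


def build_prompt(base: str, skills_dir: Path, wanted: list[str]) -> str:
    """Join a short base prompt with only the skills this task needs."""
    parts = [base] + [load_skill(skills_dir, name) for name in wanted]
    return "\n\n".join(parts)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        skills = Path(tmp)
        (skills / "review.md").write_text("Review checklist", encoding="utf-8")
        (skills / "deploy.md").write_text("Deploy steps", encoding="utf-8")
        print(build_prompt("You are working in this repo.", skills, ["review"]))
```

When a behavior misfires, there is one small file to read and fix, and the other files are untouched.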

Principle 4

Your Claude finishes the setup

I ship structured markdown prompts. You paste one into your Claude. Your agent reads the files, asks what you are trying to do, and handles the local environment setup.

Fat installers and automation scripts break across environments in silent, unreproducible ways.

Principle 5

The brain ingests the world; agents act on what it noticed

Decouple ingest from execution. Build background scoring scripts (like daily paper filters) that drop data into the SQLite database. The runtime agent just reads the filtered records.

The anti-pattern is a single agent that watches everything and decides on the fly. It either starves for lack of feed data or bloats its context window.
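A minimal sketch of that split, assuming a hypothetical scored_items table and a deliberately crude keyword score. The ingest function stands in for the background job; the noticed function is the only thing the runtime agent would call.

```python
import sqlite3

# Hypothetical table for scored feed items; not the author's actual schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scored_items (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    score REAL NOT NULL
)
"""


def ingest(conn: sqlite3.Connection, titles: list[str], keywords: set[str]) -> None:
    """Background job: score every item once and store the result."""
    conn.execute(SCHEMA)
    for title in titles:
        words = title.lower().split()
        # Crude placeholder score: fraction of words that are keywords.
        score = sum(word in keywords for word in words) / max(len(words), 1)
        conn.execute(
            "INSERT INTO scored_items (title, score) VALUES (?, ?)",
            (title, score),
        )
    conn.commit()


def noticed(conn: sqlite3.Connection, min_score: float = 0.25, limit: int = 5) -> list[str]:
    """Runtime side: read only what already passed the filter."""
    rows = conn.execute(
        "SELECT title FROM scored_items WHERE score >= ? "
        "ORDER BY score DESC, id ASC LIMIT ?",
        (min_score, limit),
    )
    return [row[0] for row in rows]
```

The agent never sees the raw feed, so its context stays small, and the scoring logic can change without touching the agent.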

Principle 6

Subagents for parallel work, not for linear tasks

Use parallel subagents for isolated jobs (like generating three different layout drafts). If two tasks can run on different days without shared state, they are subagent-ready.

The anti-pattern is splitting linear, step-by-step logic across subagents. They lose track of each other, and you spend more time stitching their output together than coding.
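The "no shared state" test can be sketched with Python's standard concurrent.futures module. Here draft_layout is a hypothetical stand-in for one subagent call; it receives everything it needs as arguments and returns its result, so the three jobs cannot interfere with each other.

```python
from concurrent.futures import ThreadPoolExecutor


def draft_layout(style: str) -> str:
    """Stand-in for one subagent job. It reads and writes no shared state."""
    return f"[{style}] header / body / footer"


def run_parallel(styles: list[str]) -> dict[str, str]:
    """Run one independent job per style and collect the results by name."""
    with ThreadPoolExecutor(max_workers=max(len(styles), 1)) as pool:
        # Executor.map yields results in the same order as the inputs.
        results = pool.map(draft_layout, styles)
        return dict(zip(styles, results))


if __name__ == "__main__":
    for style, draft in run_parallel(["grid", "sidebar", "single-column"]).items():
        print(style, "->", draft)
```

If a job needed the output of another job to start, it would not fit this pattern; that is the signal to keep the work in one linear script instead.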