
Karpathy's LLM Wiki: A Better Pattern Than RAG? 🧠

10 min read
Gergely Sipos
Frontend Architect

Andrej Karpathy, the same person who coined the term "vibe coding", recently published a gist describing a pattern he calls the "LLM Wiki". The core tension it addresses is simple: RAG re-derives knowledge from scratch on every question. What if the LLM built something persistent instead: a structured, interlinked wiki that compounds over time? It's a deceptively simple idea with some genuinely interesting implications.

What Is the LLM Wiki Pattern?

The Problem with RAG

The standard RAG (Retrieval-Augmented Generation) workflow is familiar: upload documents, chunk them, embed them into a vector store, retrieve relevant chunks at query time, and generate an answer. NotebookLM, ChatGPT file uploads, and most enterprise RAG pipelines work this way.

It works. But Karpathy identifies a key limitation: the LLM is "rediscovering knowledge from scratch on every question." Ask a subtle question that requires synthesizing five documents, and the model has to find and piece together the relevant fragments every single time. Nothing accumulates. There's no memory of the synthesis it already did.

To be fair, RAG is perfectly adequate for many use cases: searching large, mostly-static document sets, answering factual questions with source citations, powering customer-facing Q&A. The issue isn't that RAG is bad. It's that it doesn't build anything.

The Wiki as a Compiled Artifact

The LLM Wiki pattern flips the workflow. Instead of retrieving raw chunks at query time, the LLM incrementally builds and maintains a persistent wiki: a structured collection of interlinked markdown files that sits between you and the raw sources.

When you add a new source, the LLM doesn't just index it. It reads the document, extracts key information, and integrates it into the existing wiki: updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and kept current, not re-derived on every query.
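
To make the ingest step concrete, here's a minimal sketch in Python. The `complete()` helper is a placeholder for whatever LLM you drive (Claude Code, an API call, a local model), the "FILE:" output convention is an assumption of this sketch rather than anything from the gist, and sources are assumed to be plain text.

```python
from pathlib import Path

def complete(prompt: str) -> str:
    """Placeholder for your LLM call (Claude, GPT, a local model, etc.)."""
    raise NotImplementedError

def ingest(source_path: str, wiki_dir: str = "wiki", schema_path: str = "CLAUDE.md") -> None:
    """Compile one raw (plain-text) source into the persistent wiki."""
    wiki = Path(wiki_dir)
    schema = Path(schema_path).read_text()
    source = Path(source_path).read_text()
    index_file = wiki / "index.md"
    index = index_file.read_text() if index_file.exists() else "(empty wiki)"

    # Give the model the schema, the current index, and the new source, and
    # ask for the full content of every page it wants to create or update.
    prompt = (
        f"{schema}\n\nCurrent wiki index:\n{index}\n\n"
        f"New source ({source_path}):\n{source}\n\n"
        "Integrate this source into the wiki. For each page you create or update, "
        "output 'FILE: <relative path>' on its own line followed by the page's full markdown."
    )

    # Write each returned page back to disk; the wiki is the compiled artifact.
    for block in complete(prompt).split("FILE: ")[1:]:
        rel_path, _, content = block.partition("\n")
        page = wiki / rel_path.strip()
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(content.strip() + "\n")
```

A real implementation would also append to log.md and guard against paths escaping wiki/; this only shows the shape of the loop.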

Karpathy frames it neatly: "Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase."

The key difference from RAG is that the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. Every source you add makes the whole structure richer.

The Three-Layer Architecture

The pattern has a clean three-layer architecture:

| Layer | What it is | Who owns it |
| --- | --- | --- |
| Raw sources | Curated documents, articles, data files (immutable) | You |
| The wiki | LLM-generated markdown: summaries, entity pages, cross-references | The LLM |
| The schema | Instruction file (e.g. CLAUDE.md, AGENTS.md) defining structure and workflows | You + LLM |

The schema layer is the most interesting from a practical standpoint. It's an instruction file that tells the LLM how the wiki should be structured: what page types exist, how to name files, when to cross-reference, how to handle contradictions. If you've used CLAUDE.md or .github/copilot-instructions.md to give an agent project context, you already know this pattern; it maps directly to the workspace instruction files covered in our Prompt Engineering docs.
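
To give a feel for how small the schema can start, here's a sketch of a bootstrap script that writes one possible CLAUDE.md. The page types, naming rules, and contradiction-handling convention below are assumptions to iterate on, not something prescribed by the gist.

```python
from pathlib import Path

# A deliberately small starting schema; the page types, naming rules, and
# linking style below are illustrative assumptions, not a prescribed format.
MINIMAL_SCHEMA = """\
# Wiki schema

## Page types
- `overview.md`: one-page synthesis of the whole topic, updated on every ingest.
- `topics/<kebab-case-name>.md`: one page per concept, entity, or open question.
- `log.md`: append-only list of ingested sources with dates.

## Conventions
- Link related pages with [[wikilinks]].
- When a new source contradicts an existing claim, keep both and flag the
  conflict under a "Contradictions" heading on the affected page.
- Never edit files under `sources/`; they are immutable inputs.
"""

def bootstrap(root: str = "my-wiki") -> None:
    """Create the sources / wiki / schema layout described in this post."""
    base = Path(root)
    (base / "sources").mkdir(parents=True, exist_ok=True)
    (base / "wiki" / "topics").mkdir(parents=True, exist_ok=True)
    (base / "CLAUDE.md").write_text(MINIMAL_SCHEMA)

if __name__ == "__main__":
    bootstrap()
```

That's enough to point an agent at; the wiki pages themselves come from the first ingest.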

Operations

The pattern defines three core operations:

  • Ingest: Add a new source document. The LLM reads it, extracts key information, and integrates it into the existing wiki: creating new pages, updating existing ones, logging the addition.
  • Query: Ask questions against the compiled wiki. Because the knowledge is already synthesized, the LLM reads pre-built pages rather than searching raw chunks.
  • Lint: A periodic health check. The LLM scans the wiki for contradictions, stale information, orphan pages, and structural inconsistencies. Think of it as running a linter on your knowledge base (a minimal sketch of the structural checks follows this list).
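
Part of the lint pass genuinely needs an LLM (spotting contradictions and stale claims), but the structural checks are plain file inspection. Here's a minimal sketch of that deterministic half, assuming Obsidian-style [[wikilinks]]:

```python
import re
from pathlib import Path

# Capture the target of [[Target]], [[Target|alias]], or [[Target#Section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki_dir: str = "wiki") -> None:
    """Report broken links and orphan pages; contradiction checks still need the LLM."""
    pages = {p.stem: p for p in Path(wiki_dir).rglob("*.md")}
    linked = set()

    for page in pages.values():
        for target in WIKILINK.findall(page.read_text()):
            target = target.strip()
            linked.add(target)
            if target not in pages:
                print(f"broken link: [[{target}]] in {page}")

    for name, page in pages.items():
        if name not in linked and name not in {"index", "log", "overview"}:
            print(f"orphan page (nothing links to it): {page}")
```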
note

Karpathy's gist is intentionally abstract: it describes the pattern and the thinking behind it, not a specific implementation. The "Getting Started" section below offers a practical starting point, but expect to iterate on the schema and workflow for your own use case.

Why This Is Interesting

Beyond the mechanics, there are a few reasons this pattern is worth thinking about.

Knowledge compounds. The analogy to compiled vs. interpreted code is apt. RAG "interprets" knowledge at query time: it works, but you're paying the synthesis cost on every question. The LLM Wiki "compiles" knowledge at ingest time. The upfront cost is higher, but queries against pre-synthesized material are faster and richer.

It plays to LLM strengths. LLMs are excellent at summarizing, paraphrasing, and cross-referencing, which are exactly the operations the wiki requires. They're weaker at precise retrieval from large corpora, which is exactly what RAG demands. The LLM Wiki moves the workload from a weakness to a strength.

It's an agentic workflow. Ingest, query, and lint are agent tasks. The LLM reads files, decides what to update, writes markdown, and maintains structural integrity across the wiki. This maps directly to the agent patterns covered in our AI Coding Agents docs; the wiki is just a different substrate from source code.

Historical resonance. Karpathy references Vannevar Bush's Memex from 1945, a hypothetical device for storing and cross-referencing all the documents a person reads. Eighty years later, the combination of LLMs and markdown gets us surprisingly close to what Bush described.

Pros and Cons

What Works Well

  • Eliminates repetitive synthesis. The LLM does the hard work of connecting ideas once, not on every query.
  • Progressive enrichment. Each new source improves the whole wiki, not just its own entry. Cross-references emerge organically.
  • Human-readable output. It's plain markdown. You can browse it in Obsidian, VS Code, or GitHub. No opaque vector store; everything is visible.
  • Transparent reasoning. The wiki structure itself is an artifact you can audit. You can see how the LLM organized and connected information, not just the final answer.
  • Cross-referencing is a superpower. LLMs are genuinely good at spotting connections across documents, the kind of tedious linking work humans rarely bother to do manually.

Challenges and Limitations

  • Hallucination risk during integration. This is the biggest concern. When an LLM hallucinates during RAG, the error is ephemeral: it appears in one answer and disappears. When an LLM hallucinates during wiki integration, the error gets baked in. It becomes part of the knowledge base and may influence future synthesis.
caution

Hallucinations in the wiki are persistent, not ephemeral. A factual error introduced during ingest can propagate through cross-references and compound over time. Periodic review of LLM-generated wiki pages is essential: treat them like code that needs review, not ground truth.

  • Cost and latency on ingest. Processing a source through the wiki is more expensive than chunking and embedding it. The LLM needs to read the full document, understand the existing wiki, and write updates. For large sources, this can be slow and token-heavy.
  • Schema design is non-trivial. The quality of the wiki depends heavily on the instruction file. A vague schema produces a messy wiki. Getting the page types, naming conventions, and cross-reference rules right takes iteration.
  • Scaling questions. What happens at thousands of pages? The LLM's context window becomes a bottleneck: it can't read the entire wiki to decide where new information belongs. Ironically, at scale, you might need RAG on top of the wiki.
  • Source attribution. Tracing a specific claim in the wiki back to its original source is harder than in RAG, where retrieved chunks carry direct provenance. The wiki's synthesis step obscures the connection.

Practical Applications

Personal Knowledge Management

This is the most natural starting point. If you're actively reading, taking course notes, or following a topic over time, the LLM Wiki gives you a way to build a cumulative knowledge base without doing the filing yourself. Karpathy's example of building a "fan wiki while reading a book", with pages for characters, themes, and plot threads, is a good illustration of the pattern at a small scale.

The tool stack is straightforward: an LLM agent (Claude Code, OpenAI Codex, OpenCode, or similar), a folder of markdown files, and Obsidian for browsing.

Team and Project Knowledge Bases

This is the application most relevant to our team. Imagine an internal wiki that's continuously fed by Slack threads, meeting transcripts, and project documents, where the LLM maintains topic pages, decision logs, and cross-references automatically. As Karpathy puts it: "The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping." That bookkeeping (the filing, cross-referencing, and structural maintenance that makes internal wikis decay) would be handled by the agent.

Could this approach work for our own docs? It's speculative, but worth experimenting with, especially for knowledge that's currently scattered across Slack threads and Google Docs.

Research and Due Diligence

For deep research over weeks or months (competitive analysis, technology evaluation, due diligence), the LLM Wiki offers an "evolving thesis" that gets sharper with every source you add. Instead of a pile of bookmarks and scattered notes, you get a structured synthesis that reflects everything you've read so far.

Getting Started

If you want to try the pattern, here's a minimal starting point:

  1. Pick a topic you're actively accumulating knowledge about
  2. Choose your LLM agent (Claude Code, OpenAI Codex, or similar)
  3. Create a folder structure: sources/, wiki/, and a schema file
  4. Write a minimal schema: page types, naming conventions, cross-reference style
  5. Drop in your first few sources and ask the agent to ingest them
  6. Open the wiki in Obsidian and iterate on the structure
  7. Read the full gist for detailed guidance

Here's what the folder structure looks like:

Folder structure
my-wiki/
├── sources/             # Raw, immutable source documents
│   ├── paper-1.pdf
│   ├── article-2.md
│   └── notes-3.txt
├── wiki/                # LLM-generated markdown pages
│   ├── index.md
│   ├── log.md
│   ├── overview.md
│   └── topics/
│       ├── concept-a.md
│       └── entity-b.md
└── CLAUDE.md            # Schema / instruction file
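
Once a few sources are ingested, a query is mostly a matter of loading the right pre-built pages into the prompt. A minimal sketch, reusing the placeholder `complete()` helper from the ingest example and naively loading every topic page (fine at small scale; the scaling caveat above applies):

```python
from pathlib import Path

def complete(prompt: str) -> str:
    """Placeholder for your LLM call, as in the ingest sketch."""
    raise NotImplementedError

def query(question: str, wiki_dir: str = "wiki") -> str:
    """Answer a question from the compiled wiki instead of raw source chunks."""
    wiki = Path(wiki_dir)
    # Naive context assembly: the overview plus every topic page. At larger
    # scale you would select pages via the index (or RAG over the wiki itself).
    pages = [wiki / "overview.md", *sorted((wiki / "topics").glob("*.md"))]
    context = "\n\n---\n\n".join(
        f"## {p.name}\n{p.read_text()}" for p in pages if p.exists()
    )
    return complete(f"Answer using only this wiki:\n\n{context}\n\nQuestion: {question}")
```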

RAG vs. LLM Wiki: When to Use Which

Here's a side-by-side comparison to help you think about which approach fits your situation:

| Dimension | RAG | LLM Wiki |
| --- | --- | --- |
| Knowledge persistence | None (re-derived per query) | Persistent, compounding |
| Ingest cost | Low (chunk + embed) | High (read + synthesize + write) |
| Query speed | Fast retrieval + generation | Fast (read pre-compiled pages) |
| Synthesis quality | Depends on retrieval quality | Pre-synthesized; richer over time |
| Hallucination risk | Per-query (ephemeral) | Baked into wiki (persistent) |
| Scale | Handles large corpora well | Context window limits on ingest |
| Transparency | Opaque retrieval decisions | Visible, browsable wiki |
| Best for | Large, rarely-updated doc sets | Evolving, deeply-connected topics |

The key takeaway: they're complementary, not competing. RAG is the right choice for large, mostly-static document collections where you need fast lookup with source citations. The LLM Wiki is better suited for actively building understanding over time, where you want the knowledge to compound rather than be re-derived.

At scale, you likely need both. A large wiki itself becomes a corpus that benefits from RAG-style retrieval to help the LLM navigate it efficiently.
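
As a sketch of what that hybrid could look like, here's a minimal embedding-based page selector. The OpenAI embeddings endpoint is used purely as an example (any embedding model works), and the selected pages would then feed the query step shown earlier.

```python
from pathlib import Path

import numpy as np
from openai import OpenAI  # example only; any embedding model works

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

def top_pages(question: str, wiki_dir: str = "wiki", k: int = 5) -> list[Path]:
    """Pick the k wiki pages most relevant to the question, to keep prompts small."""
    pages = sorted(Path(wiki_dir).rglob("*.md"))
    vectors = embed([p.read_text() for p in pages])
    q = embed([question])[0]
    # Cosine similarity between the question and each page.
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [pages[i] for i in np.argsort(scores)[::-1][:k]]
```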

Aliz Stack Connection

This pattern connects directly to our AI docs: the schema layer is a workspace instruction file, the operations are agent tasks, and the same review discipline that applies to LLM-generated code applies to LLM-generated wiki pages. If you've read our AI-Assisted Development section, the concepts will be familiar; the wiki is just a different substrate.

tip

If you want to try this pattern, start with our Prompt Engineering docs for writing effective instruction files; the schema layer is where most of the quality comes from.

Further Reading