
Karpathy's LLM Wiki: A Better Pattern Than RAG? 🧠

10 min read
Gergely Sipos
Frontend Architect

Andrej Karpathy, the same person who coined the term "vibe coding", recently published a gist describing a pattern he calls the "LLM Wiki". The core tension it addresses is simple: RAG re-derives knowledge from scratch on every question. What if the LLM built something persistent instead: a structured, interlinked wiki that compounds over time? It's a deceptively simple idea with some genuinely interesting implications.

What Is the LLM Wiki Pattern?

The Problem with RAG

The standard RAG (Retrieval-Augmented Generation) workflow is familiar: upload documents, chunk them, embed them into a vector store, retrieve relevant chunks at query time, and generate an answer. NotebookLM, ChatGPT file uploads, and most enterprise RAG pipelines work this way.

It works. But Karpathy identifies a key limitation: the LLM is "rediscovering knowledge from scratch on every question." Ask a subtle question that requires synthesizing five documents, and the model has to find and piece together the relevant fragments every single time. Nothing accumulates. There's no memory of the synthesis it already did.

To be fair, RAG is perfectly adequate for many use cases: searching large, mostly-static document sets, answering factual questions with source citations, powering customer-facing Q&A. The issue isn't that RAG is bad. It's that it doesn't build anything.

The Wiki as a Compiled Artifact

The LLM Wiki pattern flips the workflow. Instead of retrieving raw chunks at query time, the LLM incrementally builds and maintains a persistent wiki: a structured collection of interlinked markdown files that sits between you and the raw sources.

When you add a new source, the LLM doesn't just index it. It reads the document, extracts key information, and integrates it into the existing wiki: updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and kept current, not re-derived on every query.
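
To make the ingest step concrete, here's a minimal sketch in Python. The `complete()` helper is a placeholder for whatever LLM you drive (Claude Code, an API call, a local model), the "FILE:" output convention is an assumption of this sketch rather than anything from the gist, and sources are assumed to be plain text.

```python
from pathlib import Path

def complete(prompt: str) -> str:
    """Placeholder for your LLM call (Claude, GPT, a local model, etc.)."""
    raise NotImplementedError

def ingest(source_path: str, wiki_dir: str = "wiki", schema_path: str = "CLAUDE.md") -> None:
    """Compile one raw (plain-text) source into the persistent wiki."""
    wiki = Path(wiki_dir)
    schema = Path(schema_path).read_text()
    source = Path(source_path).read_text()
    index_file = wiki / "index.md"
    index = index_file.read_text() if index_file.exists() else "(empty wiki)"

    # Give the model the schema, the current index, and the new source, and
    # ask for the full content of every page it wants to create or update.
    prompt = (
        f"{schema}\n\nCurrent wiki index:\n{index}\n\n"
        f"New source ({source_path}):\n{source}\n\n"
        "Integrate this source into the wiki. For each page you create or update, "
        "output 'FILE: <relative path>' on its own line followed by the page's full markdown."
    )

    # Write each returned page back to disk; the wiki is the compiled artifact.
    for block in complete(prompt).split("FILE: ")[1:]:
        rel_path, _, content = block.partition("\n")
        page = wiki / rel_path.strip()
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(content.strip() + "\n")
```

A real implementation would also append to log.md and guard against paths escaping wiki/; this only shows the shape of the loop.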

Karpathy frames it neatly: "Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase."

The key difference from RAG is that the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. Every source you add makes the whole structure richer.

The Three-Layer Architecture

The pattern has a clean three-layer architecture:

| Layer | What it is | Who owns it |
| --- | --- | --- |
| Raw sources | Curated documents, articles, data files (immutable) | You |
| The wiki | LLM-generated markdown: summaries, entity pages, cross-references | The LLM |
| The schema | Instruction file (e.g. CLAUDE.md, AGENTS.md) defining structure and workflows | You + LLM |

The schema layer is the most interesting from a practical standpoint. It's an instruction file that tells the LLM how the wiki should be structured: what page types exist, how to name files, when to cross-reference, how to handle contradictions. If you've used CLAUDE.md or .github/copilot-instructions.md to give an agent project context, you already know this pattern; it maps directly to the workspace instruction files covered in our Prompt Engineering docs.
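
To give a feel for how small the schema can start, here's a sketch of a bootstrap script that writes one possible CLAUDE.md. The page types, naming rules, and contradiction-handling convention below are assumptions to iterate on, not something prescribed by the gist.

```python
from pathlib import Path

# A deliberately small starting schema; the page types, naming rules, and
# linking style below are illustrative assumptions, not a prescribed format.
MINIMAL_SCHEMA = """\
# Wiki schema

## Page types
- `overview.md`: one-page synthesis of the whole topic, updated on every ingest.
- `topics/<kebab-case-name>.md`: one page per concept, entity, or open question.
- `log.md`: append-only list of ingested sources with dates.

## Conventions
- Link related pages with [[wikilinks]].
- When a new source contradicts an existing claim, keep both and flag the
  conflict under a "Contradictions" heading on the affected page.
- Never edit files under `sources/`; they are immutable inputs.
"""

def bootstrap(root: str = "my-wiki") -> None:
    """Create the sources / wiki / schema layout described in this post."""
    base = Path(root)
    (base / "sources").mkdir(parents=True, exist_ok=True)
    (base / "wiki" / "topics").mkdir(parents=True, exist_ok=True)
    (base / "CLAUDE.md").write_text(MINIMAL_SCHEMA)

if __name__ == "__main__":
    bootstrap()
```

That's enough to point an agent at; the wiki pages themselves come from the first ingest.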

Operations

The pattern defines three core operations:

  • Ingest: Add a new source document. The LLM reads it, extracts key information, and integrates it into the existing wiki: creating new pages, updating existing ones, logging the addition.
  • Query: Ask questions against the compiled wiki. Because the knowledge is already synthesized, the LLM reads pre-built pages rather than searching raw chunks.
  • Lint: A periodic health check. The LLM scans the wiki for contradictions, stale information, orphan pages, and structural inconsistencies. Think of it as running a linter on your knowledge base (a minimal sketch of the structural checks follows this list).
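
Part of the lint pass genuinely needs an LLM (spotting contradictions and stale claims), but the structural checks are plain file inspection. Here's a minimal sketch of that deterministic half, assuming Obsidian-style [[wikilinks]]:

```python
import re
from pathlib import Path

# Capture the target of [[Target]], [[Target|alias]], or [[Target#Section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki_dir: str = "wiki") -> None:
    """Report broken links and orphan pages; contradiction checks still need the LLM."""
    pages = {p.stem: p for p in Path(wiki_dir).rglob("*.md")}
    linked = set()

    for page in pages.values():
        for target in WIKILINK.findall(page.read_text()):
            target = target.strip()
            linked.add(target)
            if target not in pages:
                print(f"broken link: [[{target}]] in {page}")

    for name, page in pages.items():
        if name not in linked and name not in {"index", "log", "overview"}:
            print(f"orphan page (nothing links to it): {page}")
```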
note

Karpathy's gist is intentionally abstract: it describes the pattern and the thinking behind it, not a specific implementation. The "Getting Started" section below offers a practical starting point, but expect to iterate on the schema and workflow for your own use case.

Why This Is Interesting

Beyond the mechanics, there are a few reasons this pattern is worth thinking about.

Knowledge compounds. The analogy to compiled vs. interpreted code is apt. RAG "interprets" knowledge at query time: it works, but you're paying the synthesis cost on every question. The LLM Wiki "compiles" knowledge at ingest time. The upfront cost is higher, but queries against pre-synthesized material are faster and richer.

It plays to LLM strengths. LLMs are excellent at summarizing, paraphrasing, and cross-referencing, which are exactly the operations the wiki requires. They're weaker at precise retrieval from large corpora, which is exactly what RAG demands. The LLM Wiki moves the workload from a weakness to a strength.

It's an agentic workflow. Ingest, query, and lint are agent tasks. The LLM reads files, decides what to update, writes markdown, and maintains structural integrity across the wiki. This maps directly to the agent patterns covered in our AI Coding Agents docs; the wiki is just a different substrate from source code.

Historical resonance. Karpathy references Vannevar Bush's Memex from 1945, a hypothetical device for storing and cross-referencing all the documents a person reads. Eighty years later, the combination of LLMs and markdown gets us surprisingly close to what Bush described.

Pros and Cons

What Works Well

  • Eliminates repetitive synthesis. The LLM does the hard work of connecting ideas once, not on every query.
  • Progressive enrichment. Each new source improves the whole wiki, not just its own entry. Cross-references emerge organically.
  • Human-readable output. It's plain markdown. You can browse it in Obsidian, VS Code, or GitHub. No opaque vector store; everything is visible.
  • Transparent reasoning. The wiki structure itself is an artifact you can audit. You can see how the LLM organized and connected information, not just the final answer.
  • Cross-referencing is a superpower. LLMs are genuinely good at spotting connections across documents, the kind of tedious linking work humans rarely bother to do manually.

Challenges and Limitations

  • Hallucination risk during integration. This is the biggest concern. When an LLM hallucinates during RAG, the error is ephemeral: it appears in one answer and disappears. When an LLM hallucinates during wiki integration, the error gets baked in. It becomes part of the knowledge base and may influence future synthesis.
caution

Hallucinations in the wiki are persistent, not ephemeral. A factual error introduced during ingest can propagate through cross-references and compound over time. Periodic review of LLM-generated wiki pages is essential: treat them like code that needs review, not ground truth.

  • Cost and latency on ingest. Processing a source through the wiki is more expensive than chunking and embedding it. The LLM needs to read the full document, understand the existing wiki, and write updates. For large sources, this can be slow and token-heavy.
  • Schema design is non-trivial. The quality of the wiki depends heavily on the instruction file. A vague schema produces a messy wiki. Getting the page types, naming conventions, and cross-reference rules right takes iteration.
  • Scaling questions. What happens at thousands of pages? The LLM's context window becomes a bottleneck: it can't read the entire wiki to decide where new information belongs. Ironically, at scale, you might need RAG on top of the wiki.
  • Source attribution. Tracing a specific claim in the wiki back to its original source is harder than in RAG, where retrieved chunks carry direct provenance. The wiki's synthesis step obscures the connection.

Practical Applications

Personal Knowledge Management

This is the most natural starting point. If you're actively reading, taking course notes, or following a topic over time, the LLM Wiki gives you a way to build a cumulative knowledge base without doing the filing yourself. Karpathy's example of building a "fan wiki while reading a book", with pages for characters, themes, and plot threads, is a good illustration of the pattern at a small scale.

The tool stack is straightforward: an LLM agent (Claude Code, OpenAI Codex, OpenCode, or similar), a folder of markdown files, and Obsidian for browsing.

Team and Project Knowledge Bases

This is the application most relevant to our team. Imagine an internal wiki that's continuously fed by Slack threads, meeting transcripts, and project documents, where the LLM maintains topic pages, decision logs, and cross-references automatically. As Karpathy puts it: "The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping." That bookkeeping (the filing, cross-referencing, and structural maintenance that makes internal wikis decay) would be handled by the agent.

Could this approach work for our own docs? It's speculative, but worth experimenting with, especially for knowledge that's currently scattered across Slack threads and Google Docs.

Research and Due Diligence

For deep research over weeks or months (competitive analysis, technology evaluation, due diligence), the LLM Wiki offers an "evolving thesis" that gets sharper with every source you add. Instead of a pile of bookmarks and scattered notes, you get a structured synthesis that reflects everything you've read so far.

Getting Started

If you want to try the pattern, here's a minimal starting point:

  1. Pick a topic you're actively accumulating knowledge about
  2. Choose your LLM agent (Claude Code, OpenAI Codex, or similar)
  3. Create a folder structure: sources/, wiki/, and a schema file
  4. Write a minimal schema: page types, naming conventions, cross-reference style
  5. Drop in your first few sources and ask the agent to ingest them
  6. Open the wiki in Obsidian and iterate on the structure
  7. Read the full gist for detailed guidance

Here's what the folder structure looks like:

Folder structure
my-wiki/
├── sources/             # Raw, immutable source documents
│   ├── paper-1.pdf
│   ├── article-2.md
│   └── notes-3.txt
├── wiki/                # LLM-generated markdown pages
│   ├── index.md
│   ├── log.md
│   ├── overview.md
│   └── topics/
│       ├── concept-a.md
│       └── entity-b.md
└── CLAUDE.md            # Schema / instruction file
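
Once a few sources are ingested, a query is mostly a matter of loading the right pre-built pages into the prompt. A minimal sketch, reusing the placeholder `complete()` helper from the ingest example and naively loading every topic page (fine at small scale; the scaling caveat above applies):

```python
from pathlib import Path

def complete(prompt: str) -> str:
    """Placeholder for your LLM call, as in the ingest sketch."""
    raise NotImplementedError

def query(question: str, wiki_dir: str = "wiki") -> str:
    """Answer a question from the compiled wiki instead of raw source chunks."""
    wiki = Path(wiki_dir)
    # Naive context assembly: the overview plus every topic page. At larger
    # scale you would select pages via the index (or RAG over the wiki itself).
    pages = [wiki / "overview.md", *sorted((wiki / "topics").glob("*.md"))]
    context = "\n\n---\n\n".join(
        f"## {p.name}\n{p.read_text()}" for p in pages if p.exists()
    )
    return complete(f"Answer using only this wiki:\n\n{context}\n\nQuestion: {question}")
```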

RAG vs. LLM Wiki: When to Use Which

Here's a side-by-side comparison to help you think about which approach fits your situation:

| Dimension | RAG | LLM Wiki |
| --- | --- | --- |
| Knowledge persistence | None (re-derived per query) | Persistent, compounding |
| Ingest cost | Low (chunk + embed) | High (read + synthesize + write) |
| Query speed | Fast retrieval + generation | Fast (read pre-compiled pages) |
| Synthesis quality | Depends on retrieval quality | Pre-synthesized; richer over time |
| Hallucination risk | Per-query (ephemeral) | Baked into wiki (persistent) |
| Scale | Handles large corpora well | Context window limits on ingest |
| Transparency | Opaque retrieval decisions | Visible, browsable wiki |
| Best for | Large, rarely-updated doc sets | Evolving, deeply-connected topics |

The key takeaway: they're complementary, not competing. RAG is the right choice for large, mostly-static document collections where you need fast lookup with source citations. The LLM Wiki is better suited for actively building understanding over time, where you want the knowledge to compound rather than be re-derived.

At scale, you likely need both. A large wiki itself becomes a corpus that benefits from RAG-style retrieval to help the LLM navigate it efficiently.
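
As a sketch of what that hybrid could look like, here's a minimal embedding-based page selector. The OpenAI embeddings endpoint is used purely as an example (any embedding model works), and the selected pages would then feed the query step shown earlier.

```python
from pathlib import Path

import numpy as np
from openai import OpenAI  # example only; any embedding model works

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

def top_pages(question: str, wiki_dir: str = "wiki", k: int = 5) -> list[Path]:
    """Pick the k wiki pages most relevant to the question, to keep prompts small."""
    pages = sorted(Path(wiki_dir).rglob("*.md"))
    vectors = embed([p.read_text() for p in pages])
    q = embed([question])[0]
    # Cosine similarity between the question and each page.
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    return [pages[i] for i in np.argsort(scores)[::-1][:k]]
```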

Aliz Stack Connection

This pattern connects directly to our AI docs: the schema layer is a workspace instruction file, the operations are agent tasks, and the same review discipline that applies to LLM-generated code applies to LLM-generated wiki pages. If you've read our AI-Assisted Development section, the concepts will be familiar; the wiki is just a different substrate.

tip

If you want to try this pattern, start with our Prompt Engineering docs for writing effective instruction files; the schema layer is where most of the quality comes from.

Further Reading