The Search That Found What I Never Wrote
I searched my notes for “token budget strategies across multi-agent pipelines.” I never wrote about that. Not a single note, not a heading, not a tag. But there it was — a wiki page with a summary, three cross-referenced concepts, and links to two source documents I’d ingested months ago. The page existed because an LLM had read my sources, noticed the pattern across them, and created the entry.
That felt new. Not the usual AI demo trick where the output is impressive until you check the details.
A week later, Andrej Karpathy posted a gist describing exactly this pattern. It crossed five thousand stars and thousands of forks within days, with a handful of independent implementations appearing almost immediately. He didn’t invent the idea. He named a thing that was already happening in the tooling underground. That’s why it spread so fast.
Three Layers, Three Operations, One Folder
The LLM Wiki is not a chatbot wrapper around your notes. It’s an architecture with three distinct layers.
Raw Sources are immutable. PDFs, articles, transcripts, bookmarks — whatever you feed in. You curate these. The LLM never modifies them.
Wiki Pages are LLM-maintained markdown. Entities, concepts, synthesis pages — generated from sources, updated when new sources arrive, interlinked automatically. This is the "persistent, compounding artifact": Karpathy's phrase, and a precise one. Unlike a chat history that evaporates, the wiki grows. Each ingest makes every previous ingest more valuable, because the LLM can connect new material to existing pages.
Schema is the governance layer. A file that defines page types, frontmatter fields, naming conventions, and allowed operations. If you’ve ever written a CLAUDE.md or an AGENTS.md, you’ve already built a proto-schema. The schema co-evolves with the wiki — you adjust it as you learn what page types you actually need.
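As a concrete sketch, a schema file might look something like this. The page types, directory names, and field names below are illustrative assumptions, not anything taken from Karpathy's gist:

```markdown
# SCHEMA.md — hypothetical governance file (all names illustrative)

## Page types
- concepts/   one idea per page, kebab-case filenames, e.g. concepts/token-budgets.md
- entities/   a person, tool, or organization, e.g. entities/obsidian.md
- synthesis/  cross-source analysis; must cite at least two sources

## Required frontmatter (every page)
- type:     concept | entity | synthesis
- sources:  list of files under sources/ this page derives from
- updated:  date of the last ingest that touched this page

## Allowed operations
- ingest: may create or update wiki pages; never edits sources/
- lint:   read-only; reports violations, fixes nothing
```

The useful property is that the schema is itself a plain file the agent reads before every operation, so adjusting it is just editing text.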
Three operations run against this stack. Ingest takes a source and produces or updates wiki pages. Query answers questions using the wiki as context. Lint checks structural consistency — broken links, missing frontmatter, orphaned pages.
That’s the whole thing. A markdown folder, a schema, an LLM agent. The simplicity is the point. You don’t need a vector database, a graph layer, or a SaaS product. You need files.
Is it overkill for someone with 20 bookmarks and a good memory? Obviously. But the pattern scales in a way that human-maintained systems don’t. Ingest your tenth source and watch the wiki rewrite connections across pages you haven’t touched in weeks. That’s the compounding Karpathy is talking about. Each source doesn’t just add pages — it enriches existing ones. A traditional note-taking system degrades with scale. This one improves.
The Part Where You Outsource Your Thinking
Niklas Luhmann kept a Zettelkasten — a slip-box of index cards — for 40 years. 90,000 cards. He called it his “Kommunikationspartner,” a communication partner. Not a filing cabinet. A partner. The system talked back to him through unexpected juxtapositions. He’d follow a thread of numbered cards and stumble into a connection he hadn’t planned.
Luhmann published 70 books and over 400 academic papers. The productivity is staggering. And the standard interpretation is that the Zettelkasten was responsible — that the act of writing each card, choosing where to file it, deciding which cards to link, was itself a form of thinking.
This is true. But it conflates two kinds of cognitive work.
The first is organizational labor: filing, cross-referencing, indexing, maintaining consistency. This is bookkeeping. Important bookkeeping, but bookkeeping.
The second is compression labor: deciding what matters, noticing connections, forming synthesis — the act of taking a sprawling source and reducing it to its load-bearing ideas.
Luhmann’s index cards forced both simultaneously. You couldn’t file a card without deciding what it meant. You couldn’t link it without understanding how it related to existing cards. The organizational work and the compression work were fused.
But that fusion is a property of the medium — paper cards in wooden drawers — not a law of cognition. Luhmann didn’t need the filing to think. He needed the compression. The filing was the tax the medium imposed.
The wrong analogy here is arithmetic. “We delegated calculation to machines and our math skills atrophied.” Arithmetic is mechanical. Cognition isn’t.
The right analogy is language learning. Spell-checkers didn’t destroy anyone’s ability to write. You still internalize grammar, vocabulary, sentence structure. The spell-checker catches typos. But machine-translating every sentence you encounter — never reading the original, never wrestling with the foreign structure — means you never learn the language. You build a dependency instead of a capability.
The LLM Wiki sits on that boundary. If the LLM handles organizational labor while you do the compression — reading the generated pages, correcting them, synthesizing across them — you might understand more than if you’d spent that time filing. Clark and Chalmers argued in their 1998 paper “The Extended Mind” that tools which reliably store and retrieve information function as part of your cognitive system. Otto’s notebook, in their famous thought experiment, is part of Otto’s mind. The LLM Wiki is a better notebook.
But if you treat the wiki as a black box — ingest sources, query answers, never read the pages — you’re machine-translating. You’ve built a reference library, not knowledge.
Here’s the diagnostic: explain a topic from your wiki to a colleague without opening it. If you can, the understanding transferred. If you can’t, you know what you have.
The tension is permanent. The best I can offer is a practice: every week, I read five wiki pages I didn’t write. I correct, I argue with the summary, I delete sentences that sound right but aren’t. That’s compression labor. The LLM did the filing. I do the thinking. Whether that division holds at scale — with a thousand pages instead of a hundred — is an experiment I’m running on myself.
What I do know: the knowledge systems that survive are the ones where the human stays in the compression loop. The ones that fail are where the human stops reading what the machine produces.
Eighty Years of Failed Knowledge Systems
The timeline compresses into a single uncomfortable paragraph. Vannevar Bush imagined the Memex in 1945 — a desk-sized device with associative trails through microfilm. The trails were the breakthrough idea: knowledge linked by human association, not alphabetical order. It was never built. Luhmann started his Zettelkasten in 1951 and maintained it daily for four decades. It died with him — his students couldn’t use it because the organizational logic was in his head. Tiago Forte’s Building a Second Brain (2022) made personal knowledge management accessible but demanded weekly reviews and active maintenance. Most people stop after three months. Obsidian gave us digital Zettelkastens with backlinks, graph views, and plugins. The community calls it “note rot” — the slow decay of an unmaintained vault. My own wiki folder currently holds about fifty pages ingested from nine raw sources in a week. Small enough that rot hasn’t set in yet, big enough that I already catch myself skimming when I should be reading.
Every system in that line failed at the same point: maintenance. The cost of keeping knowledge current, cross-referenced, and consistent is brutal. Humans are bad at it. Not because we’re lazy. Because maintenance is organizational labor, and organizational labor doesn’t compound the way compression labor does. It’s a treadmill.
The LLM Wiki changes the cost curve. Not to zero — running an ingest costs $0.30-1.00 depending on source length (50-150k tokens per run). An actively maintained wiki runs $10-30 per month. That’s real money. But it’s money, not time. And time is the resource every previous system demanded in amounts that eventually broke compliance.
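The arithmetic behind those figures is worth making explicit. A back-of-envelope sketch, assuming illustrative API prices of $3 per million input tokens and $15 per million output tokens — your provider's real prices will differ:

```python
# Back-of-envelope ingest cost. The per-token prices are assumptions for
# illustration, not quoted pricing from any provider.

def ingest_cost(input_tokens, output_tokens,
                in_price_per_m=3.00, out_price_per_m=15.00):
    """Dollar cost of one run at the assumed prices."""
    return (input_tokens / 1e6) * in_price_per_m + \
           (output_tokens / 1e6) * out_price_per_m

# A long source: ~150k tokens read, ~20k tokens of wiki pages written.
per_run = ingest_cost(150_000, 20_000)

# Ten ingests a month plus a weekly full-wiki lint pass.
monthly = 10 * per_run + 4 * ingest_cost(100_000, 5_000)

print(f"per run: ${per_run:.2f}, monthly: ${monthly:.2f}")
```

At these assumed prices a heavy source lands under a dollar per ingest and an active month lands in the range the article quotes — dominated by input tokens, which is why source length is the variable that matters.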
The shift matters because it changes who can sustain a knowledge system. Luhmann was a tenured professor with decades of daily practice. Forte’s system requires the discipline of a weekly review habit. Most people aren’t Luhmann. Most people don’t maintain weekly reviews. A system that converts maintenance from a time cost to a dollar cost doesn’t lower the bar — it changes the shape of the bar entirely. The constraint moves from “do you have the discipline?” to “do you have the judgment to review what the LLM produces?” That’s a different skill, and one most knowledge workers already have.
The broader pattern is already visible. Anyone maintaining a CLAUDE.md is hand-writing a proto-wiki page — context that persists across sessions. Anyone using skills files is encoding domain knowledge into structured artifacts. Anyone writing specs before code is doing compression labor upfront and storing the result. The LLM Wiki formalizes what’s emerging organically across every tool that maintains persistent AI configuration.
Where It Falls Apart
Five real problems, none of them solved.
Context window limits. A wiki with 500 pages doesn’t fit in any current context window. The solution is retrieval — search the wiki, load relevant pages, operate on those. Tools like qmd handle this. At that scale, the wiki doesn’t replace RAG — it becomes the curation layer that makes RAG actually work. The retrieval happens over compiled knowledge instead of raw fragments.
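The curation step can be sketched without assuming anything about qmd's actual interface: score pages by naive keyword overlap with the query and load only the best few into context. A real setup would use embeddings or a proper search tool, but the shape is the same:

```python
# Naive retrieval sketch: rank wiki pages by keyword overlap with the query
# and return the top-k paths to load as context. Purely illustrative — a
# stand-in for whatever search tool (qmd, embeddings, grep) you actually use.
from pathlib import Path

def top_pages(wiki_dir, query, k=5):
    terms = set(query.lower().split())
    scored = []
    for page in Path(wiki_dir).rglob("*.md"):
        words = page.read_text(encoding="utf-8").lower().split()
        overlap = sum(1 for w in words if w.strip(".,:;()") in terms)
        scored.append((overlap, page))
    scored.sort(key=lambda pair: -pair[0])
    return [page for score, page in scored[:k] if score > 0]
```

The point is what gets ranked: compiled wiki pages rather than raw source chunks, which is the "curation layer" claim in miniature.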
Hallucination comes in two flavors. Structural hallucination — a broken link, a malformed frontmatter field, a nonexistent tag — is mechanical, and the lint operation catches it. Semantic hallucination — a page claims Source A says X when it actually says Y — is not mechanical, and nothing catches it at scale except human review.
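A structural linter really is simple. This sketch assumes `[[wikilink]]` syntax and a YAML-style frontmatter block requiring `type` and `sources` fields — both are assumptions about your schema's conventions, not a standard:

```python
# Structural lint sketch: flags broken [[wikilinks]] and missing frontmatter.
# Link syntax and required fields are assumed conventions. Semantic errors
# (a page misquoting its source) pass through untouched — that's the gap.
import re
from pathlib import Path

REQUIRED_FIELDS = ("type", "sources")  # illustrative schema requirement

def lint(wiki_dir):
    pages = {p.stem: p for p in Path(wiki_dir).rglob("*.md")}
    problems = []
    for name, path in pages.items():
        text = path.read_text(encoding="utf-8")
        if not text.startswith("---"):
            problems.append(f"{path.name}: missing frontmatter block")
        else:
            head = text.split("---", 2)[1]
            for field in REQUIRED_FIELDS:
                if f"{field}:" not in head:
                    problems.append(f"{path.name}: missing field '{field}'")
        for target in re.findall(r"\[\[([^\]|#]+)", text):
            if target.strip() not in pages:
                problems.append(f"{path.name}: broken link [[{target}]]")
    return problems
```

Thirty lines catch the first kind of hallucination. No amount of code this shape catches the second.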
Self-reinforcing loops. If the wiki becomes the sole input for future wiki generation, errors compound. Source A gets slightly misrepresented in Wiki Page B. Page B becomes context for generating Page C. Page C cites the misrepresentation as established fact. The fix is simple in principle — always regenerate from raw sources, not from wiki pages — but easy to violate in practice.
No provenance. Who wrote this page? When? From which source? Current implementations don’t track this well. Version control helps (it’s markdown, so git works), but “this sentence was derived from paragraph 3 of source X” isn’t something any implementation tracks automatically.
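If you wanted provenance anyway, the cheapest place to put it is frontmatter. The fields below are hypothetical — no implementation I know of emits exactly this, and sentence-level derivation would still be missing:

```yaml
# Hypothetical provenance frontmatter (field names are illustrative)
type: concept
sources:
  - sources/karpathy-gist.md
  - sources/context-engineering-talk.txt
generated_by: ingest
model: <model name and version>
ingested_at: 2025-01-12
```

Page-level provenance plus git history gets you "which ingest wrote this page, from which sources." It does not get you "which paragraph justifies this sentence."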
Concurrent sessions. Two agents ingesting simultaneously can produce conflicting updates. There’s no locking mechanism, no merge strategy, no conflict resolution. Git handles file-level conflicts. Semantic conflicts — two agents writing contradictory summaries of the same concept — are unsolved.
There’s also the question of taste. An LLM doesn’t know what you find interesting. It knows what’s statistically salient in the source material, which is a different thing. Your wiki will reliably capture the main arguments of a paper. It will miss the throwaway footnote that changes how you think about the problem. That footnote is still yours to catch.
These problems are solvable. They’re just not solved yet.
Minimum Viable Wiki
A markdown folder. A schema file. An LLM agent.
- Create `wiki/` with subdirectories: `sources/`, `entities/`, `concepts/`, `synthesis/`
- Write a schema that defines page types, frontmatter, and operations
- Drop a source. Tell the agent to ingest it. Watch 5-15 pages appear.
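The first two steps are a few lines of setup. A minimal sketch — the directory names follow the list above, and the schema stub is a placeholder for whatever your schema actually says:

```python
# Lay out the minimum viable wiki: one folder, one schema file.
# The agent and its ingest prompt are whatever tooling you already use;
# only the file layout is shown here.
from pathlib import Path

for sub in ("sources", "entities", "concepts", "synthesis"):
    Path("wiki", sub).mkdir(parents=True, exist_ok=True)

# Stub schema; grow it as you learn which page types you actually need.
Path("wiki", "SCHEMA.md").write_text(
    "Page types: entity, concept, synthesis.\n"
    "Every page gets frontmatter: type, sources, updated.\n"
    "Ingest may create or update pages; it never touches sources/.\n",
    encoding="utf-8",
)
```

That is the entire install. Everything after this is prompting the agent and reading what it writes.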
Obsidian is optional but useful — graph view shows your wiki’s shape. qmd handles search when the index outgrows the context window.
The Desk That Organized Itself
That search result I found — the one I never wrote — is still in my wiki. I’ve read it three times since. Corrected two claims, added a paragraph, deleted a sentence that was confident about something the source was ambiguous about.
The organizational labor was done for me. The compression labor was still mine. The page is better for it — and so is my understanding of the topic.
Eighty years after Bush imagined a machine that would maintain associative trails through human knowledge, we have something close. It runs on plain text and CLI tools, not microfilm. The maintenance problem isn’t solved — it’s repriced. And repricing changes everything, right up until you discover what the new price actually buys you.
Whether that’s knowledge or just a very well-organized pile of text is a question you’ll have to answer by closing the wiki and seeing what you remember.