This library entry is part of The Extended Frontier thesis. Entries are curated with AI assistance and human review; most initial entries were prepared with Claude (Anthropic), while individual entries may note other assisting systems. Metadata and annotations are editorial, not peer-reviewed. Entries flagged as unverified may contain placeholder dates, authors, or classifications.

LLM Knowledge Bases

Andrej Karpathy · tweet
Metadata unverified. Author, URL, timestamp, and content came from user capture. Confirm directly on X before formal citation.
You rarely ever write or edit the wiki manually, it's the domain of the LLM.

Describes a personal research workflow where raw source documents are compiled by an LLM into a markdown wiki, maintained through index files, health checks, generated outputs, and lightweight tools rather than a heavyweight RAG stack.

Classification

Role
practitioner-note
Domain
research
Source type
tweet
Harness types
grounding-context-loading · execution-harness · validation-harness · repair-harness · monitoring-harness · learning-harness · interface-harness
Validation position
before-generation · immediately-after-generation · continuous
Validation mode
empirical · interpretive
Prescription stance
mixed
Relation to argument
capability-is-extended · first-mile-input-formation · validation-is-constitutive · repairability-matters · observability-matters · domain-structure-matters
Tags
knowledge-base · markdown-wiki · obsidian · agentic-research · llm-maintained-artifacts · personal-knowledge-management

Extended capability commentary

Input legibility
The raw/ to wiki compilation process is explicitly about making heterogeneous documents legible to future LLM turns.
Task structure
Markdown files, indexes, backlinks, summaries, and Obsidian views give the work a manipulable structure.
Reward richness
The workflow has useful signals from links, consistency, and answer quality, but not an explicit reward signal.
Feedback latency
Feedback arrives through Q&A, rendered outputs, and health checks, but not usually as immediate pass/fail tests.
Repairability
Health checks, missing-data imputation, and filing outputs back into the wiki make the knowledge base incrementally repairable.
Observability
The wiki is human-readable markdown and images viewed in Obsidian, so the agent's knowledge substrate stays inspectable.
Reversibility
Markdown artifacts are versionable, though the post does not foreground git or rollback.
Offline evaluability
Some checks can be run offline over the wiki, but factual gaps still require web search or source refresh.
Institutional ratification
This is a personal research workflow rather than an organizational ratification system.

Why it matters

This is the Extended Frontier applied to knowledge work: the model's capability comes from a maintained corpus, indexes, summaries, visual outputs, and health checks that make research cumulative instead of ephemeral.

Annotation

Karpathy describes a knowledge-work harness, not just a note-taking habit. Raw sources go into one directory; an LLM incrementally compiles them into a markdown wiki with summaries, backlinks, concept pages, index files, and derived visualizations. Obsidian becomes the human-facing IDE, while the LLM owns most direct edits to the wiki.
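
The "incrementally compiles" step can be sketched as change detection over the raw directory. This is a minimal illustration, not Karpathy's implementation: the manifest file, the function names, and the choice of SHA-256 digests are all assumptions, and the LLM call that actually rewrites wiki pages is left out. The manifest is assumed to live outside the raw directory so it is not scanned as a source.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_manifest(manifest_path: Path) -> dict[str, str]:
    """Read the digest manifest, or return an empty one on first run."""
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def find_stale_sources(raw_dir: Path, manifest_path: Path) -> list[Path]:
    """List raw files that are new or changed since the last compile."""
    manifest = load_manifest(manifest_path)
    stale = []
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            key = path.relative_to(raw_dir).as_posix()
            if manifest.get(key) != file_digest(path):
                stale.append(path)
    return stale


def record_compiled(raw_dir: Path, manifest_path: Path, compiled: list[Path]) -> None:
    """Store digests for sources whose wiki pages were just regenerated."""
    manifest = load_manifest(manifest_path)
    for path in compiled:
        manifest[path.relative_to(raw_dir).as_posix()] = file_digest(path)
    manifest_path.write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
```

A compile loop under these assumptions would call `find_stale_sources`, hand each stale file to the model, and call `record_compiled` only for the files whose wiki pages were written successfully, so a failed run is retried on the next pass.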

The important move is that research outputs are not terminal chat answers. They become files: markdown notes, Marp slides, matplotlib images, search indexes, and follow-up articles that can be filed back into the corpus. Each query can make the next query easier because the knowledge base itself accumulates structure.
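
Filing an output back into the corpus might look like the sketch below. The `outputs/` folder, the `index.md` filename, and both function names are hypothetical; only the `[[page]]` wikilink syntax is taken from Obsidian itself. The point is that an answer becomes a linked page the next query can find, rather than a chat message.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def file_output(wiki_dir: Path, title: str, body: str, source_pages: list[str]) -> Path:
    """Save a generated answer as a wiki page and register it in the index."""
    slug = slugify(title)
    page = wiki_dir / "outputs" / f"{slug}.md"
    page.parent.mkdir(parents=True, exist_ok=True)

    # Link back to the pages the answer drew on, so provenance stays visible.
    links = "\n".join(f"- [[{name}]]" for name in source_pages)
    page.write_text(f"# {title}\n\n{body}\n\n## Sources\n\n{links}\n", encoding="utf-8")

    # Append one index line per output; skip it if the line is already there.
    index = wiki_dir / "index.md"
    entry = f"- [[{slug}]]: {title}"
    existing = index.read_text(encoding="utf-8") if index.exists() else "# Index\n\n"
    if entry not in existing.splitlines():
        if not existing.endswith("\n"):
            existing += "\n"
        index.write_text(existing + entry + "\n", encoding="utf-8")
    return page
```

Re-filing the same title overwrites the page but leaves a single index line, which keeps the index usable as a table of contents for later model turns.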

For the library, this is a clean example of capability as artifact maintenance. Karpathy expected to need "fancy RAG," but at roughly 100 articles and 400K words, LLM-maintained summaries and index files were enough. The boundary condition matters: the system works because the scale is still small enough for source-aware traversal and because the artifacts are legible.
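
A back-of-envelope calculation shows why that scale is forgiving. Only the article and word counts come from the post; the tokens-per-word ratio, the summary length, and the context window size below are illustrative assumptions, not measurements.

```python
ARTICLES = 100             # approximate figure from the post
CORPUS_WORDS = 400_000     # approximate figure from the post
TOKENS_PER_WORD = 1.3      # assumption: rough ratio for English text
SUMMARY_WORDS = 150        # assumption: one short summary per article
CONTEXT_TOKENS = 200_000   # assumption: a hypothetical context window


def to_tokens(words: int) -> int:
    """Convert a word count to an estimated token count."""
    return round(words * TOKENS_PER_WORD)


corpus_tokens = to_tokens(CORPUS_WORDS)
index_tokens = to_tokens(ARTICLES * SUMMARY_WORDS)

print(f"full corpus: ~{corpus_tokens:,} tokens")
print(f"summary index: ~{index_tokens:,} tokens")
print(f"index share of context: {index_tokens / CONTEXT_TOKENS:.0%}")
print(f"full corpus fits in context: {corpus_tokens <= CONTEXT_TOKENS}")
```

Under these assumptions the whole corpus does not fit in one context window, but an index of per-article summaries takes roughly a tenth of it, leaving room to open the handful of full articles a question actually needs. The same arithmetic with a much larger article count is where the index itself would start to crowd the window.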

Extended Frontier Read

The raw model is not the unit of analysis. The useful system is model plus:

  • a raw source archive,
  • a compiled markdown wiki,
  • index and summary files,
  • Obsidian as inspection surface,
  • generated outputs that feed back into the wiki,
  • health checks over consistency and missing data,
  • small custom tools such as a wiki search engine.

This belongs beside harness entries, but it broadens the frame from coding agents to research agents. The same pattern appears: make the environment legible, let the model act on files, inspect the result, repair the substrate, and let work accumulate.
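
One of the health checks in the list above can be sketched as a link audit. The post does not say how its checks work, so this is an assumed example: it scans markdown pages for Obsidian-style `[[wikilinks]]`, reports links whose target page does not exist, and reports pages that nothing links to. The function name and the convention of exempting a page named `index` are hypothetical, and links that include a folder path are not resolved.

```python
import re
from pathlib import Path

# Capture the link target, stopping before any "|alias" or "#heading" suffix.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def check_links(wiki_dir: Path) -> dict[str, list[str]]:
    """Report broken wikilinks and pages with no inbound links."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    linked: set[str] = set()
    broken: list[str] = []

    for name, path in sorted(pages.items()):
        text = path.read_text(encoding="utf-8")
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append(f"{name} -> {target}")
            elif target != name:
                # Self-links do not count as inbound links.
                linked.add(target)

    orphans = sorted(n for n in pages if n not in linked and n != "index")
    return {"broken": broken, "orphans": orphans}
```

A report like this gives the model a concrete repair list: create or re-point the broken targets, and either link or retire the orphans.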

Open Questions

  • At what corpus size does this stop working without stronger retrieval infrastructure?
  • Which health checks are most predictive of useful future Q&A?
  • Does finetuning on the wiki improve capability, or does it destroy the inspectability and repairability that make the workflow valuable?

Notes

Source text supplied by Daniel from X. This entry was prepared with Codex (OpenAI).

Related entries

  • What Is an Agent Harness
    Aparna Dhinakaran · 2026-04-21
    capability-is-extended · validation-is-constitutive · repairability-matters · observability-matters · grounding-context-loading · execution-harness · validation-harness · repair-harness · monitoring-harness · learning-harness · interface-harness
  • An open-source spec for Codex orchestration: Symphony
    Alex Kotliarskyi, Victor Zhu, and Zach Brock · 2026-04-26
    capability-is-extended · validation-is-constitutive · repairability-matters · observability-matters · execution-harness · validation-harness · repair-harness · monitoring-harness · learning-harness · interface-harness
  • Hermes Agent README
    Nous Research · 2026-04-28
    capability-is-extended · first-mile-input-formation · repairability-matters · observability-matters · grounding-context-loading · execution-harness · repair-harness · monitoring-harness · learning-harness · interface-harness
  • Good and Bad Harness Engineering
    Daniel Miessler · 2025-08-31
    capability-is-extended · repairability-matters · observability-matters · grounding-context-loading · execution-harness · validation-harness · repair-harness · monitoring-harness

Overlap is computed on tags, relation-to-argument, and harness types, not on role or domain, because contrasts are often the most useful neighbours.