This library entry is part of The Extended Frontier thesis. Entries are curated with AI assistance and human review; most initial entries were prepared with Claude (Anthropic), while individual entries may note other assisting systems. Metadata and annotations are editorial, not peer-reviewed. Entries flagged as unverified may contain placeholder dates, authors, or classifications.

Thin Harness, Fat Skills

Garry Tan · doc · source
The 2x people and the 100x people are using the same models. The difference is five concepts that fit on an index card.

Short, practitioner-facing ethos doc arguing that the durable leverage in agent systems comes from model-resident skills (markdown) and deterministic code at the edges, with the harness kept as thin as possible so each model upgrade flows through.

Classification

Role
practitioner-note
Domain
software
Source type
doc
Harness types
execution-harness · interface-harness · learning-harness
Validation position
before-generation
Validation mode
empirical
Prescription stance
anti-prescriptive
Relation to argument
capability-is-extended · diffusion-adoption-bottleneck · first-mile-input-formation
Tags
thin-harness · skills · scaffolding-skepticism · agent-design · markdown-skills

Extended capability commentary

Input legibility
Assumes inputs are legible enough that heavy shaping is unnecessary — a domain-specific bet.
Task structure
Reward richness
Does not foreground reward signal as the key lever.
Repairability
Thin-harness framing tends to under-specify where repair loops live.
Observability
Institutional ratification

Why it matters

A counterweight to harness-heavy framings. Tracks the prediction that as models get better, elaborate scaffolding becomes dead weight. Useful to read alongside Miessler (harness-engineering) and HumanLayer (sub-agents-as-context-control).

Annotation

A compact practitioner thesis from the gbrain repo: the productivity gap between 2x and 100x agentic-engineering users is not the model but the architectural pattern around it. The prescription is architectural restraint: push fuzzy operations into markdown skills, push must-be-perfect operations into deterministic code, and keep the harness thin so every model improvement flows through automatically.
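A minimal sketch of the pattern, for concreteness. Everything here is hypothetical — the source does not prescribe an API, and `load_skills`, `validate_diff`, and the `model_call` stand-in are illustrative names, not the author's implementation. The point of the sketch is the division of labour: fuzzy behaviour lives in markdown files, correctness checks live in deterministic code, and the harness itself is a few lines that any model upgrade passes through unchanged.

```python
from pathlib import Path


def load_skills(skill_dir: str) -> str:
    """Fuzzy operations live in markdown skill files, not harness code.
    A stronger model reinterprets the same skills; nothing here changes."""
    return "\n\n".join(p.read_text() for p in sorted(Path(skill_dir).glob("*.md")))


def validate_diff(out: str) -> bool:
    """Must-be-perfect operations stay in deterministic code at the edge.
    (Toy check: output must look like a unified diff.)"""
    return out.startswith("--- ") and "+++ " in out


def run_task(task: str, model_call) -> str:
    """The entire harness: assemble context, call the model, gate the output.
    Swapping in a better `model_call` is the whole upgrade path."""
    prompt = load_skills("skills") + "\n\nTask: " + task
    out = model_call(prompt)
    if not validate_diff(out):
        raise ValueError("model output failed deterministic check")
    return out
```

Note the asymmetry: the skill files can grow arbitrarily "fat" without the harness gaining a single branch, which is the property the entry argues makes the pattern survive model upgrades.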

Companion tweet (same framing, compressed): @garrytan, "Thin harness, fat skills".

This is a sharp disagreement with framings that treat validation, repair, and context routing as constitutive of capability. In Tan's picture, most of that work is either absorbed by the next model or revealed as compensation for a weaker one. In the constitutive picture (Wallach/Jacobs et al.; Salaudeen et al.), those loops are where capability lives in practice — no matter how strong the base model.

Keep this entry visible when reading sources that argue the opposite. It marks the pole the library should preserve, not flatten.

Open questions

  • Under what domain conditions is "thin harness" actually enough? (Hypothesis: high offline evaluability, low institutional ratification cost.)
  • Does "fat skills" degrade gracefully when inputs are illegible or reward signal is thin?
  • What's the smallest counterexample — a task where a fat harness around a weaker model beats a thin harness around a stronger one and continues to beat it as models improve?

Related entries

Overlap is computed on tags, relation-to-argument, and harness types — not on role or domain, because contrasts are often the most useful neighbours.