Thin Harness, Fat Skills
The 2x people and the 100x people are using the same models. The difference is five concepts that fit on an index card.
A short, practitioner-facing ethos doc arguing that the durable leverage in agent systems comes from model-resident skills (markdown) plus deterministic code at the edges, with the harness kept as thin as possible so that each model upgrade flows through automatically.
Classification
- Role
- practitioner-note
- Domain
- software
- Source type
- doc
- Harness types
- execution-harness, interface-harness, learning-harness
- Validation position
- before-generation
- Validation mode
- empirical
- Prescription stance
- anti-prescriptive
- Relation to argument
- capability-is-extended, diffusion-adoption-bottleneck, first-mile-input-formation
- Tags
- thin-harness, skills, scaffolding-skepticism, agent-design, markdown-skills
Extended capability commentary
- Input legibility
- Assumes inputs are legible enough that heavy shaping is unnecessary — a domain-specific bet.
- Task structure
- Reward richness
- Does not foreground reward signal as the key lever.
- Repairability
- Thin-harness framing tends to under-specify where repair loops live.
- Observability
- Institutional ratification
Why it matters
A counterweight to harness-heavy framings. Tracks the prediction that as models get better, elaborate scaffolding becomes dead weight. Useful to read alongside Miessler (harness-engineering) and HumanLayer (sub-agents-as-context-control).
Annotation
A compact practitioner thesis from the gbrain repo: the productivity gap between 2x and 100x agentic-engineering users is not the model, it is the architectural pattern around the model. The prescription is architectural restraint — push fuzzy operations into markdown skills, push must-be-perfect operations into code, and keep the harness thin so every model improvement flows through automatically.
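That prescription can be sketched in a few lines. Everything below is illustrative and invented for this note (the `ThinHarness` class, the `CALL` dispatch convention, the stub model are not from the source): the harness owns only context assembly and tool dispatch, the "fat skills" are plain markdown strings, and the must-be-perfect work sits in ordinary deterministic functions.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch of "thin harness, fat skills" (names invented here):
# fuzzy judgment lives in markdown skills passed to the model as context;
# must-be-perfect operations live in deterministic code; the harness only
# assembles context and dispatches tool calls.
@dataclass
class ThinHarness:
    model: Callable[[str], str]                  # any text-in/text-out model
    skills: dict[str, str] = field(default_factory=dict)   # name -> markdown
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, task: str) -> str:
        prompt = "\n\n".join(self.skills.values()) + "\n\nTask: " + task
        reply = self.model(prompt)
        # One trivial dispatch convention: "CALL <tool> <arg>".
        if reply.startswith("CALL "):
            _, tool, arg = reply.split(" ", 2)
            return self.tools[tool](arg)
        return reply

# Stub model for illustration; swapping in a stronger model touches nothing
# but the `model` callable, which is the point of the pattern.
harness = ThinHarness(
    model=lambda prompt: "CALL sum 1,2,3",
    skills={"arithmetic": "# Skill: delegate all arithmetic to the sum tool"},
    tools={"sum": lambda arg: str(sum(int(x) for x in arg.split(",")))},
)
result = harness.run("add 1, 2 and 3")   # "6"
```

The design choice the sketch encodes: the harness contains no retries, routing, or validation logic, so there is nothing for a better model to make obsolete.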
Companion tweet (same framing, compressed): @garrytan, "Thin harness, fat skills".
This is a sharp disagreement with framings that treat validation, repair, and context routing as constitutive of capability. In Tan's picture, most of that work is either absorbed by the next model or revealed as compensation for a weaker one. In the constitutive picture (Wallach/Jacobs et al.; Salaudeen et al.), those loops are where capability lives in practice — no matter how strong the base model.
Keep this entry visible when reading sources that argue the opposite. It marks the pole the library should preserve, not flatten.
Open questions
- Under what domain conditions is "thin harness" actually enough? (Hypothesis: high offline evaluability, low institutional ratification cost.)
- Does "fat skills" degrade gracefully when inputs are illegible or reward signal is thin?
- What's the smallest counterexample — a task where a fat harness around a weaker model beats a thin harness around a stronger one and continues to beat it as models improve?
Related entries
- Hermes Agent README · Nous Research · 2026-04-28 · #skills #capability-is-extended #first-mile-input-formation #diffusion-adoption-bottleneck #execution-harness #learning-harness #interface-harness
- Equipping agents for the real world with Agent Skills · Anthropic · 2025-10-15 · #markdown-skills #capability-is-extended #first-mile-input-formation #diffusion-adoption-bottleneck #execution-harness #learning-harness
- Claude Skills are awesome, maybe a bigger deal than MCP · Simon Willison · 2025-10-15 · #markdown-skills #capability-is-extended #first-mile-input-formation #diffusion-adoption-bottleneck #execution-harness #learning-harness
- What Is an Agent Harness · Aparna Dhinakaran · 2026-04-21 · #skills #capability-is-extended #diffusion-adoption-bottleneck #execution-harness #learning-harness #interface-harness
Overlap is computed on tags, relation-to-argument, and harness types — not on role or domain, because contrasts are often the most useful neighbours.