Good and Bad Harness Engineering
In the early days of prompt engineering (2023-2024) it was helpful to tell AI exactly how to do things; that relationship probably inverted sometime in 2025.
Argues that good harness engineering focuses on who the user is and what they're trying to accomplish — the 'what' — and lets the model handle the 'how'. Pairs with Miessler's 'Bitter Lesson Engineering' as a design discipline for scaffolding that extends capability rather than compensating for model weakness.
Classification
- Role: framework-piece
- Domain: cross-domain
- Source type: essay
- Harness types: input-shaping, grounding-context-loading, execution-harness, validation-harness, repair-harness, monitoring-harness
- Validation position: before-generation, immediately-after-generation, post-deployment
- Validation mode: empirical, mechanical
- Prescription stance: mixed
- Relation to argument: capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent
- Tags: harness-engineering, bitter-lesson, design-discipline, agent-design, what-not-how
Extended capability commentary
- Input legibility: Treats input formation as part of the engineered system, not preprocessing.
- Task structure
- Reward richness
- Feedback latency
- Repairability
- Observability
- Reversibility
- Offline evaluability
- Institutional ratification
Why it matters
Supplies the vocabulary for distinguishing harnesses that *extend* capability from harnesses that merely *compensate* for model weakness. A critical lens for reading practitioner writing.
Annotation
Stakes out the middle ground between "thin harness, fat skills" and fully prescriptive agent frameworks. The core move is a good/bad distinction inside harness engineering itself: some scaffolding genuinely extends what the system can do (input shaping, repair loops, observability), while other scaffolding is brittle compensation for current model weakness and will not survive the next model.
Miessler's design rule is compressed into one line: don't confuse the what with the how. Tell the model who you are and what outcome you want; let the model figure out the path.
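The what/how distinction can be made concrete with a small sketch. The Python below is illustrative only: `Brief`, `render_brief`, and both example prompts are hypothetical names and texts invented for this entry, not taken from Miessler's essay. It assumes a prompt that states who the user is and the outcome they want, and contrasts it with a prompt that dictates steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brief:
    """Hypothetical container for the 'what': identity and desired outcome."""
    who: str      # who the user is
    outcome: str  # what they are trying to accomplish


def render_brief(brief: Brief) -> str:
    """Render a prompt that states the 'what' and leaves the 'how' open."""
    return "\n".join([
        f"Who I am: {brief.who}",
        f"Outcome I want: {brief.outcome}",
        "Choose the approach yourself.",
    ])


# A 'what' prompt: context and goal, no prescribed procedure.
what_prompt = render_brief(Brief(
    who="maintainer of a small Python library with no test suite",
    outcome="enough test coverage to refactor the parser safely",
))

# A 'how' prompt, for contrast: it prescribes steps and never states the goal.
how_prompt = (
    "First list every function. "
    "Then write one test per function. "
    "Then run the tests and paste the output."
)

print(what_prompt)
```

The second prompt encodes one particular path to an unstated goal. On the essay's argument as summarized here, that is the kind of instruction a stronger model is least likely to need.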
Read together with:
- Bitter Lesson Engineering: the underlying argument, leaning on Sutton's "The Bitter Lesson."
- Tan's "Thin Harness, Fat Skills": adjacent, but less prescriptive about what good design looks like.
Miessler is not endorsing the thin-harness conclusion that scaffolding is always waste. He is endorsing a discipline of harness design. The disagreement with Tan is legible: both agree some scaffolding is waste; they disagree about how much of the harness is waste in the limit of model improvement.
What the library should extract once the post is fully read
- The explicit taxonomy (if any) of good vs. bad harness work.
- Concrete examples cited as each type.
- Whether repairability and observability are treated as constitutive of capability or merely as hygiene.
Related entries
- Skill Issue: Harness Engineering for Coding Agents · HumanLayer · 2026-02-28 · #harness-engineering, capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent, execution-harness, repair-harness, monitoring-harness
- What Is an Agent Harness · Aparna Dhinakaran · 2026-04-21 · capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent, input-shaping, grounding-context-loading, execution-harness, validation-harness, repair-harness, monitoring-harness
- LLM Knowledge Bases · Andrej Karpathy · 2026-04-01 · capability-is-extended, repairability-matters, observability-matters, grounding-context-loading, execution-harness, validation-harness, repair-harness, monitoring-harness
- Hermes Agent README · Nous Research · 2026-04-28 · capability-is-extended, repairability-matters, observability-matters, input-shaping, grounding-context-loading, execution-harness, repair-harness, monitoring-harness
Overlap is computed on tags, relation-to-argument, and harness types — not on role or domain, because contrasts are often the most useful neighbours.
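A minimal sketch of how such an overlap score could work, in Python. The page does not state the scoring function, so this assumes the simplest option: a count of labels shared across the three compared fields. `Entry` and `overlap` are hypothetical names, and the labels for the second entry are taken from its listing above, on the assumption that those are its labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    title: str
    role: str
    domain: str
    tags: frozenset
    relations: frozenset      # relation-to-argument labels
    harness_types: frozenset


def overlap(a: Entry, b: Entry) -> int:
    """Assumed scoring: count shared labels in the three compared fields.

    role and domain are never read, so entries that differ on them can
    still be neighbours.
    """
    return (
        len(a.tags & b.tags)
        + len(a.relations & b.relations)
        + len(a.harness_types & b.harness_types)
    )


RELATIONS = frozenset({
    "capability-is-extended", "repairability-matters",
    "observability-matters", "breakdown-when-harness-absent",
})

this_entry = Entry(
    title="Good and Bad Harness Engineering",
    role="framework-piece",
    domain="cross-domain",
    tags=frozenset({"harness-engineering", "bitter-lesson",
                    "design-discipline", "agent-design", "what-not-how"}),
    relations=RELATIONS,
    harness_types=frozenset({
        "input-shaping", "grounding-context-loading", "execution-harness",
        "validation-harness", "repair-harness", "monitoring-harness",
    }),
)

skill_issue = Entry(
    title="Skill Issue: Harness Engineering for Coding Agents",
    role="unknown",    # placeholder: not shown in the listing
    domain="unknown",  # placeholder: not shown in the listing
    tags=frozenset({"harness-engineering"}),
    relations=RELATIONS,
    harness_types=frozenset({"execution-harness", "repair-harness",
                             "monitoring-harness"}),
)

# 1 shared tag + 4 shared relations + 3 shared harness types
score = overlap(this_entry, skill_issue)
print(score)  # prints 8
```

A normalized measure such as Jaccard similarity would be an equally plausible reading. The point of the sketch is only that role and domain stay out of the score.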