Skill Issue: Harness Engineering for Coding Agents
Skills, MCP servers, sub-agents, hooks, and back-pressure mechanisms are tactical solutions HumanLayer has arrived at.
Case-study framing of harness engineering for coding agents, with specific claims about what does and does not work (notably: role-based sub-agents don't work; sub-agents for context control do).
Classification
- Role: case-study
- Domain: software
- Source type: blog
- Harness types: execution-harness, repair-harness, monitoring-harness, interface-harness
- Validation position: during-generation, immediately-after-generation, post-deployment
- Validation mode: empirical, mechanical
- Prescription stance: strongly-procedural
- Relation to argument: capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent
- Tags: harness-engineering, coding-agents, sub-agents, context-control, back-pressure
Extended capability commentary
- Input legibility
- Task structure: Breaking work into discrete delegated tasks is a first-class move here.
- Reward richness
- Feedback latency
- Repairability: Back-pressure mechanisms are repair harness by another name.
- Observability
- Offline evaluability
- Institutional ratification
Why it matters
A strong counter-example to thin-harness-in-the-limit. HumanLayer has shipped a coding-agent product and reports that sub-agents, hooks, and back-pressure do real work. Sharpens the disagreement with Tan/Miessler and localises it.
Annotation
HumanLayer's post is the library's best current counterweight to the thin-harness pole. The claim is not that more harness is always better — they explicitly report that role-based sub-agents ("frontend engineer," "backend engineer") don't work. The claim is that specific harness moves — sub-agents as context-control, hooks, back-pressure — carry real load and cannot be absorbed into a better model.
The piece is useful for the library because it:
- Distinguishes harness types that work from those that don't, empirically rather than in principle.
- Names specific mechanisms (sub-agents-for-context, back-pressure) that belong on the harness_types taxonomy.
- Speaks from shipped product, which raises its weight on the practitioner-note vs framework-piece axis.
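The two mechanisms the annotation singles out, sub-agents as context-control and back-pressure, can be sketched in a few lines. This is a toy illustration under stated assumptions, not HumanLayer's implementation: `Context`, `delegate`, `with_back_pressure`, and the stand-in worker and check functions are all hypothetical names, and no real model is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Context:
    """Hypothetical stand-in for a model's context window: a list of messages."""
    messages: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.messages.append(text)

    def size(self) -> int:
        return sum(len(m) for m in self.messages)


def delegate(parent: Context, task: str,
             worker: Callable[[str, Context], str]) -> Context:
    """Context control: run `worker` in a fresh context and pass only its
    short summary back to the parent."""
    scratch = Context()
    scratch.add(task)
    summary = worker(task, scratch)  # the worker may fill `scratch` with noise
    parent.add(f"[sub-agent] {task}: {summary}")
    return scratch  # returned only so the demo below can measure it


def with_back_pressure(attempt: Callable[[str], str],
                       check: Callable[[str], Tuple[bool, str]],
                       max_rounds: int = 3) -> str:
    """Back-pressure: retry `attempt`, feeding each failed check's message
    into the next try, and give up after `max_rounds`."""
    feedback = ""
    for _ in range(max_rounds):
        output = attempt(feedback)
        ok, feedback = check(output)
        if ok:
            return output
    raise RuntimeError(f"check still failing after {max_rounds} rounds: {feedback}")


# --- toy stand-ins (assumptions for illustration only) ---
def search_worker(task: str, scratch: Context) -> str:
    for i in range(50):
        scratch.add(f"scanned file_{i}.py")  # exploration noise stays here
    return "handler lives in file_7.py"


def flaky_attempt(feedback: str) -> str:
    # Succeeds only once it has seen a failure message.
    return "fixed" if feedback else "broken"


def run_check(output: str) -> Tuple[bool, str]:
    passed = output == "fixed"
    return passed, "" if passed else "test_x failed"


parent = Context()
scratch = delegate(parent, "find the request handler", search_worker)
result = with_back_pressure(flaky_attempt, run_check)
```

In this sketch the parent context gains one message while the sub-agent's scratch context absorbs the fifty exploration messages, and the check's failure message is what drives the second, successful attempt.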
Read alongside
- Tan, "Thin Harness, Fat Skills" — the opposite pole.
- Miessler, "Good and Bad Harness Engineering" — the pole this piece is most compatible with.
- Anthropic, "Agent Skills" — the vendor framing for one of the tactical solutions cited.
Related entries
- What Is an Agent Harness · Aparna Dhinakaran · 2026-04-21 · #coding-agents · capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent · execution-harness, repair-harness, monitoring-harness, interface-harness
- Good and Bad Harness Engineering · Daniel Miessler · 2025-08-31 · #harness-engineering · capability-is-extended, repairability-matters, observability-matters, breakdown-when-harness-absent · execution-harness, repair-harness, monitoring-harness
- LLM Knowledge Bases · Andrej Karpathy · 2026-04-01 · capability-is-extended, repairability-matters, observability-matters · execution-harness, repair-harness, monitoring-harness, interface-harness
- An open-source spec for Codex orchestration: Symphony · Alex Kotliarskyi, Victor Zhu, and Zach Brock · 2026-04-26 · capability-is-extended, repairability-matters, observability-matters · execution-harness, repair-harness, monitoring-harness, interface-harness
Overlap is computed on tags, relation-to-argument, and harness types — not on role or domain, because contrasts are often the most useful neighbours.
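The note above names the fields that feed the overlap but not the scoring rule. A minimal sketch, assuming the score is simply the count of shared values across tags, relation-to-argument, and harness types; the `Entry` class, the `overlap` and `related` functions, and the shortened entry names are illustrative, and the site's actual formula may differ.

```python
from dataclasses import dataclass

# Shorthand sets for the values listed on this page.
R3 = frozenset({"capability-is-extended", "repairability-matters",
                "observability-matters"})
R4 = R3 | {"breakdown-when-harness-absent"}
H3 = frozenset({"execution-harness", "repair-harness", "monitoring-harness"})
H4 = H3 | {"interface-harness"}


@dataclass(frozen=True)
class Entry:
    title: str
    tags: frozenset = frozenset()
    relations: frozenset = frozenset()
    harness_types: frozenset = frozenset()
    role: str = ""    # deliberately ignored by overlap()
    domain: str = ""  # deliberately ignored by overlap()


def overlap(a: Entry, b: Entry) -> int:
    """Assumed rule: count values shared across the three named fields."""
    return (len(a.tags & b.tags)
            + len(a.relations & b.relations)
            + len(a.harness_types & b.harness_types))


def related(target: Entry, library: list, limit: int = 4) -> list:
    """Rank other entries by overlap, highest first, dropping zero scores."""
    others = [e for e in library if e is not target]
    ranked = sorted(others, key=lambda e: overlap(target, e), reverse=True)
    return [e for e in ranked if overlap(target, e) > 0][:limit]


skill_issue = Entry(
    "Skill Issue",
    tags=frozenset({"harness-engineering", "coding-agents", "sub-agents",
                    "context-control", "back-pressure"}),
    relations=R4, harness_types=H4, role="case-study", domain="software")
dhinakaran = Entry("What Is an Agent Harness",
                   tags=frozenset({"coding-agents"}),
                   relations=R4, harness_types=H4)
miessler = Entry("Good and Bad Harness Engineering",
                 tags=frozenset({"harness-engineering"}),
                 relations=R4, harness_types=H3)
karpathy = Entry("LLM Knowledge Bases", relations=R3, harness_types=H4)

library = [skill_issue, dhinakaran, miessler, karpathy]
neighbours = [e.title for e in related(skill_issue, library)]
```

Because `role` and `domain` never enter the score, an entry with a different role or domain can still rank as a close neighbour, which is the behaviour the note describes.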