If the extensions framework is right, it should make a prediction that could fail. Here's the specific within-domain one: the same task, in the same domain, will show different AI performance depending on whether the practice's extensions are engaged. Not across domains, but within one.
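One way to make that prediction concrete, sketched in code: the same model, the same coding task, attempted once as an isolated prompt and once inside a feedback loop where an executable check plays the role of the extension. Everything here is illustrative; `run_model` is a stand-in for whatever model call you'd actually use, and the scoring protocol is left open.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path

# Hypothetical stand-in for a model call; plug in a real client here.
def run_model(prompt: str) -> str:
    raise NotImplementedError

@dataclass
class TestReport:
    passed: bool
    output: str

def run_tests(candidate_code: str, test_code: str) -> TestReport:
    """Ground the candidate against an executable check, the way a compiler
    or test suite grounds 'does it run?' inside a real practice."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(candidate_code)
        Path(tmp, "check.py").write_text(test_code)
        proc = subprocess.run([sys.executable, "check.py"], cwd=tmp,
                              capture_output=True, text=True, timeout=30)
    return TestReport(proc.returncode == 0, proc.stdout + proc.stderr)

def attempt_stripped(task_prompt: str) -> str:
    """Isolated task: take the first answer, no feedback loop."""
    return run_model(task_prompt)

def attempt_extended(task_prompt: str, test_code: str, rounds: int = 3) -> str:
    """Extended task: the executable check stands in for the practice's extension."""
    candidate = run_model(task_prompt)
    for _ in range(rounds):
        report = run_tests(candidate, test_code)
        if report.passed:
            break
        candidate = run_model(f"{task_prompt}\n\nA prior attempt failed:\n{report.output}")
    return candidate
```

The prediction is then simply that, scored blind on the same task set, the extended attempts beat the stripped ones by a margin the model alone can't explain.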
Cross-trained practitioners—doctors who code, lawyers who build tools—see the frontier differently because they carry extensions from one domain into another. Their accounts reveal the mechanism: it's not the model that differs, it's what the practice provides.
The AI discourse obsesses over output quality. But for most of life, the harder problem is upstream—how do you turn 'something feels wrong' into a question worth asking?
Verification catches the error. Repairability determines whether you can do anything about it. They're different properties of the work, and repairability is itself an extension, one that can be designed in or left out.
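A toy sketch of the distinction, with names invented purely for illustration: one function answers whether an error was caught at all, the other whether the artifact carries enough structure to point at a fix.

```python
from dataclasses import dataclass

@dataclass
class Artifact:
    """A piece of delivered work. `unit_results` exists only if the work was
    built with per-unit checks; a monolithic deliverable just has `ok`."""
    ok: bool
    unit_results: dict[str, bool] | None = None  # None = nothing to localize against

def verify(artifact: Artifact) -> bool:
    """Verification: was an error caught?"""
    return artifact.ok

def repair_targets(artifact: Artifact) -> list[str] | None:
    """Repairability: can the failure be traced to something fixable?
    Returns the failing units if the structure exists, None if it doesn't."""
    if artifact.unit_results is None:
        return None  # the error is caught, but there's nowhere to stand to fix it
    return [name for name, ok in artifact.unit_results.items() if not ok]
```

The second question only has a useful answer when `unit_results` exists, which is exactly the sense in which repairability gets designed in or left out.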
Task-exposure models count what AI can do. Bundle theory asks whether the task can be separated from the job. The extensions framework asks a different question: does 'can do' mean the same thing with and without the practice's feedback loops?
Productivity and quality are only two outcomes among many. When AI enters a practice, skill formation, craft, pace, accountability, and repairability all shift. The handoff analytic is what makes these dimensions visible.
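If you wanted to observe those shifts rather than assert them, one minimal instrument would be a per-handoff record. The field names below are my own, not the framework's vocabulary; the point is only that each dimension gets noted per handoff instead of inferred from aggregate output.

```python
from dataclasses import dataclass

@dataclass
class Handoff:
    """One instance of work passing between a practitioner and an AI system."""
    task: str
    direction: str            # "to_ai" or "to_human"
    quality_delta: float      # change in output quality, however it is scored
    pace_delta: float         # change in turnaround time
    skill_formation: str      # what the human practiced, or stopped practicing
    accountability: str       # who answers for the result
    repairable: bool          # can a downstream failure be traced and fixed?

log = [
    Handoff("draft contract clause", "to_ai",
            quality_delta=+0.1, pace_delta=-0.6,
            skill_formation="junior no longer drafts from scratch",
            accountability="supervising attorney", repairable=True),
]
```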
Each extension has specific capacities. The compiler grounds 'does it run?' A security review grounds 'is it safe?' Map the capacities of the extensions in a practice, and you've mapped where the frontier is smooth and where it isn't. The handoff analytic reveals what happens when those capacities shift.
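Here is roughly what that mapping could look like as data. The first two entries come straight from the text; the rest are illustrative additions, not claims about any particular practice.

```python
# Extension -> the question it grounds.
EXTENSION_CAPACITIES = {
    "compiler": "does it run?",
    "security review": "is it safe?",
    "test suite": "does it still do what it did yesterday?",
    "peer review": "would a colleague sign off on this?",
    "client feedback": "does it solve the actual problem?",
}

def frontier_profile(available: set[str]) -> dict[str, dict[str, str]]:
    """Split the known capacities into grounded and ungrounded questions:
    smooth frontier where a question is grounded, jagged where it isn't."""
    grounded = {e: q for e, q in EXTENSION_CAPACITIES.items() if e in available}
    ungrounded = {e: q for e, q in EXTENSION_CAPACITIES.items() if e not in available}
    return {"grounded": grounded, "ungrounded": ungrounded}

# e.g. a solo side project: compiler and tests, but no review or client loop.
print(frontier_profile({"compiler", "test suite"})["ungrounded"])
```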
The jagged frontier of AI capability isn't random. It's predictable from the practices, artifacts, and feedback loops that constitute actual work. The frontier is smooth where work is extended. It's jagged where work has been stripped to an isolated task.