The Extended Frontier: What Else Changes

Daniel Griffin · Hypandra · 7 min read

Working Draft

This post is a working draft, developed collaboratively with Claude Opus 4.6 (1M context) in Claude Code (Anthropic) across multiple sessions, using a variety of extensions: persona-based reviews, citation verification, AI detection analysis (Pangram Labs), deep research reports, and dissertation search (qmd). The argument, evidence curation, and editorial direction are Daniel’s; much of the prose was initially generated by Claude and is being iteratively rewritten. A changelog tracks the development. We will continue to write about how we’re exploring this idea—the process is part of the argument.

The BCG experiment measured productivity and quality. Those matter, but they are only two of the outcomes in play. The handoff analytic (Mulligan & Nissenbaum, 2020) asks what else changes when a function transfers, and the answer is that skill formation, craft identity, pace, accountability, and repairability all shift. Output can improve while the work underneath degrades.


Changelog
  • 2026-03-24 — Initial draft published.

This is part of [The Extended Frontier](/2026/03/24/extended-frontier) series. [Post 1](/2026/03/24/extended-frontier) argued that the jagged frontier is predictable from the extension structure of the work. [Post 2](/2026/03/24/extended-frontier-first-mile) looked at the input side—the first mile problem, where what you ask matters as much as what you get back. [Post 3](/2026/03/24/extended-frontier-repairability) examined repairability—whether mistakes, once caught, can actually be fixed.

This post is about what we're not measuring.

What Dell'Acqua and colleagues measured

The BCG experiment (Dell'Acqua, McFowland, Mollick et al., 2023) measured two things: productivity and quality of task output. Consultants using GPT-4 finished more tasks, faster, and their output was rated higher. Inside the frontier, AI helped. Outside it, consultants who followed the AI did worse than those working alone.

Those findings matter. They established that the frontier exists and that it's consequential. But productivity and quality of task output are only two outcomes among many. When a function transfers from one way of working to another, more changes than the output.

The handoff analytic

Mulligan and Nissenbaum (2020) introduced the handoff analytic to make this precise. Their core claim: when a function shifts from one component to another in a sociotechnical system, functional equivalence does not guarantee ethical equivalence. The reconfiguration itself changes more than the output. Their question is simple and generative: what changes when the function moves?

The analytic draws on a lineage of sociotechnical thinking—Lave and Wenger's legitimate peripheral participation, Suchman's situated action, Hutchins' distributed cognition, Sellen and Harper's affordance analysis. What these share is attention to the full ecology of a practice, not just its outputs. They ask what the work does beyond producing a deliverable. Who learns from it. What relationships it sustains. What pace it enforces. What accountability structures it carries.

In my dissertation, the handoff analytic is what revealed the extensions in the first place. Once you ask "what changes when search results move from one context to another," you see that the knowledge making search work isn't in the searcher's head—it's distributed across testing practices, code review, team meetings, naming conventions. The extensions became visible as a straightforward consequence of using a sociotechnical lens. Without that lens, you'd measure whether the search results were good. With it, you see everything the practice does to make them usable.

The same applies to AI. If you only measure output quality, the BCG experiment looks like a story about where AI helps and where it doesn't. If you apply the handoff analytic, it becomes a story about what else shifted when consulting tasks moved into an AI-assisted workflow—shifts the experiment wasn't designed to detect.

Skill formation

An Anthropic study (2026) tracked 52 junior software engineers working with AI coding assistants. The ones who delegated to AI—generating code and moving on—scored 17 percentage points lower on comprehension assessments than those who didn't use AI (50% vs 67%, p=0.01).

What makes this an extensions story: when learners used a different interaction pattern—generating code with AI and then asking follow-up questions about what was generated—their comprehension was preserved. Same model. Same task domain. The variable was whether the interaction pattern maintained the learning extension or stripped it.

A preprint from MIT Media Lab, "Your Brain on ChatGPT" (2025), points in a similar direction: LLM-assisted writers showed reduced brain connectivity compared to unassisted writers, and most couldn't recall passages from essays they had just written. It's one study, and a preprint, so the specifics deserve caution. But the pattern is consistent with the Anthropic finding: something about the cognitive process is different when AI mediates the work, even when the output looks the same.

In both cases the deliverable was fine. What the practice normally does for the practitioner was diminished. A productivity-and-quality lens can't see this. The handoff analytic can, because it asks: what did this practice do besides produce output?

Craft and identity

When AI enters creative and skilled work, the relationship between the practitioner and the work shifts. "Writing code" becomes something closer to sampling, selecting, integrating. A survey of artisans and craftspeople found that 67% felt their work "lost soul" when AI was involved. The object might be equivalent. The experience of making it isn't.

The practical consequence: if the practice of doing skilled work is what produces skilled workers, and that practice is being restructured so the skill-building aspects are bypassed, the pipeline of future practitioners thins. The handoff analytic asks: when the function of code-writing transfers to an AI-assisted workflow, does the function of skill-development transfer with it? The Anthropic study suggests the answer depends on the interaction pattern, which is to say on the extension structure of the new practice.

Pace and boundaries

One thing that keeps showing up in practitioner accounts is what happens when the slow parts of work go away. The slow parts weren't just slow. They were doing something.

Ethnographic research reported in Harvard Business Review found that AI in knowledge work "increases pace, expands task scope, extends work into more hours." Reducing the cost of producing a draft doesn't mean fewer drafts and an earlier evening. It means more drafts, more tasks, and an expanding workday.

One developer in a practitioner account I collected put it this way: "The innate slowness of coding helps with letting things 'simmer' in the back of the mind." That simmering is a temporal extension, a reflective pause built into the pace of the work. It's not inefficiency. It's where certain kinds of thinking happen. When AI compresses the production timeline, the reflective extension gets removed; the speed is bought by stripping it.

The handoff analytic asks: when production speed increases, does the reflective function that slowness provided transfer to some other part of the workflow? Or does it just disappear? If it disappears, the output might be fine while the practitioner's relationship to the work degrades: more output, less thought, expanding hours.

Accountability

Madeleine Clare Elish (2019) introduced the concept of "moral crumple zones"—points in a sociotechnical system where human operators absorb blame for system failures they couldn't meaningfully prevent. The metaphor is from automotive safety: crumple zones absorb impact. In human-AI systems, the human absorbs liability.

"Human-in-the-loop" designs can create exactly this structure. The human is nominally responsible for oversight but lacks the time, information, or practical ability to intervene meaningfully. The function of accountability is formally assigned but practically hollowed out. The handoff analytic asks: does accountability transfer with the function, or does it get redistributed in ways that create exposure without control?

This connects to the extensions framework directly. In the BCG experiment, consultants working alone were accountable for their output through normal professional mechanisms—their name on the deliverable, their reputation, their firm's quality standards. When AI entered the workflow, the formal accountability didn't change (still their name on it) but the practical conditions for exercising it did. They couldn't fully evaluate what the AI contributed. The accountability extension was formally present but practically degraded.

Repairability is another dimension here—Post 3 treated it at length. When AI-assisted speed compresses the review window, the repair loop degrades even if the average output is fine.

What the BCG study wasn't designed to see

None of this is a criticism of Dell'Acqua, Mollick, and colleagues. They measured what they set out to measure, and what they found was important. The jagged frontier is real and consequential.

But when the BCG experiment is taken as the primary frame for understanding AI in work—which it often is, given how widely it's been cited—then the dimensions it didn't measure become invisible. Productivity went up. Quality was maintained or improved (inside the frontier). So what's the problem?

The handoff analytic says: those measurements tell you about one function among several. Consulting builds junior consultants' skills. It sustains craft identity. It operates at a pace that allows reflection. It carries accountability structures. Each of these is a function that may or may not survive the reconfiguration.

The Abdu et al. (2024) paper on the Census Bureau's shift to differential privacy is the clearest parallel. The shift was functionally equivalent—privacy was preserved. But the handoff analytic revealed that the reconfiguration made policy levers less visible and redistributed decision-making authority. Output equivalence masked a governance shift. That's the pattern here too: output equivalence masking shifts in skill, craft, pace, accountability, and repair.

The lens that makes it visible

I keep returning to the handoff analytic because it's doing real analytical work, not adding a normative overlay. These dimensions—skill, craft, pace, accountability, repair—are structurally part of what changes when AI enters a practice. Without a sociotechnical lens they're invisible to the evaluation.

The earlier posts in this series covered where the frontier is jagged (Post 1), why the first mile matters (Post 2), and whether mistakes, once caught, can be repaired (Post 3). This post is about the dimensions beyond the output.

Together they say: the jagged frontier is not a single line measuring task performance. It's a multidimensional surface. A practice can be smooth on output quality and jagged on skill formation. The same AI system, in the same workflow, producing good output, can still be degrading the practice along dimensions that output metrics don't capture.

What I'm working on

The full argument (extensions, inputs, repairability, and the multidimensional frontier) is being developed as a paper with Deirdre K. Mulligan, my dissertation co-chair and co-developer of the handoff analytic. The handoff analytic is what connected these pieces.

The practical question is whether this framework can be made operational—useful for people designing AI-assisted workflows, not just describing them after the fact. If the handoff analytic tells you what to look for, and the extensions framework tells you where to look, then together they should give practitioners and policymakers something concrete: before deploying AI into a practice, map the extensions, ask what else the practice does beyond producing output, and check whether those functions survive the reconfiguration.

That's what I think is at stake. Not whether AI makes output better—often it does. But whether the other things the work was doing survive the process.