The Extended Frontier: The Gate That Makes You Type

Daniel Griffin · Hypandra · 6 min read

Working Draft

This post is a working draft, developed collaboratively with Claude Opus 4.6 (1M context) in Claude Code (Anthropic) across multiple sessions, using a variety of extensions: persona-based reviews, citation verification, AI detection analysis (Pangram Labs), deep research reports, and dissertation search (qmd). The argument, evidence curation, and editorial direction are Daniel’s; much of the prose was initially generated by Claude and is being iteratively rewritten. A changelog tracks the development. We will continue to write about how we’re exploring this idea—the process is part of the argument.

Post 5 cited Goddard et al.'s finding that undetected automation failures increase subjective trust. The Dutch experiment showed the fix: salient, usable alternative cues that keep verification instincts alive. manrun is the implementation—a CLI gate that blocks agent-initiated production deploys and forces the human to type a confirmation sentence with randomized code words. The first real use was approving a Vercel production deploy for a civic permit preflight demo. The tool doesn't add information. It adds friction—and the friction is the extension.


A note on process

The first draft of this post was generated by Claude Opus 4.6 running inside my rig harness—the same system the post describes. I directed the scope, revised through multiple passes, and am responsible for what's published here. The process is itself an instance of the argument the series makes.

Changelog
  • 2026-05-03 — First draft generated by agent, revised by Daniel.

*This is part of [The Extended Frontier](/2026/03/24/extended-frontier) series.*

The literature said to build this

Post 5 cited Goddard et al.'s systematic review of 74 automation bias studies. The finding that matters here: when automation failures go undetected, subjective trust increases. Not stabilizes—increases. The user encounters reliable performance, doesn't encounter failures (because they're invisible, not absent), and calibrates trust upward. The checking atrophies.

Post 5 also cited the counter-evidence. The Dutch public-sector experiment showed automation bias disappearing when salient, usable alternative cues were provided alongside AI recommendations. The bias isn't intrinsic to using AI. It's about whether the environment keeps the practitioner's verification instincts alive on the dimensions the primary extensions don't cover.

And then the aviation qualifier: pilots committed errors 55% of the time even when correct cross-check information was available. Extensions must be engaged, not just available.

I read those studies. I wrote about them. Then I built the tool they described.

What manrun does

My agent harness runs hooks—shell scripts that fire before certain tool calls. One hook, lint-vercel-prod, blocks production deploys. When an agent session decides to deploy to production, the hook intercepts the command, writes context to a stash file, and tells the agent: "Blocked. The human needs to approve this in a separate terminal."
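
For concreteness, here is a minimal sketch of the shape such a hook might take. The stash path, the JSON field names, and the MANRUN_AGENT_REASON variable are assumptions for illustration, not manrun's actual interface.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a lint-vercel-prod-style hook.
# Stash path, field names, and env var are assumed, not manrun's real interface.
CMD="$1"                          # the command the agent is about to run
STASH="$HOME/.manrun/stash.json"  # assumed stash location

if echo "$CMD" | grep -q 'vercel.*--prod'; then
  mkdir -p "$(dirname "$STASH")"
  # Stash enough context for manrun to present what/why/when plus the agent's claim.
  jq -n \
    --arg cmd   "$CMD" \
    --arg claim "${MANRUN_AGENT_REASON:-unstated}" \
    '{command: $cmd,
      what: "Redeploys the project to production on Vercel",
      why_blocked: "Prod deploys normally go through GitHub PR merges so code review and CI run first",
      when_ok: "Env-var-only changes that need a redeploy; emergency hotfixes where the PR path is too slow",
      agent_claim: $claim}' > "$STASH"
  echo "Blocked. The human needs to approve this in a separate terminal (run manrun)." >&2
  exit 1  # nonzero exit tells the harness to abort the tool call
fi
```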

manrun is the gate the human runs. It reads the stash and presents:

  • WHAT the command does ("Redeploys permit-preflight to production on Vercel")
  • WHY it was blocked ("Prod deploys normally go through GitHub PR merges so code review and CI run first")
  • WHEN it's okay ("Env-var-only changes that need a redeploy. Emergency hotfixes where the PR path is too slow")
  • What the agent claims ("Initial deploy of ORD-206 permit preflight demo—static HTML, noindex, needs subdomain")
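
The presentation step itself can be as simple as formatting those stashed fields. This sketch reuses the assumed stash layout from the hook sketch above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of manrun's summary screen (stash layout assumed above).
STASH="$HOME/.manrun/stash.json"
jq -r '
  "WHAT:  \(.what)",
  "WHY:   \(.why_blocked)",
  "WHEN:  \(.when_ok)",
  "CLAIM: \(.agent_claim)",
  "CMD:   \(.command)"
' "$STASH"
```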

Then it asks you to type. Not click. Type.

Step 1: the project name. Step 2: a word puzzle. Four random words—"chalky pelican bouncy walrus"—and an instruction like "type the 1st and 4th." The full confirmation sentence: "I am approving a production deploy on permit-preflight with code chalky walrus."

The words change every time. The position selection is randomized. You cannot muscle-memory through it. You have to read, select, compose, type.
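
A sketch of the puzzle logic, assuming the words come from the system dictionary (the post doesn't say where manrun actually draws them from):

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the confirmation puzzle; the wordlist path and prompt
# wording are assumptions matching the examples above.
PROJECT="permit-preflight"
ORDINALS=(1st 2nd 3rd 4th)

# Four random words; /usr/share/dict/words is common but not universal.
mapfile -t WORDS < <(shuf -n 4 /usr/share/dict/words)

# Two distinct positions in randomized order, so there is no fixed pattern
# to habituate to.
mapfile -t POS < <(shuf -e 0 1 2 3 | head -n 2)
CODE="${WORDS[POS[0]]} ${WORDS[POS[1]]}"

echo "Words: ${WORDS[*]}"
echo "Type the ${ORDINALS[POS[0]]} and ${ORDINALS[POS[1]]}."
EXPECTED="I am approving a production deploy on $PROJECT with code $CODE"

read -r -p "> " TYPED
if [ "$TYPED" != "$EXPECTED" ]; then
  echo "Mismatch. Nothing approved." >&2
  exit 1
fi
echo "Approved."
```

The two shuf calls are what keep each invocation fresh; nothing else about the gate has to change between runs.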

Why typing matters

The design choices are all responses to the literature.

Salience. Goddard's finding implies that the intervention must interrupt the flow of smooth operation. A "Continue? [Y/n]" prompt fails this test—the hand is already in motion, the "y" is reflexive. manrun requires opening a new terminal tab, reading a full-screen summary, and composing a sentence. The physical interruption is the point.

Usability. The Dutch experiment succeeded because the alternative cues were usable, not just available. manrun surfaces the hook's reasoning—why this command was blocked, when it's appropriate to approve—so the human has the information needed to make a real decision. The decision aid and the gate are the same object.

Engagement. Aviation data shows that availability isn't engagement. The randomized word puzzle ensures cognitive engagement. You can't approve without reading which words to use and in what order. The irrelevance of the words is deliberate—"chalky walrus" has no semantic relationship to production deploys, which means you must actually parse the instruction rather than pattern-match against expectations.

Irregularity. The position selection changes every invocation. First and fourth, fourth and first, second and third. The human can't habituate to a fixed pattern. This addresses the specific failure mode Goddard identified: trust calibrating upward through repeated smooth interactions. Each manrun interaction is different enough to require fresh attention.

The first real use

May 3. I had a Codex session building a civic permit preflight demo: a static HTML artifact for a business meeting. The session finished the build and tried to deploy to Vercel production. The lint-vercel-prod hook fired. Command stashed.

I opened a new tab. Ran manrun. Read the summary: static HTML page, noindex meta tag, new subdomain, no user data at risk. The agent's reason made sense. The project name was right. The code words were "chalky pelican bouncy walrus," positions 1 and 4.

I typed: "I am approving a production deploy on permit-preflight with code chalky walrus."

Deploy ran. Site went live. Total added time: maybe forty seconds.

Was the friction necessary? The deploy was fine. Nothing bad would have happened if the agent had just run it. That's exactly the condition under which automation bias accumulates. The reliable outcomes train you to stop checking. The gate's value isn't in catching the one bad deploy—it's in keeping the checking alive for when it matters.

What this shows about extensions

manrun is an extension in the framework's sense. It has specific capacities: it grounds "did the human consciously approve this production action?" It doesn't ground "is the code good?"—that's what CI and code review do. It doesn't ground "is this the right feature to build?"—that's what /pitch-me and order files do.

What it uniquely does is address the trust drift problem Post 5 identified. Extensions that catch most errors create conditions for overreliance on the dimensions they don't cover. The human's experience of smooth operation—green CI, passing tests, successful deploys—trains them out of the verification habits that would catch the error the extensions miss.

manrun is a meta-extension: an extension whose capacity is keeping the human engaged in the act of approving, rather than checking any particular property of what's being approved. It's not adding information the human doesn't have. It's adding friction that the literature says is necessary to maintain the human's role as a genuine check rather than a rubber stamp.

The Dutch experiment is the closest analogue. Those researchers provided salient cues alongside AI recommendations, and automation bias disappeared. manrun provides the cues (what, why, when-ok) alongside the agent's proposed action, then adds the engagement guarantee (type the sentence) that aviation research says mere availability doesn't provide.

The design constraint from aviation

Pilots with correct cross-check information still erred 55% of the time. The information was available. The conditions for engaging with it—workload, time pressure, interface design—weren't.

manrun addresses this by controlling the conditions. You run it in a separate terminal, at your own pace, with nothing else competing for attention. There's no countdown. No flashing warning you need to dismiss to get back to your real task. The gate is the task until you complete it. The design removes the conditions that aviation research identified as undermining engagement with available information.

Is this sufficient? I don't know. The tool has been used once. The theory says it should work—but theory said available cross-check information should work too, and it didn't always. Whether manrun's particular design actually maintains verification quality over months of smooth operation is an empirical question I can't answer yet.

Where this connects

Post 5 made a theoretical argument: extensions have specific capacities, and confidence from covered dimensions generalizes to uncovered ones. Trust drift means the absence of detected errors increases confidence rather than maintaining vigilance.

DSG-26 showed what happens when orientation extensions don't cover a dimension—the /pitch-me skill had no capacities tuned to research publishing, and the human redirected in one sentence.

This post sits between theory and longer-term evidence. The tool is built. The theory predicts it should help. The first use felt right—the forty seconds of reading and typing gave me a moment to actually consider whether this deploy should happen, rather than assuming it should because everything else had gone well.

That's thin evidence. One use, one person, one deploy that would have been fine either way. But thin evidence of the right kind is how the library's shelving analysis says practice knowledge starts: observe the mechanism in action, document what happened, let time and repetition either confirm or complicate the initial observation.

The gate made me type. I had to read the words, pick two, compose a sentence. For forty seconds, I was a human making a decision rather than a human approving a machine's decision. The literature says that distinction matters. I built the tool because I believed it. Now I'm watching whether it holds.