The Verification Economy and the Intelligence Crisis


The Question

At some point in 2027 or 2028, the cost of having a machine produce a unit of measurable cognitive work will drop below the cost of having a person do it. What happens after that is where people start disagreeing, and disagreeing badly.

One camp sees catastrophe. Citrini Research’s “The 2028 Global Intelligence Crisis,” which has been making the rounds, lays out a deflationary spiral: AI replaces workers, workers stop spending, demand collapses, and the economy enters a structural recession that conventional policy can’t fix. The other camp, drawing on the framework in “Some Simple Economics of AGI,” thinks the real bottleneck isn’t abundant machine intelligence but the scarcity of institutional capacity to verify what that intelligence produces. Can we certify it? Can we take responsibility for it? Who bears the liability when it’s wrong?

Both camps agree on the technology. They disagree on what’s actually scarce in the economy that follows. That disagreement isn’t optimism versus pessimism. It’s a structural argument about where value accumulates when intelligence gets cheap.

Getting this wrong has consequences. The demand-collapse scenario calls for massive fiscal intervention to replace lost consumer spending. The verification scenario calls for building out liability frameworks, certification regimes, and audit infrastructure. Preparing for the wrong one is worse than doing nothing.

I want to try to adjudicate here. What follows subjects the Citrini scenario to sustained examination through the verification lens, identifies where each framework is strongest, diagnoses where each breaks down, and constructs a revised picture of the late 2020s that takes the best of both.

The Declining Cost of Measurable Execution

The empirical starting point isn’t in dispute. On the SWE-bench software engineering benchmark, frontier model accuracy went from 4.4 percent to 71.7 percent in roughly a year. Agent task horizons (the length of time a system can operate autonomously before it needs a human) have been doubling sub-annually. Models are contributing to their own development. Inference costs for frontier models have been dropping at about 10x per year for equivalent capability, with no sign of leveling off.

In plain economic terms: any cognitive task whose output can be scored against clear criteria is getting radically cheaper. Marketing copy, legal research memos, first-draft financial models, functional software prototypes, insurance claims processing, logistics coordination, document summarization, support ticket triage. The list keeps growing. For each, the cost curve is heading toward something close to zero marginal cost, bounded only by compute and energy.

Citrini takes this observation and runs it forward. By late 2025, agentic coding tools hit a “step function jump.” Enterprise procurement teams start using AI to replicate six-figure SaaS contracts in-house. ServiceNow’s Q3 2026 earnings report, in the scenario, reveals the mechanism: the same AI headcount reductions that boost margins at customer firms mechanically destroy the revenue base of software vendors selling per-seat licenses. The per-seat model, which propped up the entire SaaS valuation framework, starts to unravel.

From software, the disruption spreads. By early 2027, AI agents handle consumer decisions autonomously, processing 400,000 tokens per person daily. Travel booking, insurance renewals, financial advice, real estate commissions, food delivery, payment processing: each depends on friction, on the gap between what a consumer could theoretically achieve with perfect information and what they actually achieve given limited time and attention. AI agents close that gap. Business models built on human limitations begin to crumble.

So far, so plausible. The question is what follows.

Two Theories of What Goes Wrong

The Demand-Collapse Model

Citrini’s answer is the “Intelligence Displacement Spiral,” a feedback loop with no natural brake. AI gets better. Firms cut headcount for margin. Displaced workers spend less. Weaker demand compresses margins elsewhere. Those firms invest harder in AI. Capabilities improve again. The loop tightens.
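The loop can be made concrete with a toy simulation. Every number here — the quarterly displacement rate, the spending elasticity, the reinvestment feedback — is an illustrative assumption, not a parameter from the scenario; the point is only that a feedback loop with no brake produces monotone, self-reinforcing decline.

```python
# Toy model of the "Intelligence Displacement Spiral". All parameters
# are assumptions for illustration, not estimates from Citrini's scenario.

def simulate_spiral(quarters=8, displacement_rate=0.03,
                    spend_elasticity=0.8, reinvest_gain=0.2):
    """Each quarter: AI adoption displaces workers, lost income cuts
    demand, and weaker demand pushes firms to adopt AI even harder."""
    employment, demand, adoption_pressure = 1.0, 1.0, 1.0
    history = []
    for _ in range(quarters):
        displaced = displacement_rate * adoption_pressure
        employment = max(0.0, employment - displaced)
        demand = max(0.0, demand - spend_elasticity * displaced)
        # Margin squeeze from weak demand feeds back into adoption.
        adoption_pressure *= 1 + reinvest_gain * (1 - demand)
        history.append((round(employment, 3), round(demand, 3)))
    return history

for q, (emp, dem) in enumerate(simulate_spiral(), start=1):
    print(f"quarter {q}: employment {emp}, demand {dem}")
```

With no offsetting term — no policy response, no new verification-side hiring — employment and demand only fall, which is exactly the structure the critique below targets.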

The scenario’s projected consequences are dire. By June 2028: unemployment at 10.2 percent, S&P 500 down 38 percent from its October 2026 highs, labor’s share of GDP collapsed from 56 percent to 46 percent (the sharpest decline in modern history). The $2.5 trillion private credit market, loaded with PE-backed software companies whose recurring revenue assumptions have failed, is in crisis. The $13 trillion mortgage market is under stress as prime borrowers face structural income impairment. The federal tax base, which is basically a tax on human time, erodes just as demand for social spending explodes.

At the center sits “Ghost GDP”: aggregate output that shows up in national accounts but never circulates through households because the machines generating it don’t buy anything. Machines spend zero on discretionary goods. Money velocity flatlines. The circular flow of the economy breaks.

The Verification-Constraint Model

The alternative starts somewhere different. For any task with real economic consequences, execution is only half of what matters. The other half is verification: figuring out whether the work was correct, reliable, safe, aligned with the principal’s actual interests, and fit for purpose. A legal brief must be written and then reviewed. A diagnosis must be generated and then confirmed. A financial model must be built and then audited. Software must be coded and then tested under conditions that matter.

Execution costs are falling toward zero for anything that can be scored against clear metrics. Verification costs follow a different curve entirely. They’re bounded by human cognition, institutional processes, liability structures, regulatory requirements, and the slower pace at which trust accumulates. The cost to verify bends slowly. It depends on how fast human experts can evaluate output, how long feedback loops take in complex systems, and the pipeline through which new verifiers get trained, a pipeline that itself depends on the apprenticeship structures cheap execution is disrupting.

This asymmetry creates what the framework calls the “measurability gap”: a widening spread between the cost of producing output and the cost of certifying that the output is trustworthy. As this gap grows, the economy doesn’t simply automate. It reorganizes around a new scarcity. Intelligence, once the binding constraint on growth, becomes abundant. Verification becomes the bottleneck.
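A minimal sketch of the two curves, using the roughly 10x-per-year execution cost decline cited earlier. The 10-percent-per-year verification decline is my assumption for illustration; the framework claims only that verification costs bend slowly. Measured as a ratio, the gap widens every year.

```python
# Sketch of the "measurability gap": execution cost falling ~10x/year
# (per the benchmark trends above) versus verification cost declining
# slowly (10%/year here is an assumed figure, not a sourced one).

def cost_curves(years, exec_cost0=100.0, verify_cost0=100.0,
                exec_decline=10.0, verify_decline=1.1):
    """Return (year, execution cost, verification cost, verify/execute ratio)."""
    rows = []
    for t in range(years + 1):
        execution = exec_cost0 / exec_decline ** t
        verification = verify_cost0 / verify_decline ** t
        rows.append((t, execution, verification, verification / execution))
    return rows

for t, e, v, ratio in cost_curves(3):
    print(f"year {t}: execute ${e:.2f}, verify ${v:.2f}, ratio {ratio:.1f}x")
```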

The practical upshot is that labor displacement gets governed by institutional verification capacity, not by AI capability. Firms can’t just fire everyone and replace them with agents, because someone still has to verify what the agents produce, certify it, and bear liability when it’s wrong. Displacement proceeds task by task, domain by domain, at the speed of verification infrastructure buildout, not at the speed of capability curves.

The Measurability Fault Line

The verification framework’s most important prediction is about which tasks are vulnerable and which aren’t. The conventional dividing lines — digital versus physical, routine versus creative, low-skill versus high-skill — are secondary. The primary fault line is measurability.

Any task whose output can be scored, ranked, or evaluated against well-defined criteria can, in principle, be industrialized. This includes work conventionally considered high-status or analytically complex: legal research, diagnostic radiology, financial modeling, marketing campaign design, even parts of strategic planning. The condition is that the output has to be evaluable.

Tasks whose outputs resist measurement stay resistant to automation regardless of model capability. A corporate board’s value doesn’t lie in the analytical quality of its recommendations (an AI could match or exceed that) but in the legal and fiduciary accountability that board members personally bear. A surgeon’s value isn’t purely technical skill but the liability structure and patient trust attached to a named human. A senior auditor’s value isn’t computational thoroughness but the professional certification and legal standing that make the audit matter.

This gives you a four-quadrant map of the economy. Low automation cost and low verification cost (simple digital tasks like image generation, chat, short code snippets): rapid AI adoption. Low automation cost but high verification cost (complex autonomous operations, high-stakes decisions, anything involving liability): bottleneck forms. High automation cost (physical tasks requiring dexterous robotics): slow AI impact regardless. The most consequential quadrant, and the one most relevant to the Citrini scenario, is where execution has become cheap but verification remains expensive and scarce.
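The map above can be written as a small lookup. The placements follow the text's examples; the function itself is just an illustrative encoding, and both high-automation quadrants collapse into one outcome, matching the "slow AI impact regardless" point.

```python
# The four-quadrant map as a lookup table. Quadrant placements follow
# the examples in the text; the encoding itself is illustrative.

def quadrant(automation_cost, verification_cost):
    """Both arguments are 'low' or 'high'."""
    outcomes = {
        ("low", "low"): "rapid AI adoption (simple digital tasks)",
        ("low", "high"): "verification bottleneck (high-stakes, liability-bearing work)",
        ("high", "low"): "slow AI impact regardless (physical tasks)",
        ("high", "high"): "slow AI impact regardless (physical tasks)",
    }
    return outcomes[(automation_cost, verification_cost)]

print(quadrant("low", "high"))
```

The quadrant most relevant to the Citrini scenario is `("low", "high")`: execution has become cheap, but certifying the output has not.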

Where Citrini Gets It Right

Before getting into where the scenario fails, it’s worth being clear about what it gets right, because the correct observations are substantial.

The reflexive adoption loop is real. When firms see rivals cutting costs through AI, the pressure to follow is intense and rational. Citrini’s account of how threatened firms become AI’s most aggressive adopters, creating a coordination problem where individually rational decisions produce collectively bad outcomes, is well-grounded. No single firm can slow down unilaterally without losing share.

Intermediation disruption is already visible. Many business models rest on friction. AI agents that close the information gap will compress margins in travel booking, insurance distribution, financial advice, and similar sectors. The magnitude is debatable; the direction isn’t.

The distributional diagnosis is sound. An economy where productivity gains flow mostly to capital owners while labor income stagnates has a demand problem. Citrini’s concern about the circular flow getting disrupted if gains aren’t broadly shared reflects standard Keynesian reasoning, backed by decades of evidence on the link between income distribution and aggregate demand.

The software sector analysis is sharp. The per-seat SaaS model is genuinely vulnerable. When AI-assisted internal development can replicate mid-market SaaS at a fraction of subscription cost, the valuation framework supporting a multi-trillion-dollar software industry comes under pressure. Software is among the most measurable domains in the economy, and verification of “does the code work?” is comparatively cheap, which means the verification bottleneck is weakest here. Citrini’s prediction of software disruption is actually well-supported by the verification framework.

Ghost GDP captures something real, even if the metaphor is analytically imprecise. Productivity gains that don’t translate into broadly distributed income do weaken the consumer economy. The mechanism isn’t as clean as Citrini presents (profits don’t simply vanish; they’re reinvested, distributed, or taxed), but the distributional concern is legitimate.

The “missing junior loop” is a structural vulnerability both frameworks identify. Firms that automate entry-level work simultaneously dismantle the training pipelines that produce future experienced professionals. This is a slow-motion problem that doesn’t need any particular macro scenario to be damaging.

Where Citrini Goes Wrong

The Intelligence-to-Output Conversion Error

The deepest analytical problem in the Citrini scenario is the assumption, mostly implicit, that intelligence converts efficiently to economic output. If AI can do the cognitive work, the economic value follows automatically.

The verification framework says this is a category error.

Intelligence is necessary but not sufficient. Economic output requires generating work product and then validating it, integrating it into institutional processes, certifying it for regulatory compliance, and assuming liability for the consequences. When execution is abundant, the marginal value of more execution approaches zero. Value migrates to what remains scarce: verification, certification, judgment under ambiguity, accountability.

Take Citrini’s central example: a Claude agent replacing a $180,000 product manager for $200 per month. In the demand-collapse model, the replacement is clean. The agent does the work, the worker is displaced, and roughly $177,600 a year (the salary minus the agent’s $2,400 annual cost) exits the circular flow.

But product management involves reconciling ambiguous stakeholder priorities, navigating organizational politics, making long-horizon strategic bets, and bearing personal accountability for outcomes that may not be measurable for months or years. Can the firm verify the agent’s output is correct? That it hasn’t introduced subtle misalignments with unstated objectives? That the resulting decisions are legally defensible? That someone bears professional liability?

For many product management functions, the answer is no. The displacement is partial at best. The firm uses the agent to boost the product manager’s execution capacity while keeping the human for verification, judgment, and accountability. Wage compression may be real, but the displacement isn’t the clean unit-for-unit substitution Citrini’s model requires.

This pattern repeats across the white-collar economy. Displacement concentrates in the most measurable roles: junior legal research, routine financial analysis, first-draft content, standard code generation, administrative coordination. It’s much slower in roles where verification is expensive: strategic planning, client relationships, complex negotiation, high-stakes professional judgment. Citrini’s model treats these as one category. They aren’t.

The Institutional Inertia Assumption

The Citrini scenario holds three variables fixed while letting AI capability compound: policy response, labor market adaptation, and demand recycling. This modeling choice does enormous work in generating the catastrophic outcome.

The assumption is hard to defend. Institutions aren’t passive recipients of technological shocks. They’re adaptive systems that respond to economic signals, with delay. “Slower” is different from “nonexistent.” Citrini’s scenario requires that institutions effectively don’t respond at all over a two-year period during which unemployment more than doubles, equities fall nearly 40 percent, and the mortgage market comes under systemic stress.

History doesn’t support this. The 2008 crisis produced TARP, emergency Fed lending facilities, fiscal stimulus, and FDIC deposit guarantees within months. The COVID-19 pandemic produced $5 trillion in fiscal response in under a year: direct payments, enhanced unemployment insurance, the Paycheck Protection Program. Every major postwar recession has produced policy responses that, however imperfect and delayed, materially altered the downturn’s trajectory.

Citrini’s scenario requires the political system to watch unemployment double and the mortgage market destabilize and do approximately nothing for two years. Not impossible. But it’s a strong assumption that should be stated as an assumption rather than buried as a modeling choice.

A Revised Scenario for the Late 2020s

Drawing on both frameworks, you can construct a scenario for the late 2020s that is more internally consistent than either alone.

The labor market splits along the measurability fault line. Employment in highly measurable execution roles declines 15 to 25 percent in affected occupations, mostly through hiring freezes. Employment in verification-intensive roles grows. Net aggregate unemployment rises to perhaps 5.5 to 6.5 percent: meaningful but manageable, a recession rather than a crisis.

Wage dynamics are more complex than either model alone predicts. Execution-oriented workers experience significant real wage compression. Verification-oriented workers — senior auditors, AI safety engineers, compliance officers, experienced professionals whose judgment can’t be easily replaced — command rising premiums. Aggregate labor share of income declines by two to four percentage points, concentrated in specific occupational categories. The Gini coefficient increases. The political consequences of this inequality are real and potentially severe, but they’re distributional problems, not an aggregate demand collapse.

The characteristic firm organization becomes a “thin firm”: a small core of senior professionals exercising verification and judgment, supported by extensive AI execution systems and a modest layer of junior workers in structured training designed to develop verification skills. The traditional pyramid gets replaced by a diamond: few juniors, an AI execution layer, many verification-oriented mid-level professionals, a smaller senior leadership team.

Economic rents migrate from intelligence generation to verification and certification. Professional certifications become more valuable. Firms with proprietary verification systems earn returns well above cost of capital. Insurance companies underwriting AI-generated outputs become central to economic functioning. The intermediation layer transforms rather than collapses.

What Remains Uncertain

Both frameworks depend on assumptions that could go either way.

The biggest wildcard is whether AI learns to verify its own outputs reliably. If it does, the verification bottleneck weakens dramatically and the timeline compresses toward something much closer to Citrini’s scenario. Right now, verification looks harder to automate than execution, because it requires understanding context, intent, and consequence in ways current models handle poorly. But this is an empirical question. It could resolve in either direction within two to three years.

Institutional responsiveness matters enormously. The revised scenario assumes twelve to thirty-six months of policy delay, which is consistent with historical precedent. But if political polarization or regulatory capture extends that delay, the costs mount and the picture starts looking more like Citrini’s.

Physical automation could change the picture substantially. Both frameworks focus on cognitive work. If robotics advances faster than expected, displacement extends to manual and service occupations that both models treat as relatively safe, and the aggregate numbers get worse.

Conclusion

The Citrini scenario and the verification framework aren’t just competing predictions. They’re competing models of how AI plugs into the economy, and the choice between them shapes what risks you prepare for and what interventions you think will work.

Citrini captures something real: the distributional pain of AI-driven productivity gains, the vulnerability of business models built on friction, the adoption dynamics that make collective restraint impossible, and the genuine hardship of workers displaced from roles they expected to last a career.

But the macroeconomic conclusion — a deflationary spiral driven by uncontrolled substitution — rests on a labor-substitution model that’s incomplete, an institutional-inertia assumption that doesn’t hold up historically, and a closed-system model of value flow that ignores real channels of recirculation. The verification framework identifies the bottleneck that governs the pace and pattern of displacement, predicts new economic activity around verification and certification, and generates a timeline more consistent with how institutions actually behave under stress — which is badly, but not inertly.

The transition will be painful, uneven, and politically destabilizing. On that, Citrini is right. But it will be slower, messier, and more responsive to institutional intervention than the demand-collapse model assumes. The binding constraint isn’t a shortage of demand in a world of abundant intelligence. It’s a shortage of trust, in a world where trust has to be earned one institution and one liability framework at a time.

The canary is still alive. It’s singing about verification.

Originally published on X.com
Nick Sawinyh

Web3 BD & Product Strategist