Verification Scarcity: A Systems Model of Agentic AI Constraints

Version 2.0


Preamble

This document emerged from constraint-based analysis of agentic AI deployment — not from literature review. The conclusions below follow from first principles: thermodynamics, information theory, computational complexity, and organizational economics. Where empirical data confirms the structural predictions, it is cited. Where empirical data is incomplete, unknowns are explicitly marked [UNKNOWN: description].

The framework is not a critique of AI capability. It is a derivation of what the physical and mathematical structure of these systems requires and forbids, and what the organizational conditions are under which reliable deployment is possible at all.


I. The Physical Layer: Thermodynamic Constraints

1.1 Landauer’s Floor

Every logical operation has a minimum energy cost. The lower bound is set by Landauer’s Principle:

W_{min} = k_B T \ln 2

Where k_B is Boltzmann’s constant and T is temperature. This is not an engineering problem — it is a physical invariant. No architectural improvement eliminates it.
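A quick numeric check of the floor, as a minimal sketch in Python; the 300 K operating temperature is an illustrative assumption rather than a measured datacenter value:

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K (exact SI value)
T = 300.0            # assumed operating temperature in kelvin (illustrative)

# Landauer floor: minimum energy required to erase one bit of information
W_min = k_B * T * math.log(2)
print(f"Landauer floor at {T} K: {W_min:.3e} J per bit")   # ~2.87e-21 J
# Several orders of magnitude below current per-operation hardware energy,
# but a hard floor that no architecture can cross.
```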

1.2 Operational Power Model

The total power cost of inference at scale:

P_{total} = (C \cdot f \cdot V^2 + I_{leak} \cdot V) \cdot PUE

As V approaches the threshold voltage, I_leak dominates, creating a hard physical floor on the cost per inference operation. Architectural improvements (sparse models, neuromorphic hardware, edge compute) shift the curve but cannot cross the floor.
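A minimal sketch of the power model; every parameter value is an illustrative assumption, and treating I_leak as a constant is a simplification (in practice leakage varies with voltage and temperature):

```python
def total_power(C, f, V, I_leak, PUE):
    """P_total = (C*f*V^2 + I_leak*V) * PUE
    C: switched capacitance (F), f: clock frequency (Hz), V: supply voltage (V),
    I_leak: leakage current (A), PUE: facility power usage effectiveness."""
    dynamic = C * f * V ** 2   # switching (dynamic) power
    static = I_leak * V        # leakage (static) power
    return (dynamic + static) * PUE

# Illustrative values only. The dynamic term falls with V^2 while the static
# term falls only linearly in V, so leakage claims a growing share as V drops.
for V in (1.0, 0.8, 0.6):
    print(V, total_power(C=1e-9, f=2e9, V=V, I_leak=0.5, PUE=1.3))
```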

1.3 The Jevons Paradox of Compute

Efficiency gains do not resolve the energy constraint. As inference cost per operation decreases, demand increases by a proportionally greater amount, producing net energy consumption growth:

\frac{dE_{total}}{d\eta} > 0 \quad \text{where } \eta = \text{efficiency}

The scale of this dynamic is now measurable. US data centers consumed approximately 4.4% of total national electricity in 2023 and are projected by the US Department of Energy’s Lawrence Berkeley National Laboratory to reach 6.7%–12% by 2028 (DOE/LBNL, 2024). Globally, the IEA projects data center electricity consumption will double to 945 TWh by 2030, with AI-focused data centers growing at 30% annually (IEA, Energy and AI, 2025). In 2025, data centers accounted for approximately 50% of all US electricity demand growth (IEA / Fortune, April 2026).

The Gigawatt Ceiling is therefore a demand-side constraint as much as a supply-side one. Infrastructure buildout accelerates utilization faster than grid capacity scales.


II. The Capital Layer: Hardware Obsolescence Cascade

2.1 The Stranded Asset Mechanism

Data center capital is financed on mismatched amortization schedules:

When next-generation architecture renders current inference hardware economically uncompetitive before debt service completes, the asset’s revenue-generating capacity falls below its financing cost. The debt does not obsolete with the hardware.

2.2 Capital Entropy

Let A(t) be the economic value of deployed hardware at time t, and D(t) the outstanding debt obligation:

U_{capital}(t) = A(t) - D(t)

When architectural obsolescence drives A(t) below D(t) before the amortization schedule completes, U_capital(t) < 0.

The scale of the exposure is significant. Goldman Sachs’ baseline model projects $765 billion in annual AI CapEx in 2026 across compute, data centers, and power, growing toward $1.6 trillion annually by 2031 (Goldman Sachs, April 2026). Big-5 hyperscaler spending alone reached approximately $725 billion in 2026 following Q1 earnings revisions (CFA Analysis, April 2026). The IEA notes that five large technology companies increased capex to over $400 billion in 2025, with a further increase of roughly 75% planned for 2026 (IEA, April 2026).

At these investment magnitudes, even a modest debt-financed fraction at 7-year schedules against a 2–3 year economic obsolescence horizon produces stranded exposure in the hundreds of billions USD.
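A minimal sketch of the stranded-asset arithmetic; the capex figure follows the Goldman Sachs baseline cited above, while the debt fraction, amortization schedule, and obsolescence horizon are illustrative assumptions pending the [UNKNOWN] noted below:

```python
def stranded_exposure(capex, debt_fraction, amort_years, econ_life_years):
    """U_capital at the moment of economic obsolescence, assuming the hardware's
    economic value reaches zero at econ_life_years and the debt amortizes
    straight-line over amort_years (interest ignored for simplicity)."""
    debt0 = capex * debt_fraction
    A_t = 0.0                                                   # asset value at obsolescence
    D_t = debt0 * max(0.0, 1 - econ_life_years / amort_years)   # principal still outstanding
    return A_t - D_t                                            # U_capital(t); negative = stranded

# Illustrative: $765B annual capex, 30% debt-financed, 7-year amortization,
# 2.5-year economic life. Result: roughly -$148B from a single year's buildout.
print(stranded_exposure(capex=765e9, debt_fraction=0.30, amort_years=7, econ_life_years=2.5))
```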

Historical analogs: the telecom dark-fiber overbuild (1999–2001); shale debt structured at $100/bbl against collapsed oil prices. Both produced cascading loan non-performance and destroyed lender balance sheets.

[UNKNOWN: Precise debt-financing fraction of 2026 AI infrastructure capex and lender concentration — required to size the cascade exposure accurately]

2.3 Interaction with Thermodynamic Constraint

Organizations servicing stranded hardware debt while simultaneously absorbing exponential verification overhead (Section IV) face a bilateral cost squeeze: the capital cost of past infrastructure and the operational cost of present verification both compound against revenue.


III. The Reliability Layer: Geometric Decay

3.1 Base Model

For an n-step agentic workflow where each step has independent success probability p, system reliability is:

R(n) = p^n

This is geometric decay with no floor above zero. At p = 0.99, n = 100:

R(100) = 0.99^{100} \approx 0.366

A 99%-accurate agent executing a 100-step workflow produces reliable output only 36.6% of the time. This is not an edge case — it is the central operating reality of any sufficiently complex agentic pipeline.
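The decay numbers can be reproduced directly (a minimal sketch; the printed values follow from the formula, not from deployment data):

```python
def reliability(p, n):
    """R(n) = p**n for n independent steps with per-step success probability p."""
    return p ** n

print(reliability(0.99, 100))    # ~0.366
print(reliability(0.99, 500))    # ~0.007
print(reliability(0.999, 100))   # ~0.905 -- a 10x better model only delays the decay
```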

Deployment data confirms the structural prediction. Gartner projects over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs, unclear ROI, or inadequate risk controls (Gartner, June 2025). Separately, Forrester and Anaconda 2026 data show 88% of agent pilots failing to reach production (Digital Applied, April 2026). Meanwhile, PwC’s 2026 AI Performance Study of 1,217 senior executives confirms that 74% of AI’s economic value is captured by just 20% of organizations, with the majority still trapped in pilot mode (PwC, April 2026).

3.2 The Precision Requirement

To maintain a target reliability R_target as complexity n increases, the required per-step precision is:

p_{required} = e^{\frac{\ln(R_{target})}{n}}

As n grows, p_required → 1.0. The precision requirement imposed by complexity scales faster than any realistic model improvement trajectory.
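Worked numerically (a minimal sketch of the same formula; the target value is arbitrary):

```python
import math

def p_required(R_target, n):
    """Per-step precision needed to hold R_target over an n-step chain."""
    return math.exp(math.log(R_target) / n)

print(p_required(0.95, 10))     # ~0.99489
print(p_required(0.95, 100))    # ~0.99949
print(p_required(0.95, 1000))   # ~0.99995
```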

3.3 Modified Model: Checkpointing with Correction

Insert m verification checkpoints at intervals of n/m steps, each with correction probability v_c:

R_{checkpoint} = \left[1 - (1 - p^{n/m})(1 - v_c)\right]^m

Key result: As v_c → 1.0, R_checkpoint → 1.0 regardless of n. Checkpointing with high-fidelity correction breaks geometric decay.

Critical constraint: v_c is the expensive variable. It requires a human verifier with sufficient domain expertise to distinguish a correct output from a plausible-but-wrong one. Low-authority or low-expertise verification returns v_c → 0, collapsing R_checkpoint back toward p^n.
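A minimal sketch of the checkpointing model; p, n, m, and v_c are illustrative values chosen to match the Section 3.1 example:

```python
def r_checkpoint(p, n, m, v_c):
    """R = [1 - (1 - p**(n/m)) * (1 - v_c)]**m
    m checkpoints; each failed segment is corrected with probability v_c."""
    segment_ok = p ** (n / m)
    uncorrected_fail = (1 - segment_ok) * (1 - v_c)
    return (1 - uncorrected_fail) ** m

print(r_checkpoint(p=0.99, n=100, m=10, v_c=0.0))    # ~0.366 -- no correction, back to p**n
print(r_checkpoint(p=0.99, n=100, m=10, v_c=0.95))   # ~0.953
print(r_checkpoint(p=0.99, n=100, m=10, v_c=0.99))   # ~0.990
```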

3.4 Modified Model: Parallel Redundancy

Run k independent chains, accept best or majority result:

R_{parallel}(k) = 1 - (1 - p^n)^k

At k = 3, base R = 0.366 improves to R_parallel ≈ 0.745. Cost scales linearly with k. Parallel redundancy buys reliability at proportional compute cost but does not break the underlying decay.
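The same check for parallel redundancy (a minimal sketch; note the formula assumes the successful chain can be identified, which itself presupposes some verification capacity):

```python
def r_parallel(p, n, k):
    """R = 1 - (1 - p**n)**k : the system fails only if all k chains fail."""
    return 1 - (1 - p ** n) ** k

for k in (1, 2, 3, 5):
    print(k, round(r_parallel(0.99, 100, k), 3))   # 0.366, 0.598, 0.745, 0.898
```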

3.5 Unified Reliability Model

Combining checkpointing and redundancy:

R_{net} = \left[1 - \left((1 - p^{n/m})(1 - v_c)\right)^k\right]^m

At each of m checkpoints, k parallel chains each fail and go uncorrected with probability (1 - p^(n/m))(1 - v_c). The checkpoint is lost only if all k chains fail uncorrected: ((1 - p^(n/m))(1 - v_c))^k. This is consistent with the correction model in Section 3.3; v_c remains an independent correction event, not a multiplier on per-step success probability.
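A minimal sketch of the unified model, showing that the organizational variable v_c moves reliability far more cheaply than the compute variable k (all parameter values are illustrative):

```python
def r_net(p, n, m, v_c, k):
    """R_net = [1 - ((1 - p**(n/m)) * (1 - v_c))**k]**m"""
    uncorrected_fail = (1 - p ** (n / m)) * (1 - v_c)
    return (1 - uncorrected_fail ** k) ** m

base = dict(p=0.99, n=100, m=10)
print(r_net(**base, v_c=0.0, k=1))     # ~0.366 -- degenerate case, pure p**n
print(r_net(**base, v_c=0.5, k=1))     # ~0.613
print(r_net(**base, v_c=0.95, k=1))    # ~0.953 -- correction alone, no extra compute
print(r_net(**base, v_c=0.95, k=2))    # ~0.9998 -- correction plus modest redundancy
```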

The only lever that breaks geometric decay without proportionally scaling compute cost is v_c. All other optimizations (parallelism, checkpointing without correction) are cost multipliers on the same degrading base. The escape from reliability decay is not an engineering problem — it is an organizational one.

3.6 Time-Variant Precision Decay

p is not static. As models are trained on increasing volumes of agent-generated synthetic data, the Kullback-Leibler divergence between the training distribution and ground truth grows:

D_{KL}(P \| Q) = \sum_{x \in X} P(x) \ln\left(\frac{P(x)}{Q(x)}\right)

Where P is the ground-truth distribution and Q is the synthetic-data-contaminated training distribution. As D_KL → ∞, the model loses grounding in physical reality (stochastic drift). This adds a time derivative to p:

\frac{dp}{dt} = -\gamma \cdot D_{KL}(t)

Where γ is the contamination rate. The geometric decay in Section 3.1 therefore accelerates over time independent of chain length. Reliability is a function of both n and t.
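A minimal sketch of the contamination dynamics; the toy distributions, the γ value, and the D_KL growth rate are all illustrative assumptions, since (per the [UNKNOWN] below) γ has not been measured for production models:

```python
import math

def kl_divergence(P, Q):
    """D_KL(P || Q) over a shared discrete support; P is the ground-truth distribution."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0)

def p_over_time(p0, gamma, d_kl_series, dt=1.0):
    """Euler integration of dp/dt = -gamma * D_KL(t)."""
    p = p0
    for d in d_kl_series:
        p = max(0.0, p - gamma * d * dt)
    return p

ground_truth = [0.5, 0.3, 0.2]
contaminated = [0.4, 0.35, 0.25]             # illustrative drifted training mix
d0 = kl_divergence(ground_truth, contaminated)
print(d0)                                     # ~0.021 nats for this toy pair

# Assume D_KL compounds 10% per period as synthetic data accumulates.
series = [d0 * 1.1 ** t for t in range(20)]
print(p_over_time(p0=0.99, gamma=0.05, d_kl_series=series))   # ~0.93 after 20 periods
```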

[UNKNOWN: Empirical measurement of γ for current production model families — requires longitudinal benchmark tracking against verified ground-truth datasets]


IV. The Verification Layer: Cost Asymmetry

4.1 The Fundamental Asymmetry

Generating an agentic output is polynomial-time. Verifying that output — particularly detecting plausible-but-wrong results — approaches NP-hard difficulty for sufficiently complex outputs. The generation-verification cost ratio is therefore not fixed; it worsens as output complexity increases.

Formally, verification cost C_v as a function of output complexity and human cognitive bandwidth B:

C_v = \int_{0}^{T} \frac{\text{Complexity}(O)}{B} \, dt \quad \text{where} \quad \frac{dC_v}{dn} > \frac{dC_{gen}}{dn}

Verification cost grows faster than generation cost as task complexity increases. This is the structural ceiling on agentic ROI.

4.2 The Admissibility Gap

In high-stakes domains, outputs must be not merely accurate but auditable — bound to a deterministic evidence chain. The gap between outputs that appear audit-shaped (citations, professional prose, specific numbers) and outputs that are actually admissible (bound to verifiable, resolvable evidence) is the Admissibility Gap.

AI systems produce audit-shaped outputs at high volume. The human cost of determining admissibility scales with volume. At sufficient volume, the verification budget is exhausted and admissibility checking becomes stochastic — which means high-credibility errors travel further before detection.

4.3 Verification Budget as Finite Resource

Holding a community’s verification capacity V fixed, any increase in agentic output volume O mechanically dilutes per-claim verification V/O:

\frac{d(V/O)}{dO} < 0

This is not a resourcing problem that scales away with hiring. Verification requires domain expertise with long formation timelines. The resource is structurally scarce.


V. The Organizational Layer: Authority-Expertise Decoupling

5.1 The Principal-Agent-AI Three-Node System

Traditional principal-agent problems involve two nodes: Principal (management) → Agent (employee). Agentic AI introduces a third: Principal (management) → Expert (verifier) → Agent (AI).

When the Expert is denied decision authority, Verification Latency τ is introduced. Total task cost:

C_{total} = C_{gen} + \sum_{i=1}^{k} \Psi(L_i, \sigma_i) \cdot (1 + \tau)^i

Cost grows exponentially with k, the number of verification cycles forced by the authority gap. At k = 0 (authority co-located with expertise at the verification point), C_total → C_gen.
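A minimal sketch of the latency cost model; here Ψ is treated as a flat per-cycle verification cost, a simplifying assumption since the document does not fix its functional form:

```python
def total_cost(c_gen, psi_per_cycle, tau, k):
    """C_total = C_gen + sum_{i=1..k} Psi_i * (1 + tau)**i, with Psi_i held constant."""
    return c_gen + sum(psi_per_cycle * (1 + tau) ** i for i in range(1, k + 1))

# Illustrative units: generation costs 1.0, each verification cycle costs 0.5,
# and every escalation adds 40% latency overhead.
for k in (0, 2, 5, 10):
    print(k, round(total_cost(c_gen=1.0, psi_per_cycle=0.5, tau=0.4, k=k), 2))
# k=0 -> 1.0 (authority co-located); by k=10 the latency term dominates everything else.
```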

5.2 The Equivalence: Authority Index = Correction Probability

The precision decay model (Section 3.3) and the organizational cost model (Section 5.1) express the same constraint in different frames.

Let α ∈ [0, 1] be the Authority Index — the degree to which a verifier has decision authority over the output they are evaluating.

v_c = f(\alpha) \quad \text{where} \quad \frac{dv_c}{d\alpha} > 0, \quad v_c(\alpha=0) \to 0, \quad v_c(\alpha=1) \to v_{max}

When α = 0 (the expert is an observer with no authority), the incentive to perform high-fidelity verification collapses (moral hazard). The precision decay equation:

\frac{dp}{dt} = -\lambda(1 - \alpha)

At α = 0: dp/dt = -λ — maximum decay. At α = 1: dp/dt = 0 — decay halted by motivated expert correction.

The equivalence: v_c in the reliability model and α in the organizational model are the same variable. Organizational structure is not a soft consideration adjacent to the technical reliability problem. It is a direct input to R_net.
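A minimal sketch tying the Authority Index to the reliability model. The linear mapping v_c_from_alpha is a hypothetical choice that merely satisfies the stated conditions (monotone, zero at α = 0, v_max at α = 1); the document does not commit to a functional form:

```python
def v_c_from_alpha(alpha, v_max=0.98):
    """Hypothetical linear f(alpha): dv_c/dalpha > 0, v_c(0) = 0, v_c(1) = v_max."""
    return max(0.0, min(1.0, alpha)) * v_max

def r_net(p, n, m, v_c, k=1):
    """Unified reliability model from Section 3.5."""
    uncorrected_fail = (1 - p ** (n / m)) * (1 - v_c)
    return (1 - uncorrected_fail ** k) ** m

for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, round(r_net(0.99, 100, 10, v_c_from_alpha(alpha)), 3))
# alpha = 0 reproduces bare p**n (~0.366); alpha = 1 approaches the checkpointed ceiling (~0.981).
```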

5.3 The Fundamental Invariant

When decision authority is co-located with domain expertise at the point where information and verification intersect, verification latency τ → 0, correction probability v_c → v_max, and the reliability model escapes geometric decay.

This invariant holds wherever authority-expertise co-location is achieved — across domains, organizational sizes, and industries. The specific organizational form is an instance of the invariant, not the invariant itself. The invariant is the thing.

5.4 Adverse Selection of Output

Organizations with α → 0 (authority-expertise decoupling) do not simply fail to verify — they systematically select against the outputs most in need of expert judgment. High-complexity, high-value agentic outputs require the most expertise to evaluate. Without authority at the expertise level, organizations filter these out in favor of low-complexity outputs that are easy to sign off.

Result: the measurable productivity gains from agentic AI accrue to organizations with α → 1 and disappear into verification overhead for organizations with α → 0. The PwC 74/20 split is the empirical expression of this selection effect.


VI. The Integrated System Model

6.1 Net Utility Function

Agentic system viability over time:

U_{net}(t) = V(A, D) - \left[\Phi(E) + \Psi(L, \sigma) + \Omega(M)\right]

Term | Definition
V(A, D) | Gross value: function of Autonomy A and Data Fidelity D
Φ(E) | Energy cost: E_inference × price per kWh × PUE
Ψ(L, σ) | Verification cost: labor L against uncertainty σ
Ω(M) | Maintenance: hardware amortization and model retraining

System survives only if U_net(t) > 0. The stranded asset cascade (Section II) adds a time-indexed debt service term D(t) to the cost side, further compressing the window of viability for organizations carrying hardware debt against accelerating obsolescence.
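A minimal sketch of the net-utility bookkeeping; every functional form and number below is an illustrative placeholder for the terms defined in the table above:

```python
def u_net(gross_value, energy_kwh, price_per_kwh, pue,
          verification_labor_cost, maintenance_cost, debt_service=0.0):
    """U_net(t) = V(A, D) - [Phi(E) + Psi(L, sigma) + Omega(M)] - D(t)."""
    phi = energy_kwh * price_per_kwh * pue
    return gross_value - (phi + verification_labor_cost + maintenance_cost) - debt_service

# Illustrative monthly figures for a single deployment (all assumed):
print(u_net(gross_value=100_000, energy_kwh=50_000, price_per_kwh=0.12, pue=1.3,
            verification_labor_cost=40_000, maintenance_cost=25_000, debt_service=30_000))
# Negative: the deployment clears its operating costs but fails once debt service is added.
```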

6.2 The Trust-Autonomy Duality

Autonomy and reliability are coupled, not independent:

\frac{dA}{dt} = k_1(V) - k_2(\sigma)

\frac{d\sigma}{dt} = \alpha_c(n) \cdot D_{KL}(t) - \beta(H)

As D_KL grows (model contamination) and α falls (authority decoupling), σ increases without bound. Autonomy growth dA/dt is eventually overwhelmed by entropy growth.
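A minimal Euler-step sketch of the coupled dynamics; every coefficient, initial value, and growth rate is an assumption made only to show the qualitative behavior (σ compounding until it overwhelms autonomy growth):

```python
def simulate(steps=50, dt=1.0):
    """Euler integration of dA/dt = k1(V) - k2(sigma) and
    dsigma/dt = a_c * D_KL(t) - beta_H, with assumed constant coefficients."""
    A, sigma = 1.0, 0.1                 # initial autonomy and uncertainty (illustrative)
    d_kl = 0.05                         # initial contamination divergence (assumed)
    k1_V, k2, a_c, beta_H = 0.5, 0.4, 1.0, 0.02
    history = []
    for _ in range(steps):
        dA = k1_V - k2 * sigma          # autonomy gain minus uncertainty drag
        dsigma = a_c * d_kl - beta_H    # contamination-driven entropy minus human correction
        A = max(0.0, A + dA * dt)
        sigma += dsigma * dt
        d_kl *= 1.05                    # D_KL compounds as synthetic data accumulates
        history.append((round(A, 2), round(sigma, 2)))
    return history

print(simulate()[-5:])   # autonomy rises, stalls, then collapses toward zero as sigma compounds
```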

6.3 The Two Interacting Entropic Systems

Two distinct decay processes interact multiplicatively:

System | Mechanism | Metric
Agentic Entropy ΔS_agent | Agents optimize for local correctness, eroding global architectural intent | Stochastic drift: local success masks global failure
Cognitive Debt ΔD_cog | Human supervisors lose the system-level mental model as AI velocity exceeds comprehension bandwidth | Oversight collapse: loss of capacity to detect the next wave of entropy
Interaction | ΔS_agent increases opacity → deepens ΔD_cog → prevents detection of the next ΔS_agent | Non-linear amplification: each system’s failure accelerates the other

The interaction term is multiplicative, not additive. This follows directly from the v_c framework in Section 5: undetected error accumulation = errors generated × (1 − correction probability). Correction probability v_c degrades as cognitive debt ΔD_cog increases. Therefore:

\text{Undetected accumulation} \propto \Delta S_{agent} \times \Delta D_{cog}

This is a mathematical identity given those definitions — not an empirical assertion. If either factor is zero (no entropy generated, or full correction capacity intact), the compound failure mode does not occur. Additive formulations lack this property: they produce nonzero total entropy even when one system is at zero, which is not consistent with the coupling mechanism described above.

The same multiplicative structure appears across the complex-systems failure literature: Reason’s Swiss Cheese Model (1990), Perrow’s Normal Accidents (1984), and Shannon’s noisy channel (error rates compounding multiplicatively through chained channels). All share the same underlying form: hazard introduction rate multiplied by the probability of escaping detection, a multiplicative structural identity.

\Delta S_{total} = \Delta S_{agent} \times \Delta D_{cog}

As agentic entropy makes the system more complex and opaque, the cognitive debt of the verifier deepens. Deepened cognitive debt prevents detection and correction of the next entropy wave. The loop is self-reinforcing and accelerates without an external corrective force — which is, again, v_c: expert authority at the verification point.


VII. The Labor Value Inversion

7.1 The Counter-Narrative Emergent

The dominant public frame asserts that AI displaces knowledge workers. The model produces the opposite structural conclusion.

v_c — the only lever that breaks geometric reliability decay without proportionally scaling cost — requires three things that cannot be automated away:

Domain expertise sufficient to distinguish correct from plausible-but-wrong output. An agent cannot verify another agent’s output against ground truth it doesn’t possess. Only a human with domain expertise can close this loop.

Systems thinking sufficient to detect local correctness masking global architectural failure. This is the Cognitive Debt problem inverted: the same capability that is destroyed by AI velocity in organizations with α → 0 becomes the irreplaceable asset in organizations with α → 1.

Intuition — the pattern recognition capability that operates below the threshold of articulable rules — sufficient to identify when an output is audit-shaped but inadmissible. This is precisely what long formation in a domain builds and what no training run replicates, because it is grounded in physical and social reality, not in the token distribution of prior outputs.

These are not peripheral skills. They are the structural inputs to the only escape from R(n) = p^n.

7.2 The Empirical Confirmation

Software engineering roles in 2026 confirm the prediction. Senior engineers with systems judgment are increasing in market value. Junior developers whose primary function is producing outputs that look correct are being restructured out — replaced by agents that produce the same class of output at lower marginal cost.

This is not a talent market fluctuation. It is the labor market expressing the mathematical constraint. The model predicted it before the market showed it. The same dynamic will propagate through any domain characterized by high n (complex multi-step workflows), high D_KL risk (models operating far from verified ground truth), and currently low α (authority-expertise decoupling). Healthcare, law, engineering design, and financial analysis are the next wave.

7.3 The Formal Statement

Let V_labor be the market value of a labor input:

V_{labor} \propto v_c(\alpha, \text{expertise}, \text{systems judgment})

Labor value is a direct function of verification capability under authority. As agentic output volume increases across the economy, v_c-capable labor becomes scarcer relative to demand, increasing its price. Simultaneously, labor whose primary output is indistinguishable from agentic output — pattern-matching, first-draft generation, routine summarization — is displaced.

The inversion is not symmetric. The increase in value for v_c-capable labor is driven by the mathematical structure of the reliability problem, which gets harder as agentic deployment deepens. The displacement of substitutable labor is driven by cost pressure. Both are mandatory outcomes of the same underlying system.


VIII. Structural Conclusions

1. Reliability decay is mathematically required for any agentic pipeline of sufficient complexity. No model improvement escapes R(n) = p^n without external correction. The escape requires v_c > 0, which requires expert authority. This is a structural necessity, not a design choice.

2. Verification cost grows faster than generation cost as task complexity increases. This is the structural ceiling on agentic ROI, and it is not resolvable by scaling compute.

3. Organizational structure is a direct input to system reliability, not a management consideration adjacent to it. v_c = f(α). Authority-expertise co-location is a technical requirement derivable from the reliability model.

4. Two entropic systems interact multiplicatively. Agentic entropy and cognitive debt amplify each other non-linearly. The only corrective force is expert authority at the verification point.

5. Hardware capital is being structured on mismatched timelines. Economic obsolescence is outpacing amortization schedules at investment magnitudes ($765B+ annual AI CapEx in 2026) that will produce non-performance cascades in the hundreds of billions USD. [UNKNOWN: Precise exposure size pending debt-financing fraction data]

6. Model contamination adds a time derivative to base precision. p is not static; it degrades as synthetic training data accumulates. Reliability is a function of both pipeline complexity n and time t. [UNKNOWN: Empirical γ for production model families]

7. Labor value inverts against the dominant narrative. The mathematical structure requires expertise, systems thinking, and intuition — precisely the capabilities the displacement narrative treats as vulnerable. The market is already expressing this in software roles, and the dynamic will propagate through every high-n, high-D_KL domain.


IX. Open Variables

Unknown | Description | Resolution Path
γ | Contamination rate: the speed at which synthetic training data degrades base precision p | Longitudinal benchmark tracking against verified ground-truth datasets
Debt exposure | Precise debt-financing fraction of 2026 AI infrastructure capex and lender concentration | Financial disclosure analysis; structured finance data
v_max | Ceiling on correction probability achievable by a human expert under full authority | Empirical study of expert-in-the-loop system performance at high α
China capability trajectory | Architectural efficiency under chip access constraints; interaction with internal social contract dynamics | Partially inferable from public output (DeepSeek, Kimi efficiency gains); military/intelligence application trajectory is not resolvable from open sources

Document captures the structural model as of 2026-05-02. Mathematical additions and empirical calibration of open variables to follow as data emerges.