April 18, 2026 · Jeff Rogers

The Economics of Authored vs Inferred

Agent-first products pay to re-derive the same logic on every event. Substrate products pay once when the rule is authored. The cost gap is 15-20x at steady state and widens as the business grows.

There is a pitch going around that sounds irresistible: "Describe what you want, the agent figures out the rest. You don't have to think hard." For personal productivity — your email, your research, your calendar — that pitch is genuinely true. For business operations, it is a trap, and the trap is economic.

I want to show the math, because the cost argument is the one almost nobody is making, and it is the one that decides which of these products survive past Series A.

What actually costs tokens

In an agent-first product, every event goes through the LLM. The pattern looks like this: read the current state, reason about what the business rules are (they live in the prompt, or a retrieval pipeline, or a memory layer), pick an action, call a tool, observe the result, maybe loop. Every routing decision, every SLA comparison, every capacity check, every escalation choice re-derives the same logic from scratch.

This is the dirty secret. The LLM is doing work that is identical on every run. The rules for who gets a lead, what counts as late, when to escalate — these don't change between Tuesday morning and Tuesday afternoon. But the agent has no memory of them as structure. It has them as context, which means the model has to reason through them every single time an event fires.

In a substrate product, the picture is different. Roughly 80% of events in a typical operator tenant are deterministic — a policy matches a trigger, an action fires, a ledger entry writes. Nothing goes through an LLM. Routing is a query. SLA check is a timestamp comparison. Capacity is a count. The remaining 20% of events need perception — classify a photo, transcribe a voice note, extract fields from an inbound message — and those are narrow, bounded tasks a commodity model handles cheaply.

The rough math

Consider a 100-property residential services company doing about 150 events per day.

Agent-first:

~3 LLM calls per event (read context, decide, act)
~2,000 tokens in, ~500 tokens out per call
Reasoning-capable model (~$3 per million input, ~$15 per million output)
Daily cost: ~$6
Monthly: ~$180 per tenant

Substrate:

~0.2 LLM calls per event on average (only perception)
~500 tokens in, ~300 tokens out per call
Commodity classifier (~$0.80 per million input, ~$4 per million output)
Daily cost: ~$0.30
Monthly: ~$10 per tenant

That is a 15-20x cost gap at steady state, for the same operational throughput.

These numbers are directional, not precise. Specific models, specific prompts, specific retrieval strategies all shift the math. But the shape is robust. Any serious comparison between "LLM reasons on every event" and "LLM runs at the edges on 20% of events" produces an order-of-magnitude difference. You can argue about whether it is 8x or 30x. You cannot argue that it is comparable.

Why the gap widens

The important part isn't the steady-state number. It's what happens as the business grows.

Agent-first products scale cost with every new rule. A new policy means a longer prompt or more retrieval context. A new operator means more state to reason about. A new asset type means more nouns in every inference. Your best customers — the ones with the richest operations, the most rules, the most history — are also your most expensive to serve. The unit economics degrade with customer engagement, which is the opposite of what you want in SaaS.

Substrate products hold cost flat as rules grow. Adding a policy is inserting a row in a database. The 1000th policy evaluates in the same time as the 1st. Your best customers are your most profitable, because every rule they author is a sunk cost for the vendor and a durable asset for the customer. The moat compounds and the margin compounds.

This is why agent-first products end up with throttling, rate limits, per-action pricing, or "enterprise" tiers that are really cost-recovery mechanisms. The Manus team was explicit about hitting these walls in public. Any agent-first product at scale has either priced in the margin erosion or is about to discover it.

The hallucination story is the same argument, in risk form

Cost is not just tokens. It is blast radius when the LLM is wrong.

In an agent-first system, a hallucinated routing decision costs you a misassigned operator, a late response, a missed SLA, a lost customer. The LLM's error became a decision the business acted on before anyone reviewed it.

In a substrate system, the LLM only classifies inputs. If it gets one wrong — tags a photo incorrectly, extracts the wrong field from a message — the policy refuses to fire. The error is caught at the next gate, before it becomes a decision. The cost of being wrong is bounded because the authored layer is the one acting, not the model.

Where the "don't think hard" pitch is actually true

To be fair to the agent-first argument: for personal productivity, it wins. One person, one-off tasks, no compliance requirements, no team to keep consistent. "Help me with my email this afternoon" does not need an authored substrate. The agent's variability is not a bug — it is a feature, because what you want is help, not enforcement.

The error is extending that pitch to business operations. The moment more than one person has to run the same process, or the same customer expects the same experience regardless of who is on shift, or a regulator expects a consistent record, or an investor expects predictable cost per customer — the agent's "don't think hard" becomes "nobody thinks clearly." Every invocation re-derives the business from scratch, inconsistently, at full reasoning-model cost.

A business is not "my personal inbox at scale." A business is a thing that has to run the same way when the owner is not there, when an employee is new, when the model hallucinates, when the customer tries to sue. That is not what agents are good at. It is what authored substrates are good at.

The real pitch

The owner authoring policies in Runbook is not doing more thinking than they were doing before. They are doing the same thinking, in a place where it stops being repeated.

They were already telling new hires "don't schedule a showing without a pre-approval." They were already texting their team "remember to log the close count tonight." They were already explaining the same rule to the fifth employee in a year. That is policy authoring — it is just happening verbally, ephemerally, redundantly, at full cognitive cost every time.

Runbook lets them type the sentence once and be done. The ongoing cost — of thinking, of reminding, of re-training — drops to near zero.

This is the same pattern as the LLM cost argument, restated in cognitive terms. Agents distribute the thinking across time and people and cost it each time. Substrates capture the thinking once and evaluate it for free ever after.

Close

Agent-first products are in a race with their own unit economics. Every customer who gets more value from the product makes the product more expensive to run. The better it is, the worse the margins. That is not a business — it is a treadmill.

Substrate products have the opposite shape. Every customer who gets more value authors more rules, which costs the vendor nothing more and makes the customer stickier. Every new rule deepens the moat instead of eroding the margin.

The cost argument is not a rounding error. It is the difference between a business that survives and one that has to raise another round every eighteen months to keep paying the model provider. Builders paying attention to this right now are quietly choosing the substrate path because they ran the math and did not like what they saw.

Intelligence is expensive. Structure is cheap. The cheapest product is the one that only calls the model when nothing else can do the job — and that product is a rules engine with AI at the edges, authored by the owner, persistent, inspectable, and almost free to run.

This post is a companion to Architect Mode Needs a Substrate and Most AI Transformations Are Speeding Up the Wrong Thing. Together they argue the same thing from three directions: substrate (architecture), process (what dissolves), and cost (what it means for margin).