A project team measures the lift it observed. A CFO has to measure the cashable change to the company's financial position, net of cost and counterfactual, in a way that survives people whose job is to disbelieve it. Most AI ROI claims are built with the first discipline when they need the second.
Teams measure what they can see, a demo-condition lift, and skip the hard parts: a pre-registered baseline, the counterfactual, fully-loaded run-cost, and the translation from an operational metric to a cashable line on the income statement.
Funded initiatives that genuinely work get defunded because they were reported as an expense rather than as unit economics that beat the human baseline. Capital is misallocated toward the loudest demo rather than the best return.
The recurring cost of production AI (inference, retrieval, review, monitoring) is real and compounds. Without a run-cost model, you discover at full volume that the unit economics never held, after the workflow already depends on the system.
A one-off spreadsheet ROI model is normal, and its limits are predictable: the baseline is missing or measured differently from the post-AI number; the counterfactual is ignored, so the whole delta is wrongly attributed to AI; run-cost is understated; and time saved is counted as cash even when it is reabsorbed into the working day with no headcount or revenue consequence.
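The last of those limits is easy to show with arithmetic. The sketch below is illustrative only: the `TimeSaving` fields, the `cashable_fraction` parameter and every figure are hypothetical, not numbers from any real deployment. It separates the gross value of saved hours from the share that actually has a headcount or revenue consequence.

```python
from dataclasses import dataclass


@dataclass
class TimeSaving:
    """Hypothetical inputs; none of these figures come from a real case."""
    hours_saved_per_year: float
    loaded_hourly_cost: float   # salary plus benefits and overhead
    cashable_fraction: float    # share of saved hours that removes cost or adds revenue


def gross_saving(s: TimeSaving) -> float:
    """Every saved hour counted as cash, which overstates the benefit."""
    return s.hours_saved_per_year * s.loaded_hourly_cost


def cashable_saving(s: TimeSaving) -> float:
    """Only the share of saved time with a headcount or revenue consequence."""
    if not 0.0 <= s.cashable_fraction <= 1.0:
        raise ValueError("cashable_fraction must be between 0 and 1")
    return gross_saving(s) * s.cashable_fraction


example = TimeSaving(hours_saved_per_year=10_000,
                     loaded_hourly_cost=60.0,
                     cashable_fraction=0.25)
print(gross_saving(example))     # 600000.0
print(cashable_saving(example))  # 150000.0
```

With these assumed inputs, three quarters of the reported saving is reabsorbed into the working day and never reaches the income statement.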
A finance-grade method: pre-register the baseline using the exact metric you will report, control the counterfactual with a holdout where affordable, load the full recurring run-cost, and report cost per unit of business value against the human baseline. State a defensible range, not a false-precision figure.
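One common way to control the counterfactual with a holdout is a difference-in-differences calculation. The method above does not prescribe a specific estimator, so treat this as a minimal sketch under that assumption. The function name and the sample figures (tasks resolved per agent per day) are hypothetical.

```python
def attributable_lift(treated_before: float, treated_after: float,
                      holdout_before: float, holdout_after: float) -> float:
    """Change in the AI-assisted group minus the change the holdout saw anyway.

    All four inputs must use the same pre-registered metric.
    """
    treated_change = treated_after - treated_before
    counterfactual_change = holdout_after - holdout_before
    return treated_change - counterfactual_change


# Hypothetical figures: tasks resolved per agent per day.
naive_delta = 26.0 - 20.0                             # 6.0, credits everything to AI
net_lift = attributable_lift(20.0, 26.0, 20.0, 22.0)  # 4.0, after netting out the holdout
print(naive_delta, net_lift)
```

In this illustration a third of the observed delta would have happened without the AI system, so only the remaining 4.0 is a candidate for attribution.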
We bring the method and the model. Our partners baseline, attribute and net out run-cost so the claim survives your audit committee, and we report cost per resolved task, not cost per token. The free ROI calculator gives you the headline today; we make it board-grade.
A project-level lift measures a controlled improvement, not the company's financial position net of cost and counterfactual. That gap is exactly what an audit committee will challenge.
Widen the confidence interval, name the confounders you could not control, and plan on the conservative end. A defensible range beats a fragile point estimate that collapses on challenge.
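A percentile bootstrap is one simple way to turn a point estimate into a range. This is a sketch, not a prescribed procedure: the function name, the 90% level and the sample data are all assumptions made for illustration, and it covers sampling noise only, not the confounders you could not control.

```python
import random
import statistics


def bootstrap_interval(treated, holdout, n_resamples=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for mean(treated) - mean(holdout)."""
    rng = random.Random(seed)  # fixed seed so the range is reproducible
    diffs = []
    for _ in range(n_resamples):
        t = rng.choices(treated, k=len(treated))   # resample with replacement
        h = rng.choices(holdout, k=len(holdout))
        diffs.append(statistics.fmean(t) - statistics.fmean(h))
    diffs.sort()
    tail = (1.0 - level) / 2.0
    lower = diffs[round(tail * n_resamples)]
    upper = diffs[round((1.0 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical daily tasks resolved per agent in each group.
treated = [26, 24, 27, 25, 28, 23, 26, 27]
holdout = [22, 21, 23, 20, 22, 24, 21, 22]

point_estimate = statistics.fmean(treated) - statistics.fmean(holdout)
lower, upper = bootstrap_interval(treated, holdout)
planning_value = lower  # plan on the conservative end of the range
print(f"point {point_estimate:.2f}, range {lower:.2f} to {upper:.2f}")
```

Reporting `lower` and `upper` alongside the point estimate, and budgeting against `planning_value`, keeps the claim standing when someone challenges the headline figure.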
Fully-loaded run-cost at full volume. The token bill is usually the smallest part, behind retrieval, human review, monitoring and maintenance.
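To make the difference between the token bill and the fully-loaded figure concrete, here is a minimal sketch. Every cost line, the volume and the human baseline are hypothetical placeholders; substitute your own ledger figures.

```python
def cost_per_resolved_task(monthly_costs: dict[str, float], tasks_resolved: int) -> float:
    """Total recurring run-cost divided by tasks actually resolved."""
    if tasks_resolved <= 0:
        raise ValueError("tasks_resolved must be positive")
    return sum(monthly_costs.values()) / tasks_resolved


# Hypothetical monthly run-cost at full volume, in one currency.
monthly_costs = {
    "inference": 4_000.0,      # the token bill
    "retrieval": 6_000.0,
    "human_review": 18_000.0,
    "monitoring": 5_000.0,
    "maintenance": 7_000.0,
}
tasks_resolved = 40_000        # resolved tasks, not attempted ones
human_baseline = 2.50          # assumed human cost per resolved task

token_only = monthly_costs["inference"] / tasks_resolved
fully_loaded = cost_per_resolved_task(monthly_costs, tasks_resolved)
print(token_only, fully_loaded, human_baseline)  # 0.1 1.0 2.5
```

In this made-up case the token-only figure understates the real unit cost tenfold, yet the fully-loaded figure still beats the assumed human baseline, which is the comparison a finance reviewer will ask for.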