The build-versus-buy decision for AI is usually presented as a technology question and is almost never one. It is a question about where your durable competitive advantage actually sits and whether a given capability sits on top of it or beside it. Mid-market enterprises, with revenues between roughly 100m and 2bn, get this wrong in two predictable directions: they build the things they should have bought, because building feels strategic, and they buy the things they should have built, because the vendor demo was good. Both errors are expensive and both are avoidable with a framework applied before anyone writes a line of code or signs a contract.
The rule we give clients is short: buy the boring, build the differentiating. Boring does not mean unimportant. Document extraction, transcription, translation, generic chat interfaces and standard classification are boring in the precise sense that doing them slightly better than a competitor confers no advantage your customers will ever notice. Differentiating means the opposite: a capability where being meaningfully better changes what you can charge, who you can serve, or how fast you move. Almost everything in the first category should be bought. A small number of things in the second category justify building. The art is in the classification, and most of the cost is in getting it wrong.
The four-question test
Before any build-versus-buy decision, we ask four questions in order. Does this capability touch our proprietary data or process in a way a vendor cannot replicate? Would being materially better at this change our economics or our competitive position? Can we realistically staff and sustain it for three years, not just ship a first version? Is there a credible vendor whose product already does 80 percent of what we need? The pattern of answers decides the call far more reliably than any feature comparison.
If the first two answers are no, buy, regardless of how interesting the engineering looks. If the first two are yes and the fourth is no, build, because nothing off the shelf will capture the advantage. The genuinely hard cases are where the first two are yes and a credible vendor exists. There, the deciding question is the third: sustainability. A capability you can build but not sustain is worse than one you buy, because you will end up maintaining a bespoke system with a team that has moved on, which is the most expensive position of all.
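The rules above are compact enough to write down as a function. What follows is a minimal sketch of one reading of them, not a tool we ship: the names Answers and recommend are hypothetical, and two branches are our assumptions rather than rules stated in the text. The first is that a sustainable hard case resolves to build; the second is that mixed answers to the first two questions are left as a judgement call, because the test gives no rule for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answers:
    """Yes/no answers to the four questions, in the order they are asked."""
    touches_proprietary_edge: bool   # Q1: proprietary data or process a vendor cannot replicate
    changes_economics: bool          # Q2: being materially better changes economics or position
    sustainable_three_years: bool    # Q3: can be staffed and sustained for three years
    credible_vendor_exists: bool     # Q4: a vendor already does about 80 percent of the job


def recommend(a: Answers) -> str:
    """Return 'buy', 'build' or 'judgement call' for one capability."""
    # First two answers are no: buy, however interesting the engineering looks.
    if not a.touches_proprietary_edge and not a.changes_economics:
        return "buy"

    differentiating = a.touches_proprietary_edge and a.changes_economics

    # First two yes, no credible vendor: build, nothing off the shelf captures the advantage.
    if differentiating and not a.credible_vendor_exists:
        return "build"

    # Hard case: differentiating and a vendor exists. Sustainability decides.
    # Assumption: a sustainable hard case resolves to build.
    if differentiating and a.credible_vendor_exists:
        return "build" if a.sustainable_three_years else "buy"

    # Assumption: mixed answers to Q1 and Q2 are not covered by the rules above.
    return "judgement call"


# Example: differentiating, vendor exists, but the team cannot sustain it.
print(recommend(Answers(True, True, False, True)))
```

Writing the rules out this way makes the uncovered cases visible, which is useful in itself: a capability that lands on the judgement-call branch deserves a conversation, not a default.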
Note what this framework deliberately ignores: the relative cost of the initial build versus the first year of licence fees. That comparison is where most business cases start and it is close to irrelevant, because it captures the smallest part of the real cost on either side.
The total-cost-of-ownership traps
Build costs are understated because teams price the first version and ignore the rest. The model you build against will be deprecated, and re-validating against its replacement is recurring work, not a one-off. The evaluation harness, the monitoring, the access controls and the incident process are not optional extras; they are the majority of the lifetime cost, and they do not appear in the prototype that won the funding. As a planning figure, the build is rarely cheaper than its first estimate and the ongoing run cost frequently exceeds the original build cost within two years.
Buy costs are understated differently. The licence is visible; the integration, the data preparation, the change management and the lock-in are not. Per-seat or per-call pricing that looks reasonable at pilot scale can become the largest line in the budget at full deployment, and you discover this after the workflow depends on it. Before signing, model the cost at full production volume, not pilot volume, and read the exit terms: what happens to your data, your prompts and your fine-tuning when you leave, and how long migration would take. A vendor who cannot answer the exit question cleanly is quoting you a higher price than the one on the contract.
The honest version of both estimates usually narrows the apparent cost gap and shifts the decision back to where it belongs: strategic fit rather than first-year price.
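The arithmetic behind both traps is simple enough to sketch. Every figure below is hypothetical and chosen only to illustrate the shape of the problem: the per-call price, the call volumes, the integration cost and the build and run costs are assumptions, not benchmarks. The function names buy_cost and build_cost are ours.

```python
def buy_cost(calls_per_month: float, price_per_1k_calls: float,
             integration_one_off: float, months: int) -> float:
    """Licence spend at a given volume plus one-off integration cost."""
    monthly_licence = calls_per_month / 1000 * price_per_1k_calls
    return integration_one_off + monthly_licence * months


def build_cost(initial_build: float, run_cost_per_year: float, years: int) -> float:
    """First version plus the recurring cost of running and re-validating it."""
    return initial_build + run_cost_per_year * years


# Hypothetical inputs, in arbitrary currency units.
PRICE_PER_1K_CALLS = 20
INTEGRATION = 150_000
PILOT_CALLS = 50_000            # calls per month during the pilot
PRODUCTION_CALLS = 5_000_000    # calls per month at full deployment

pilot_3y = buy_cost(PILOT_CALLS, PRICE_PER_1K_CALLS, INTEGRATION, months=36)
production_3y = buy_cost(PRODUCTION_CALLS, PRICE_PER_1K_CALLS, INTEGRATION, months=36)
build_3y = build_cost(initial_build=400_000, run_cost_per_year=250_000, years=3)

print(f"buy, pilot volume, 3 years:      {pilot_3y:,.0f}")
print(f"buy, production volume, 3 years: {production_3y:,.0f}")
print(f"build, 3 years:                  {build_3y:,.0f}")
```

With these invented inputs the licence looks trivial at pilot volume and dominates at production volume, while the build's run cost passes its initial cost inside two years. The point is not the numbers but the habit: run the model at full volume and over three years before comparing anything.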
The middle path most teams miss
Build-versus-buy is presented as binary and rarely is. The most defensible position for mid-market firms is to buy the foundation and build the thin layer where the advantage lives. Buy the model, the vector store, the orchestration and the monitoring as commodity infrastructure. Build only the retrieval logic, the domain prompts, the evaluation set and the workflow integration that encode your specific process and data. This is building in the sense that matters: your advantage lives in your own code, while the heavy, undifferentiated infrastructure is left to vendors whose entire business is maintaining it.
This composition keeps the surface area you must sustain small enough to actually sustain. A two-to-four-person team can own a thin differentiating layer for years. The same team cannot own a full stack, and the firms that tried are the ones now running bespoke infrastructure they cannot afford to maintain or replace. Decide what you are building at the layer of the advantage, not the layer of the whole system, and the sustainability question, which is usually the one that decides the case, becomes answerable.
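In code, the split between bought foundation and owned layer can be as plain as two interfaces and one small class. The sketch below is illustrative only: BoughtModel, BoughtVectorStore and DomainAssistant are hypothetical names, the method signatures are assumptions rather than any vendor's real API, and the two stub classes exist only so the example runs without a vendor account.

```python
from typing import Protocol, Sequence


class BoughtModel(Protocol):
    """Whatever the model vendor provides; we depend only on this shape."""
    def complete(self, prompt: str) -> str: ...


class BoughtVectorStore(Protocol):
    """Whatever the vector store vendor provides."""
    def search(self, query: str, k: int) -> Sequence[str]: ...


class DomainAssistant:
    """The thin layer we own: the domain prompt and the retrieval logic."""

    PROMPT = (
        "Answer using only the passages below.\n"
        "Passages:\n{context}\n"
        "Question: {question}\n"
    )

    def __init__(self, model: BoughtModel, store: BoughtVectorStore, k: int = 3) -> None:
        self.model = model
        self.store = store
        self.k = k

    def answer(self, question: str) -> str:
        passages = self.store.search(question, k=self.k)
        prompt = self.PROMPT.format(context="\n".join(passages), question=question)
        return self.model.complete(prompt)


# Stand-ins for the bought components, for local runs and tests only.
class StubModel:
    def __init__(self) -> None:
        self.last_prompt = ""

    def complete(self, prompt: str) -> str:
        self.last_prompt = prompt
        return "stub answer"


class StubStore:
    def __init__(self, docs: Sequence[str]) -> None:
        self.docs = list(docs)

    def search(self, query: str, k: int) -> Sequence[str]:
        return self.docs[:k]  # no real ranking; a vendor product would do this


model = StubModel()
assistant = DomainAssistant(model, StubStore(["policy A", "policy B"]), k=1)
print(assistant.answer("Which policy applies?"))
```

Because the owned class depends only on the two interfaces, swapping a vendor means writing a new adapter, not rewriting the layer where the advantage lives. That is also what keeps the exit question from the cost section answerable.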
