When an enterprise AI programme stalls, the post-mortem usually reaches for the model: the wrong one was chosen, the prompts were weak, the vendor underdelivered. In our experience the real cause is almost always upstream of the model entirely. The system could not get reliable, timely, governed access to the data it needed, because that data lived in systems built years or decades before anyone intended to read them programmatically at scale. The model was never the bottleneck. The thirty-year-old policy administration system with no API was the bottleneck, and no amount of model selection fixes that.
This is uncomfortable because it reframes an AI programme as, in large part, a modernisation programme. The exciting work of choosing and tuning models sits on top of the unglamorous work of making enterprise data accessible, current, and trustworthy through interfaces something else can consume. Organisations that skip the second and fund only the first produce a string of impressive prototypes on extracted sample data that cannot be productionised, because production requires the live data the prototype never had to touch. The modernisation is not a precondition you can defer. It is the project.
Why AI raises the stakes on data access
Legacy integration was always painful, but its consequences were contained. A nightly batch export feeding a quarterly report could be a day stale and slightly wrong, and the damage was a footnote. The same data, read live by an AI system that customers or clinicians interact with in real time, turns staleness and error into confidently delivered wrong answers at the moment of decision. The tolerance for bad data access collapses precisely as AI makes the data more consequential, and systems that were tolerable as analytical sources become liabilities as AI sources.
The other shift is from periodic to continuous and from human-mediated to machine-mediated. A human analyst pulling a report applies judgement, notices when a number looks wrong, and silently compensates for the system's quirks. An AI system reading the same source has none of that judgement and will faithfully propagate whatever the source hands it. This is why 'we already have integrations' is rarely the reassurance teams think it is: those integrations were built for human-supervised, latency-tolerant reporting, and AI needs the opposite, which is machine-consumable, low-latency, contract-backed access that fails loudly rather than quietly.
The three modernisation moves that actually unblock AI
The modernisation work that matters reduces to three moves, in rough order of leverage. The first is APIs over systems of record: a stable, documented, access-controlled interface in front of the legacy system, so consumers stop reaching into its internals and start calling a contract that can be versioned and governed. This is frequently the highest-leverage single investment, because it decouples everything downstream from the legacy system's schema and lets you modernise behind the interface without breaking consumers.
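To make the decoupling concrete, here is a minimal sketch of the translation layer such an API might sit on. Everything named in it is an assumption for illustration: the legacy column names (POL_NO, HLDR_NM, EFF_DT, STAT_CD), the status codes, and the PolicyV1 contract are hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical raw row, shaped the way a legacy policy system might return it.
LEGACY_ROW = {
    "POL_NO": "000123",
    "HLDR_NM": "SMITH, J ",
    "EFF_DT": "19980401",
    "STAT_CD": "A",
}

# Hypothetical mapping from legacy status codes to contract values.
_STATUS = {"A": "active", "L": "lapsed", "C": "cancelled"}


@dataclass(frozen=True)
class PolicyV1:
    """Versioned contract exposed to consumers (illustrative field names)."""
    policy_id: str
    holder_name: str
    effective_date: date
    status: str


def to_policy_v1(row: dict) -> PolicyV1:
    """Translate one legacy row into the v1 contract.

    Unknown status codes raise instead of being passed through, so a
    consumer never receives a value the contract does not define.
    """
    code = row["STAT_CD"]
    if code not in _STATUS:
        raise ValueError(f"unknown legacy status code: {code!r}")
    raw = row["EFF_DT"]  # assumed to be a YYYYMMDD string
    return PolicyV1(
        policy_id=row["POL_NO"],
        holder_name=row["HLDR_NM"].strip(),
        effective_date=date(int(raw[:4]), int(raw[4:6]), int(raw[6:8])),
        status=_STATUS[code],
    )
```

Consumers depend only on PolicyV1. If the legacy table is later renamed or replaced, only to_policy_v1 has to change, and a breaking change to the contract itself would ship as a new version alongside the old one.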
The second is data products: curated, owned datasets with a documented schema, an owner, a known consumer set, and a service level, published deliberately rather than scraped from a pipeline nobody maintains. A data product is a contract; a pipeline is just a one-way flow from a source you do not control, with no promise to the consumer about what arrives or when. AI systems need contracts, because they need to know the data will be there, current, and shaped as promised, and to be alerted when that breaks rather than discovering it through fabricated output. The third is event streams: publishing meaningful business events as they happen, so that AI systems can react to current reality instead of polling a stale snapshot, which is what most real-time AI use cases actually require underneath the demo.
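One way to picture the difference between a contract and a pipeline is a check that runs before any consumer sees the data. The sketch below is illustrative only: DataProductContract, ContractViolation, and check_batch are hypothetical names, and a real data product would carry a richer schema and service level than a field list and a staleness limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


class ContractViolation(Exception):
    """Raised when a published batch breaks the data product's promises."""


@dataclass(frozen=True)
class DataProductContract:
    name: str
    owner: str
    required_fields: frozenset
    max_staleness: timedelta  # the freshness part of the service level


def check_batch(contract, rows, published_at, now):
    """Return rows unchanged if the batch is fresh and correctly shaped.

    Raises ContractViolation otherwise, so a stale or malformed batch
    stops at the boundary instead of reaching an AI consumer.
    """
    age = now - published_at
    if age > contract.max_staleness:
        raise ContractViolation(
            f"{contract.name}: data is {age} old, limit is {contract.max_staleness}"
        )
    for i, row in enumerate(rows):
        missing = contract.required_fields - set(row)
        if missing:
            raise ContractViolation(
                f"{contract.name}: row {i} is missing {sorted(missing)}"
            )
    return rows


# Usage with made-up values: a batch published five minutes ago passes.
contract = DataProductContract(
    name="claims_daily",
    owner="claims-data-team",
    required_fields=frozenset({"claim_id", "amount"}),
    max_staleness=timedelta(hours=1),
)
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = check_batch(
    contract,
    [{"claim_id": "c1", "amount": 10}],
    published_at=now - timedelta(minutes=5),
    now=now,
)
```

The design choice worth noting is that the check raises rather than logging and continuing. The owner gets an alert and the consumer gets no data, which is usually preferable to a model answering from a batch that silently broke its promises.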
Sequencing the work so it pays its own way
The objection to all of this is that modernisation is a multi-year programme and the business wants AI this year. The resolution is to sequence by use case rather than attempting to modernise everything first, which is how these programmes die of their own ambition. Pick the first genuinely valuable AI use case, identify the specific data it needs, and modernise exactly that access path: the one API, the one data product, the one event stream that unblocks it. The AI use case funds and justifies the modernisation, and the modernisation outlives the use case as reusable infrastructure the next one inherits.
Done this way the two programmes reinforce each other instead of competing for budget. Each AI use case pulls a slice of modernisation into existence and pays for it with a concrete result, and over a few cycles the organisation accumulates a genuinely modern, contract-backed data layer, built incrementally and justified at every step. The alternative we see repeatedly is a big-bang modernisation with no use case attached, which loses funding the moment budgets tighten, and a parallel stream of AI prototypes that never ship because the data layer they needed was the thing that got cut. Tie them together and both survive. Keep them apart and you tend to get neither.
