While I hate saying ‘I told you so,’ I’m going to say it anyway: I told you so.
What did I tell you? Year after year, we’ve been saying how important it is to manage, classify, and govern your data so you can drive business value from it.
And now? With your board asking, ‘Why aren’t we diving into AI?’ and ‘How come this AI project seems to be stalling?’, you’re remembering all those times your proposed data projects were rejected, priorities shifted, or those involved simply ran out of time, enthusiasm, and resources.
Which now leaves you with AI projects that struggle to scale and deliver the ROI you were promised. While it’s easy to blame the AI initiative and execution for this failure to deliver, the problem is rarely the AI models; it’s almost always your data.
During our recent AI roadshow, attendees from Christchurch to Adelaide all echoed the same frustrations. They described proofs of concept that dazzled in the lab but fizzled when exposed to the complexity of the wider enterprise. The recurring culprits were depressingly familiar: fragmented data living in incompatible systems, inconsistent quality that erodes trust, and governance gaps that leave their compliance officers nervous. Now, these issues aren’t new. They plagued business-intelligence projects ten years ago and digital transformation programmes five years ago. AI simply magnifies the cost of ignoring them.
So, what data ducks need to be in a row in the here and now?
Data fragmentation is the first hurdle. For example, you may have separate versions of each customer in CRM, ERP, and a legacy logistics database. When a machine-learning model tries to predict churn or recommend cross-sell offers, it can’t reconcile “Jane White” in one application with “J. White” in another. The insight engine stalls before it leaves the gate.

Next comes quality. Duplicated, incomplete, or outdated records create an environment where no one is certain which numbers to trust. At human speed, analysts can spot and correct errors. At machine speed, flawed inputs turbocharge flawed outputs.

Finally, governance (or the lack thereof) turns minor data problems into enterprise risks. When no reliable catalogue shows where personally identifiable information lives, or when role-based access rules are patchy, an innovative use case can mutate into a damning privacy headline overnight.
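To make the “Jane White” versus “J. White” problem concrete, here is a minimal, purely illustrative sketch of the record-linkage step a model would need before it could count those two entries as one customer. The names, threshold, and matching rules are hypothetical; real entity resolution would also draw on addresses, emails, and deterministic keys, typically via a dedicated platform rather than a few lines of standard-library Python.

```python
# Illustrative sketch: deciding whether two customer names from
# different systems probably refer to the same person.
from difflib import SequenceMatcher

def normalise(name: str) -> str:
    # Lower-case and strip punctuation so "J. White" becomes "j white".
    return name.lower().replace(".", "").strip()

def likely_same_customer(a: str, b: str, threshold: float = 0.6) -> bool:
    a, b = normalise(a), normalise(b)
    tokens_a, tokens_b = a.split(), b.split()
    # Treat a single-letter token as an initial that can match any
    # token starting with that letter ("j" matches "jane").
    if len(tokens_a) == len(tokens_b) and all(
        ta == tb
        or (len(ta) == 1 and tb.startswith(ta))
        or (len(tb) == 1 and ta.startswith(tb))
        for ta, tb in zip(tokens_a, tokens_b)
    ):
        return True
    # Fall back to a fuzzy string-similarity score.
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(likely_same_customer("Jane White", "J. White"))      # True
print(likely_same_customer("Jane White", "Robert Brown"))  # False
```

Even this toy version hints at the real difficulty: every rule (initials, thresholds, punctuation) is a judgment call, and getting those calls consistent across CRM, ERP, and legacy systems is exactly the governance work the rest of this article argues for.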
Yet there’s some good news when it comes to AI initiatives and data - you don’t need an immaculate data estate to achieve meaningful AI returns. The secret is to tie each initiative to a tightly scoped use case with clear, achievable data requirements.
For example, a construction company can use a generative AI model to analyse architectural drawings. AI will identify every door, window, and truss, producing an 80%-accurate bill of materials in minutes. Because the required data is contained in the drawings themselves, the firm can bypass a multi-year data-cleansing exercise. A quantity surveyor reviews and amends the output, the model learns from the corrections, and the business banks immediate savings.
A second example is a retail bank that automates a large portion of its mortgage-approval workflow by allowing an AI agent to ingest passports, driver's licences, and six months of bank statements. The model classifies income and expenses, flags anomalies, and proposes an affordability score. A credit officer makes the final decision, but what once took days of manual review now happens in under an hour. Again, the scope is tightly controlled, the data set is finite, and the return is undeniable.
These quick wins serve a strategic purpose beyond their direct financial impact: they surface the governance conversations you need to address before you can move on to more ambitious goals. When your leaders see a path to value, they’re more willing to invest in the less glamorous foundations - cataloguing data sources, standardising definitions, and instituting the consistent security controls needed for the next wave of innovation.
That next wave comes when the business wants a holistic view of the customer, a predictive maintenance engine that covers an entire fleet, or an AI-driven close process across all legal entities.
It’s at this stage that “good enough” data hygiene is no longer enough (and here you can really take my “I told you so” comment to heart). Your enterprise has to agree on a single source of truth for customers, products, and assets; maintain a searchable inventory that flags sensitive fields and lineage; and enforce role-based access so that both humans and AI agents see only what they are entitled to see. Equally important, you must embed “trustworthy AI” principles, such as fairness, explainability, and transparency, from the first line of code to the last mile of deployment. Why? Because regulators won’t care whether bias or leakage was accidental, and your shareholders won’t care whether a reputational hit came from a third-party model or one you built yourself.
We know that some executives worry that such governance demands will slow innovation. But in practice – and in our experience – the opposite occurs. When data is catalogued and access rules are unambiguous, your teams spend less time negotiating for permissions and more time building solutions. A clear framework also supports the “fail fast” culture that digital leaders prize. If a new idea proves unworkable, you find out early – before you’ve sunk time, money, and effort into cleaning data that the use case never needed.
The path forward? We suggest a two-speed approach.
In the near term, sponsor compact, high-value pilots that rely on data you already trust. Measure the value in hard currency - hours saved, revenue captured, risk avoided - and communicate the results widely. In parallel, invest in the structural work that will let you scale. This involves appointing data stewards for each domain, funding a modern catalogue and classification platform, and aligning leadership incentives with clearly defined “data readiness” targets. Treat those targets with the same seriousness you apply to cost of capital or your net-promoter scores; they are becoming just as decisive for enterprise value.
Artificial intelligence is no longer a science project. It’s the real deal. The technology works, commercial tools abound, and your competitors are moving forward – even if you aren’t. The differentiator now is operational discipline - having your data ducks in a row, not strewn across silos; embedding governance into design, not bolting it on after a headline; identifying use cases where value is provable in quarters, not years.
Those who delay will watch their competitors streak ahead; those who master these data basics will convert AI hype into profit. And at that point, I’ll be happily (if a little smugly) saying, “I told you so - I knew sorting out your data would pay off!”