Enterprise AI/ML Needs Data Management. And Vice Versa.

Artificial intelligence (AI) and machine learning (ML) are powering the digital transformations happening in every industry around the world. AI/ML is critical in discovering new therapies in life sciences, reducing fraud and risk in financial services, and delivering truly personalized customer experiences. 

Enterprise AI/ML Needs Data Management

The success of AI depends on the effectiveness of the models that data scientists design to train and scale it. And the success of those models depends on the availability of trusted and timely data. Steve “Guggs” Guggenheimer, Microsoft’s corporate vice president of AI, succinctly captured this relationship between data and AI recently at our 20th Informatica World®: “You can’t have intelligent AI conversations until you first have intelligent data conversations.” This view has been echoed widely, including in Forbes and Harvard Business Review.


Why do data scientists building AI/ML models need high-quality data? As an example, consider a prediction model that anticipates a consumer’s behavior. A valuable feature for such a model could be consumer location as indicated by the postal ZIP code. But what if the ZIP code data is missing, incomplete, or inaccurate? The model’s behavior will be adversely affected both during training and during deployment, which could lead to incorrect predictions and reduce the value of the entire effort.

In addition, an accurate, complete, and verified ZIP code could also help to predict an individual’s market segment, income class, age, life expectancy, and more, which is all the more reason to get it right.
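To make the ZIP code example concrete, here is a minimal sketch of the kind of check that could run before training data reaches a model. It is an illustration only: the `zip_code` field name, the sample records, and the helper functions are hypothetical, and the format rule assumes US ZIP codes in 5-digit or ZIP+4 form. A format check like this catches missing and malformed values; it cannot tell whether a well-formed ZIP code is actually the right one for that consumer, which would require verification against reference data.

```python
import re
from collections import Counter

# Assumption: US ZIP codes in 5-digit or ZIP+4 form (e.g. "94063" or "94063-1234").
ZIP_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def classify_zip(value):
    """Return 'missing', 'malformed', or 'ok' for one raw ZIP code value."""
    if value is None or not str(value).strip():
        return "missing"
    if ZIP_PATTERN.fullmatch(str(value).strip()):
        return "ok"
    return "malformed"


def zip_quality_report(records, field="zip_code"):
    """Count how many records fall into each quality bucket."""
    return Counter(classify_zip(record.get(field)) for record in records)


# Hypothetical sample records, for illustration only.
customers = [
    {"id": 1, "zip_code": "94063"},
    {"id": 2, "zip_code": "9406"},        # incomplete
    {"id": 3, "zip_code": None},          # missing
    {"id": 4, "zip_code": "94063-1234"},
    {"id": 5, "zip_code": "  "},          # blank, treated as missing
]

report = zip_quality_report(customers)
print(dict(report))
```

A report like this gives a quick read on whether the ZIP code feature is trustworthy enough to use, before any model is trained on it.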

We should all expect “explainable AI” to become a regulatory mandate, not just an option. Without metadata-driven lineage and traceability, the results of AI-powered applications cannot be explained, and so those applications and insights cannot be deployed into production.

Data Management Needs AI

AI/ML also plays a critical role in scaling the practices of data management.

Given the massive volumes of data needed for digital transformation, organizations must discover and catalog their most relevant data and metadata to certify its relevance, value, and security, and to ensure transparency. They must cleanse and master this data. And they must effectively govern and protect this data.

If data is not managed effectively, and at scale, AI/ML models will suffer the same fate as every traditional data warehousing initiative over the past 30 years: use poor-quality data, deliver untrustworthy insights.

That’s where AI/ML for data management comes in, and why we have focused our innovation investments so heavily on our CLAIRE engine, Informatica’s metadata-driven AI capability. CLAIRE leverages unified enterprise metadata to automate and scale routine data management and stewardship tasks.

We had 2,600 attendees at Informatica World, and CLAIRE was everywhere. I can tell you that intelligent conversations about data quickly turned into intelligent conversations about AI.

I’d love to hear where you are on your data and AI journey!