AI for Data Management: Making Sense of Vendor Claims

It seems that every day brings a new, hyperbolic vendor announcement about applying Artificial Intelligence and/or Machine Learning (AI/ML) to data management. How do you separate fact from fiction?

First, AI requires data sets to train on. AI and Machine Learning cannot create intelligence out of thin air. So the question is: where are these vendors getting the training sets that make their AI useful for solving your data management problems? It’s like students needing textbooks, eBooks, or eLearning, or apprentices needing on-the-job experience in servicing automobiles or shoeing horses. The training must come from somewhere.

In the world of data management, the best source of training sets is metadata. The more metadata, the more patterns AI/ML can observe, and those patterns can result in intelligent recommendations, automatic detection of risky behavior, or complete automation of time-consuming or repetitive tasks. In short, metadata is data about your data: what types of data you have, how it was accessed, its business meaning and context, when it was loaded or refreshed, how it is tagged, how it has been used and by whom, which data tables have been joined, and much more. All of this provides a wealth of information for AI/ML to draw on when making suggestions and recommendations that boost user productivity.
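To make the idea concrete, here is a minimal sketch of how one kind of usage metadata (a record of which tables have been joined) could drive a simple recommendation. It is an illustration only, not a description of how any vendor's engine works: the `JOIN_LOG` data, the table names, and the `recommend_related` function are all hypothetical, and plain co-occurrence counting stands in for whatever a real AI/ML model would do.

```python
from collections import Counter

# Hypothetical usage metadata: each tuple records one observed join
# between two tables. Real systems would harvest this from query logs.
JOIN_LOG = [
    ("orders", "customers"),
    ("orders", "customers"),
    ("orders", "products"),
    ("customers", "regions"),
    ("orders", "customers"),
    ("orders", "shipments"),
    ("orders", "products"),
]


def recommend_related(join_log, table, top_n=3):
    """Rank other tables by how often they were joined with `table`."""
    counts = Counter()
    for left, right in join_log:
        if left == table and right != table:
            counts[right] += 1
        elif right == table and left != table:
            counts[left] += 1
    # most_common sorts by count, highest first
    return [name for name, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    print(recommend_related(JOIN_LOG, "orders"))
```

With so little history the ranking is crude, which is the point of the argument above: the quality of a suggestion like this depends on how much metadata there is to learn from.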

If you would like to cut short a vendor evaluation, here are some questions to ask the vendor claiming to have AI integrated with their data management solution:

  1. What metadata do you collect for AI/ML training sets? Does it include all of the following:
    1. Technical metadata
    2. Business metadata
    3. Operational metadata
    4. Usage metadata
  2. What sources of metadata do you connect to and actively use today?
    1. Data Integration tools
    2. Structured data
    3. Unstructured data
    4. Cloud data
    5. Applications
    6. Analytics
  3. How much metadata do you have access to and where do you get it?
    1. For context, Informatica Intelligent Cloud Services processes 1 Trillion transactions per month. That provides a lot of metadata to learn from.
  4. Who can benefit from your AI/ML integration?
    1. Just developers?
    2. Business users who need help getting started, or knowing what to do next?
  5. What tasks can you completely automate?
    1. For example, can you automate the onboarding and structuring of unstructured data and automatically create a mapping to onboard similar data sets in the future?
    2. Can you intelligently understand the data sets that users are working with and intelligently recommend similar, useful data sets?
  6. How broad is your data management product line and which specific products have AI integrated with them today?

Just the first two or three questions should be enough to save you from an hour of “Death by PowerPoint.”

If you would like to know more about metadata and the CLAIRE™ engine, which provides the intelligence in Informatica’s Intelligent Data Platform, check out the white paper. We’ll show you how we are helping our customers to get more out of existing resources, enable business users, and drive data-driven digital transformation initiatives.

You can also keep an eye on Informatica blogs for regular updates on how we are innovating with CLAIRE™ and AI/ML.