Leave No Metadata Behind with AI-powered Enterprise Data Catalog Advanced Scanners

Metadata connectivity is a critical component of data cataloging. It helps establish the unified metadata foundation required to drive successful data-driven digital transformation initiatives with intelligence and automation. At Informatica, we have been working with many of the leading Global 2000 and Fortune 500 enterprises across industries and geographies to help them intelligently catalog and govern all their data at scale across a highly complex data landscape by leveraging best-of-breed metadata management.

Informatica expanded its leadership in metadata management with the announcement that it recently acquired Compact Solutions. I had the opportunity to interview Gaurav Pathak, the VP of Product Management at Informatica, to get his perspective on the significance of this acquisition. During our conversation, we tackled a number of key questions about metadata management, data cataloging, and the importance of advanced scanners – and discussed how you can benefit in the near and long term.

Informatica recently acquired Compact Solutions. Can you tell us what the significance of this acquisition is for Informatica and its customers?

First, let me start by saying that I am really excited that Compact Solutions is joining the Informatica team. As you may know, Compact Solutions was founded in 2003. They are a leader in the metadata integration space, with deep expertise in using their advanced metadata scanners to extract metadata and data lineage from some of the most complex systems.

Informatica has had a longstanding partnership and a close working relationship with them over the years. We have a number of joint customers across industries and geographies. There is also tremendous synergy between our teams and product offerings – in particular, our Informatica Enterprise Data Catalog product and Compact Solutions' metadata scanners (formerly known as MetaDex). Following the acquisition, the MetaDex scanners have been rebranded as Informatica Enterprise Data Catalog Advanced Scanners.

By combining the AI-powered Informatica Enterprise Data Catalog with Informatica Enterprise Data Catalog Advanced Scanners, our customers will be able to solve some of the toughest metadata extraction and data lineage challenges and leave no metadata behind.

What are advanced scanners and why are they important?

Virtually every enterprise has complex systems in its data landscape. These include on-premises mainframe systems built on COBOL and JCL; ETL tools like IBM DataStage, Oracle Data Integrator, and SAP BODI; stored procedures and scripting languages for databases and data warehouses; as well as a plethora of enterprise applications and systems such as SAS, SAP BW, and SAP BW/4 HANA, to name a few.

What makes them complex is the fact that the metadata and transformation logic are trapped and siloed in these systems and tools. They are very difficult to extract and even harder to understand. These systems often do not provide easily shareable descriptions of internal storage, processes, and relationships. As such, they are often referred to as “black boxes.”

The metadata that’s buried in these complex systems is critical for a number of strategic use cases and corporate initiatives such as risk and compliance, data governance, managing data analytics, and data lineage. Today, extracting data lineage from such sources often requires manual approaches. As a result, enterprises are either stuck in lengthy and costly projects or forced to leave a big chunk of their data untapped.

Enterprise Data Catalog Advanced Scanners reduce this complexity and virtually eliminate the “black box effect” by enabling users to rapidly scan, extract, and understand deep metadata and the corresponding data lineage in detail. What makes these scanners “advanced” is their ability to (a) parse code from various stored procedures, (b) obtain automatic data lineage and data relationships at scale, and (c) extract deep metadata from both static and dynamic code. Extracted data lineage provides full visibility into procedure calls with parameter tracking, dynamic SQL generated from parameter values, database queries, and more.
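
To make the idea of parsing code for lineage concrete, here is a deliberately tiny Python sketch. It is an illustration only and says nothing about how the Advanced Scanners are implemented: the regular expressions, the function names `resolve_dynamic_sql` and `extract_table_lineage`, and the table names are all hypothetical, and a real scanner needs a full SQL parser rather than pattern matching.

```python
from __future__ import annotations

import re

# Hypothetical toy patterns: they only recognize a plain INSERT ... SELECT.
INSERT_RE = re.compile(r"\bINSERT\s+INTO\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def resolve_dynamic_sql(template: str, params: dict[str, str]) -> str:
    """Fill a dynamic SQL template with the parameter values of one call."""
    return template.format(**params)


def extract_table_lineage(sql: str) -> list[tuple[str, str]]:
    """Return sorted (source_table, target_table) pairs for one statement."""
    target = INSERT_RE.search(sql)
    if target is None:
        return []
    sources = sorted({name.lower() for name in SOURCE_RE.findall(sql)})
    return [(source, target.group(1).lower()) for source in sources]


# Static code: the statement text is known up front.
PROCEDURE_BODY = """
INSERT INTO dw.customer_risk (customer_id, exposure)
SELECT c.id, SUM(t.amount)
FROM staging.customers c
JOIN staging.transactions t ON t.customer_id = c.id
GROUP BY c.id
"""
static_edges = extract_table_lineage(PROCEDURE_BODY)

# Dynamic code: table names arrive as parameters, so they must be
# resolved before any lineage can be read from the statement.
TEMPLATE = "INSERT INTO {target} SELECT * FROM {source}"
dynamic_sql = resolve_dynamic_sql(
    TEMPLATE, {"target": "dw.sales_2020", "source": "staging.sales"}
)
dynamic_edges = extract_table_lineage(dynamic_sql)

print(static_edges)
print(dynamic_edges)
```

The second half shows why parameter tracking matters: until the parameter values are known, the template contains no table names at all, so there is no lineage to extract.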

Figure 1: Metadata extraction from a mainframe system.

How do the advanced scanners complement Enterprise Data Catalog?

Enterprise Data Catalog – powered by the CLAIRE® AI-engine – is the catalog of catalogs, with both deep and wide metadata connectivity. We have more than 60 scanners for data sources across hybrid and multi-cloud environments, including on-premises databases and data warehouses, cloud data warehouses and data lakes, BI and analytics tools, ETL tools, enterprise applications, and more.

And we are constantly adding more scanners to the Enterprise Data Catalog roadmap. Informatica Enterprise Data Catalog Advanced Scanners complement our existing comprehensive set of scanners. They allow our customers to extract metadata from a number of complex systems, including:

  • Mainframe systems: COBOL and JCL
  • Third-party ETL tools: IBM InfoSphere DataStage, Oracle Data Integrator, Microsoft SSIS
  • Stored procedures and scripting languages: Oracle, Microsoft SQL Server, IBM Netezza, Teradata (including BTEQ, FastLoad, FastExport, and MultiLoad scripts)
  • BI and analytics tools: SAS, Microsoft SSRS, and SSAS
  • Enterprise applications: SAP BW, SAP BW/4 HANA

Our scanners allow our customers to extract end-to-end metadata including:

  • object metadata
  • relationship metadata
  • data lineage metadata
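
As a rough illustration of how these three kinds of metadata differ, the following Python sketch models them as separate record types. The class and field names are hypothetical and are not the Enterprise Data Catalog data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectMetadata:
    """Describes one asset, for example a table, column, or procedure."""
    name: str
    object_type: str


@dataclass(frozen=True)
class Relationship:
    """Structural link between assets, for example a table and its column."""
    parent: str
    child: str
    kind: str


@dataclass(frozen=True)
class LineageLink:
    """Data flow from a source to a target, and the code that moves it."""
    source: str
    target: str
    via: str


@dataclass
class ScanResult:
    """Everything one hypothetical scan run would return."""
    objects: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    lineage: list = field(default_factory=list)


result = ScanResult(
    objects=[
        ObjectMetadata("staging.customers", "table"),
        ObjectMetadata("staging.customers.id", "column"),
        ObjectMetadata("dw.customer_risk", "table"),
        ObjectMetadata("etl.load_customer_risk", "procedure"),
    ],
    relationships=[
        Relationship("staging.customers", "staging.customers.id", "contains"),
    ],
    lineage=[
        LineageLink("staging.customers", "dw.customer_risk",
                    via="etl.load_customer_risk"),
    ],
)

print(len(result.objects), len(result.relationships), len(result.lineage))
```

Keeping lineage separate from structural relationships reflects the distinction in the list above: one describes how assets are organized, the other describes how data moves between them.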

With these capabilities, our customers can benefit from the most comprehensive metadata repository to drive successful data-driven business transformations that support all of their key use cases and various strategic corporate initiatives. For example, customers can use these scanners to support enterprise data governance and regulatory compliance, impact analysis, advanced analytics and data science workloads, as well as cloud data lake and data warehouse modernization. 

Can you walk us through a use case or customer example where an advanced scanner was used, and the challenges the scanner addressed?

As I mentioned earlier, we have a number of joint customers with Compact Solutions. One such customer is a global bank based in Europe. Their key use case is centered around regulatory compliance with stringent requirements for comprehensive, end-to-end data lineage and impact analysis. 

A substantial chunk of the logic that determines their data lineage was embedded in stored procedures for Oracle and Microsoft SQL Server. The bank needed access to all of their metadata, including lineage relationships, to support regulatory stipulations stemming from BCBS 239. With the advanced scanners, they were able to rapidly parse code from the stored procedures and obtain granular details at scale. This allowed the bank’s compliance officers and various stakeholders to better understand how key data elements were computed and to create comprehensive audit trails for reporting purposes, while reducing impact analysis time from several weeks to minutes.
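
Once lineage is available as a set of source-to-target links, impact analysis becomes a graph traversal. The Python sketch below shows the general idea with a breadth-first search; the function name `downstream_impact` and the table and report names are invented for illustration and do not come from the bank's environment or from the product.

```python
from collections import defaultdict, deque


def downstream_impact(edges, start):
    """Return every asset reachable from `start`, in breadth-first order."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # .get avoids adding empty entries for leaf nodes.
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                order.append(successor)
                queue.append(successor)
    return order


# Hypothetical lineage links, as (source, target) pairs.
LINEAGE = [
    ("staging.transactions", "dw.customer_risk"),
    ("staging.customers", "dw.customer_risk"),
    ("dw.customer_risk", "reports.bcbs239_exposure"),
    ("dw.customer_risk", "reports.credit_dashboard"),
    ("staging.customers", "crm.mailing_list"),
]

impacted = downstream_impact(LINEAGE, "staging.customers")
print(impacted)
```

Here a change to `staging.customers` is traced through the warehouse table to both reports, which is the kind of question that takes weeks to answer by reading stored procedures by hand.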

Do enterprises need an Enterprise Data Catalog deployment to obtain the advanced scanners?

Yes, you do need to license Enterprise Data Catalog in order to purchase Enterprise Data Catalog Advanced Scanners. Customers that have already deployed Enterprise Data Catalog can purchase the advanced scanners they need. Customers that are in the process of acquiring Enterprise Data Catalog licenses may wish to add the advanced scanners as well. To learn more about the various options that are available and for a deep dive into the scanners, I recommend that our customers contact their designated Informatica account executive or one of our domain experts.

In the long term, how do you see Informatica customers benefitting from these advanced scanners?

As I mentioned earlier, we are constantly adding more scanners to our product roadmap. We are committed to ensuring that our customers have the most comprehensive metadata connectivity that is both broad and deep. We are working in lockstep with our customers and partners to understand their needs and priorities. To that end, we plan on making more advanced scanners available in 2020 and beyond, including scanners for 3GL and 4GL programming languages such as Python to support use cases for AI/machine learning, and more.

To learn more, we recommend that you read the Informatica Data Catalog Advanced Scanners datasheet or visit us at Informatica Enterprise Data Catalog Advanced Scanners.