The Next Generation of Metadata Management (Part 1)
Metadata Management originated in the late 80s and early 90s, when Data Warehousing was putting down its roots. Businesses had expanded their definition of data warehousing to include three types of tools:
- Business intelligence tools
- Tools to extract, transform and load data into the repository
- Tools to manage and retrieve metadata
The origins of Metadata Management lie here. The fundamental use case was to support the ETL tools in debugging and managing metadata. The 3 main use cases for early Metadata Management solutions included:
- Display of Data Lineage
- Impact Summary and Analysis
- Unified view in a Metadata Catalog
This meant that metadata management solutions were closely tied to the ETL tool used in IT. Beyond that, they increased productivity through Impact Analysis, gave a visual representation of data movements through lineage, and stored all metadata related to the data warehousing solution in their own metadata catalog.
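Lineage and impact analysis can both be viewed as traversals of the same graph of data movements: lineage walks upstream from an asset, impact analysis walks downstream. The following is a minimal sketch under that assumption, not a description of any particular product. The `LineageGraph` class and the asset names (`crm.customers`, `report.churn`, and so on) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph of data movements; an edge source -> target
    means the target is loaded from the source (hypothetical model)."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_flow(self, source, target):
        """Record one data movement, e.g. one ETL mapping."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, edges):
        """Breadth-first search returning every node reachable from start."""
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def impact(self, asset):
        """Everything downstream that a change to `asset` could affect."""
        return self._reach(asset, self._downstream)

    def lineage(self, asset):
        """Everything upstream that `asset` is derived from."""
        return self._reach(asset, self._upstream)


# Hypothetical warehouse flows, for illustration only.
graph = LineageGraph()
graph.add_flow("crm.customers", "staging.customers")
graph.add_flow("staging.customers", "dw.dim_customer")
graph.add_flow("dw.dim_customer", "report.churn")
graph.add_flow("erp.orders", "dw.fact_orders")
graph.add_flow("dw.fact_orders", "report.churn")

affected = graph.impact("staging.customers")
sources = graph.lineage("report.churn")
```

Here `affected` holds the assets to re-test before changing the staging table, and `sources` holds every table the churn report ultimately depends on.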
Role of Data Governance in the expansion of Metadata Management
Over the years, data volumes exploded and decision making relied increasingly on data analysis. In addition, business processes began to consume ever larger amounts of data that had been created in distant parts of the organization for a different purpose. Then came the requirement to profile and discover the data and tie it to business outcomes. This was fundamental to the success of the Data Governance initiatives of the new millennium.
Various business-led governance initiatives understood the need to care for enterprise-wide data and created collaborative processes to manage a core set of data that was deemed critical for the business. More significantly, the whole process was tied to a policy-centric approach to data quality standards, data security, MDM and lifecycle management.
Below is a graphical illustration of the 5 pillars of data governance. There was an increasing need to connect Metadata closely with business outcomes, including Data Quality and Data Security.
Metadata and Business Glossary had to be closely associated with each other. Otherwise, how would you connect, say, an abstractly named table attribute to a meaningful business definition? There was also a need for seamless collaboration between IT and the business, which called for a common solution that ties technical metadata to business definitions.
This meant that the metadata solution had moved one level up, tying closely to the data governance use case and catering to business-led initiatives. From that point on, metadata and business glossary were inseparable within the data management context.
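To make the link between an abstract table attribute and a business definition concrete, here is a minimal sketch of a glossary that maps technical column names to business terms. The `BusinessTerm` and `Glossary` classes, the column name `dw.dim_customer.cust_ltv_amt`, and the steward value are all hypothetical; real glossary tools carry far richer models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BusinessTerm:
    """A business definition owned by a named steward (hypothetical model)."""
    name: str
    definition: str
    steward: str


@dataclass
class Glossary:
    """Ties technical metadata (column names) to business terms."""
    terms: dict = field(default_factory=dict)   # term name -> BusinessTerm
    links: dict = field(default_factory=dict)   # column name -> term name

    def define(self, term):
        self.terms[term.name] = term

    def link(self, column, term_name):
        # Refuse links to terms the business has not defined yet.
        if term_name not in self.terms:
            raise KeyError(f"unknown business term: {term_name}")
        self.links[column] = term_name

    def describe(self, column):
        """Return the business term for a column, or None if unlinked."""
        term_name = self.links.get(column)
        return self.terms[term_name] if term_name is not None else None


# Illustrative data only.
glossary = Glossary()
glossary.define(BusinessTerm(
    name="Customer Lifetime Value",
    definition="Projected net revenue from a customer over the relationship.",
    steward="Finance",
))
glossary.link("dw.dim_customer.cust_ltv_amt", "Customer Lifetime Value")

term = glossary.describe("dw.dim_customer.cust_ltv_amt")
```

The business side maintains `terms`, IT maintains `links`, and both sides read the result through the same `describe` lookup.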
Role of Big Data in shaping the Enterprise Catalog solutions
Metadata solutions didn't stop there, did they?
As data flows through today's big data ecosystems, data governance remains in focus: business analysts still want to search for what they need and pick from the results based either on relevance or on how much they trust the data. This in turn depends on data lineage (where the data came from), on definitions and semantics (what the data means), and on when the data was last updated or defined.
The next big factor is data security. There is a need to understand who has been looking at what data, and whether or not they were allowed to, while providing authentication and authorization for the data being accessed.
Where is all this information stored? In the Enterprise Metadata Warehouse, or Enterprise Metadata Catalog, which is the central concept of the next generation of Metadata Management solutions.
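The idea of ranking search results either by relevance or by trust can be sketched in a few lines. This is only an illustration under simple assumptions: relevance is a naive count of query words in the name and description, and `trust` is a hypothetical score between 0 and 1 that a governance process would have to supply. The `CatalogEntry` class and the sample entries are invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    description: str
    trust: float          # hypothetical 0..1 score, e.g. from certification
    last_updated: date


def search(entries, query, by="relevance"):
    """Return matching entries, best first, ranked by relevance or trust."""
    words = query.lower().split()

    def relevance(entry):
        # Naive relevance: how often the query words occur in the text.
        text = f"{entry.name} {entry.description}".lower()
        return sum(text.count(word) for word in words)

    hits = [entry for entry in entries if relevance(entry) > 0]
    key = relevance if by == "relevance" else (lambda entry: entry.trust)
    return sorted(hits, key=key, reverse=True)


# Illustrative catalog contents only.
catalog = [
    CatalogEntry("customer_churn", "Monthly customer churn by customer segment",
                 trust=0.6, last_updated=date(2020, 1, 5)),
    CatalogEntry("dim_customer", "Certified customer dimension",
                 trust=0.9, last_updated=date(2020, 2, 1)),
    CatalogEntry("fact_orders", "Order lines",
                 trust=0.8, last_updated=date(2020, 2, 3)),
]

by_relevance = [e.name for e in search(catalog, "customer")]
by_trust = [e.name for e in search(catalog, "customer", by="trust")]
```

The two orderings differ: the churn dataset mentions the query word more often, while the certified dimension carries the higher trust score.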
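Knowing who looked at what, and whether they were allowed to, amounts to checking each request against a set of grants and recording the outcome. The sketch below assumes a very simple model in which grants are a mapping from user to permitted datasets; the `AccessAuditor` class, user names and dataset names are hypothetical, and authentication is taken as already done.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    user: str
    dataset: str
    allowed: bool
    at: datetime


class AccessAuditor:
    """Authorizes requests against grants and keeps an audit trail."""

    def __init__(self, grants):
        # grants: user -> iterable of dataset names that user may read
        self._grants = {user: set(datasets) for user, datasets in grants.items()}
        self.events = []

    def request(self, user, dataset):
        """Check authorization and record the attempt either way."""
        allowed = dataset in self._grants.get(user, set())
        self.events.append(
            AccessEvent(user, dataset, allowed, datetime.now(timezone.utc))
        )
        return allowed

    def denied(self):
        """Attempts that were not permitted, as (user, dataset) pairs."""
        return [(e.user, e.dataset) for e in self.events if not e.allowed]


# Illustrative grants only.
auditor = AccessAuditor({"analyst_a": ["dw.dim_customer"]})
ok = auditor.request("analyst_a", "dw.dim_customer")
blocked = auditor.request("analyst_a", "hr.salaries")
```

Because denied attempts are logged as well as successful ones, the trail answers both halves of the question: who accessed what, and whether they were allowed to.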
Retrospective View of Metadata Management vis-à-vis the Next-Gen Solutions
To sum up briefly, the evolution of metadata management is the coalescing of next-gen technologies, including machine learning, with the policy-driven era of data governance. The core intelligence of these platforms relies on the metadata warehouse, as the snapshot below illustrates:
In the next part of the series, we take a detailed look at the need for an enterprise-wide catalog.