5 Key Principles for Overcoming the Challenges of Enterprise-wide Data Governance

In this blog, we explore the 5 Key Principles required for overcoming the challenges of Enterprise-wide Data Governance.

For many Global Systemically Important Banks (GSIBs), the initial deadline for BCBS239 compliance passed on 1st January 2016, so I thought I’d outline some of the challenges they, and other financial organisations, have faced and some of the approaches taken in pursuit of becoming, and staying, compliant.

What is significant about the approaches that work well is that they are horizontal in nature, i.e. they can be applied to any Data Governance requirement as well as being specialised for a specific use case such as BCBS239 compliance.

Why is Enterprise-wide Data Governance challenging?

Some of the main reasons why this has been challenging include:

  • Most Banks have a high degree of organisational & operational complexity to navigate
  • Many Banks have Business Units with siloed operations & very large application estates
  • There are few, if any, agreed definitions for Key Data Entities (KDEs) across a Bank
  • There is limited visibility of the cross-enterprise, end-to-end data pipeline
  • There is little or no linkage between business models and the physical world
  • Data Quality checks are performed in silos with manual interventions
  • Data Governance & Data Management are often tackled bottom-up
  • Data Governance is often treated as a one-off ‘project’, or tackled by throwing time, money and people at it

In a recent blog post (BCBS239 for DSIBs and 5 things We have Learned from their Global Counterparts) I outlined some of the key learnings we’ve gained whilst working with Banks, along with some approaches that appear to have worked well.

From a more practical viewpoint, we’ve also observed some key solution design principles that seem to have contributed to positive outcomes.

5 key solution design principles

These design principles reflect a top-down approach: they start with the Policy definition statements and link them with implementation and enforcement artefacts that extend all the way down to the physical layers of data. The 5 key solution design principles, outlined below, reflect this approach and concentrate on linking data and metadata at both a logical and a physical level.

Our observation has been that most Data Governance programmes struggle with reconciling the definitions of key business entities to their physical realisations in applications and systems.

This is where the 5 key solution design principles play a significant role in improving outcomes.

The 5 key solution design principles are:


  1. Use a top-down approach to discover and document the information landscape across the enterprise
    • This starts with looking at what data and information is logically generated and consumed across the enterprise. This gives us a view on what the business sees as critical to the operations of the Enterprise; information that will be captured includes items such as major data sources, key applications and high level business data models.
    • Top down is important as this approach requires the Business to recognise and focus on the core data entities that are truly critical to the business, at Enterprise or Line of Business levels.
    • Once consensus is reached on which data entities are critical, they can easily be linked to their physical realisations; it is our observation that the reverse (abstracting definitions from technical detail) is a time-consuming and complex task that struggles to yield enduring success. The Business Model information gathered can be documented easily in a Business Glossary that has the capability to support this type of business metadata.
  2. Discover, capture and standardise definitions of Key Data Entities and flows through the enterprise
    • Once the Key Data Entities have been identified, more detail can be added to their definition, including their type (Enterprise critical, Line-of-Business critical, etc.), scope (Business Unit, Geography, etc.) or Domain (Risk, Finance, etc.). At this stage the Data Entities are still logical in nature but contain the data attributes the business needs.
    • Associated with the Key Data Entities are the flows they take across the information landscape; these represent how the business sees information movement, which we call Business Data Lineage. Standardisation is important to ensure consistency of business entities, of the attributes that make up these entities, and of how they are defined. All Key Data Entity metadata is stored within a Business Glossary.
  3. Discover, define and standardise Business & Quality Rules associated with the Key Data Entities
    • Next we identify the Business and Quality rules associated with the Key Data Entities; these rules define the Key Data Entity metadata as well as how it may be consumed.
    • The definition of Business and Quality rules is greatly facilitated by the availability of Business Data Lineage: it’s through this lineage that we understand where we will need to apply completeness, consistency or integrity checks across system boundaries.
    • Rules range from simple (for example: a specific field must have a value greater than 0) to more complex (for example: a client id field cannot be empty, and the range of the value in that field has a specific meaning). Again, standardisation is important to ensure consistency and to enable reuse. All Business and Quality rule data is stored in the Metadata repository.
  4. Expose the Business Model and link it to Physical Models
    • The Business Model gathered so far provides a representation of the Enterprise Systems and Processes, the Key Data Entities that support them and documents how these Entities flow through the landscape via the Business Data Lineage representation.
    • The Business Model also provides an anchor point to enable us to link the logical world and the physical/technical worlds. Metadata describing the physical data stores (applications, databases, etc.) can be easily imported as can the technical flows of data that have been implemented in a range of technologies such as Informatica’s Enterprise Data Integration or Data Quality tools.
  5. Set data quality controls at key points in the architecture
    • As we now have a link between the logical and physical world we can apply the data quality controls (definition of how an organisation implements a Policy) at key, strategic points in the architecture. This has the effect of enabling the measurement of the quality of the data as it flows around an information landscape. It is this approach of lineage and measurement that exposes what information is actually flowing around an organisation as well as its quality in such dimensions as Accuracy, Completeness, Integrity and Timeliness.
    • As data flows around the information landscape and its quality is assessed, all of this metadata can be stored in an external database for easy retrieval. This simplifies the generation of the dashboards Executives need to quickly understand how good the quality of their data is at any point in time and at any point in its lifecycle. This dashboard approach quickly exposes any areas where the quality of data is poor, as well as the points in the information landscape that contribute to that fact.
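To make principles 2, 3 and 5 more concrete, the sketch below shows, in plain Python, one way Key Data Entity metadata, quality rules and a measurement pass at a control point could be represented. It is purely illustrative: the class names, attributes and example rules are assumptions made for this sketch, not the schema of any particular Business Glossary or Metadata repository.

```python
from dataclasses import dataclass, field
from typing import Callable

# A Key Data Entity as it might appear in a Business Glossary entry.
# All names and fields are illustrative, not taken from any specific tool.
@dataclass
class KeyDataEntity:
    name: str
    entity_type: str          # e.g. "Enterprise critical"
    scope: str                # e.g. "Business Unit", "Geography"
    domain: str               # e.g. "Risk", "Finance"
    attributes: list = field(default_factory=list)

# A quality rule: a named check applied to one attribute of a record.
@dataclass
class QualityRule:
    name: str
    attribute: str
    check: Callable[[object], bool]
    dimension: str            # e.g. "Completeness", "Accuracy"

def measure(records, rules):
    """Apply quality rules at a control point and return the pass rate
    per rule, i.e. a simple quality metric for each dimension."""
    results = {}
    for rule in rules:
        passed = sum(1 for r in records if rule.check(r.get(rule.attribute)))
        results[rule.name] = passed / len(records) if records else 1.0
    return results

# Example: a hypothetical "Client" KDE with a simple and a more complex
# rule, mirroring the rule examples given in principle 3.
client = KeyDataEntity("Client", "Enterprise critical", "Global", "Risk",
                       attributes=["client_id", "exposure"])
rules = [
    QualityRule("exposure_positive", "exposure",
                lambda v: v is not None and v > 0, "Accuracy"),
    QualityRule("client_id_present", "client_id",
                lambda v: v is not None and v != "", "Completeness"),
]

records = [
    {"client_id": "C001", "exposure": 125.0},   # passes both rules
    {"client_id": "", "exposure": -5.0},        # fails both rules
]
metrics = measure(records, rules)  # {"exposure_positive": 0.5, "client_id_present": 0.5}
```

Because the rules are defined once against the logical Key Data Entity, the same definitions can be reused at every control point where that entity appears in the physical landscape.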

The key enabling factors supporting the 5 principles approach are a combination of three sets of capabilities. These are:

  1. Implementation based on Informatica’s industry-leading data platform, which is entirely metadata-driven and has a configuration, not customisation, delivery model for speed and ease of delivery
  2. Accelerating assets to implement the appropriate processes and methods, such as a business model, templates for business model representation and diagramming, plus data model and reporting schemas
  3. Implementation of a best-practice process to deliver a clear understanding, definition and visibility of an enterprise business model; creation of a Catalogue of enterprise Key Data Entities; creation of Business Data Lineage to document and manage the flow of information across the enterprise; plus measurement and reporting of the quality of Key Data Entities across their lifecycle

Benefits of this approach

We’ve seen a number of areas of benefit from this approach and some of these are documented below:

  1. Business Users have visibility of what’s really happening to data
    • All defined and generated metadata is visible and available for analysis using reports, scorecards, dashboards, alerts and analysis
    • As data flows through the solution, any Business or Data Quality Rules that execute populate a datamart; Business and Data Quality metrics are generated and added to the datamart; the current state is monitored, with alerts generated when metrics exceed specified limits; and exception processing triggers workflows for remediation
  2. Reduces the time taken to implement and increases accuracy of outputs
    • The top-down approach drives quicker and simpler physical model alignment and ensures only Key Data Entities are included for reporting or aggregation
    • A completed Business Glossary means only the right Key Data Entities are used, and that they have known quality attributes, which increases trust in the data and confidence in reporting and aggregation results
  3. Enables a repeatable foundation
    • This approach builds a repeatable foundation to support the next Regulatory Compliance requirement, where the delta is limited to: the Key Data Entities relevant to the new requirement; the Business & Data Quality rules for it; the linkage of those new KDEs and rules to the physical world; and Regulation-specific reporting, analysis, alerts or scorecards
  4. An approach that drives Best Practice
    • This approach enables organisations to rapidly create a robust and rigorous data management environment for Data Governance, as well as for specialisations such as BCBS239. It enables rapid demonstration of compliance and provides an industrialised foundation for ensuring continual, ongoing compliance demonstration with minimal incremental time, cost or effort
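The monitor-and-alert pattern described under benefit 1 can be sketched as follows. The threshold values, metric shape and workflow call are assumptions made for illustration, not the behaviour of any specific product:

```python
# Illustrative sketch of the monitor-and-alert pattern: quality metrics
# land in a datamart, are checked against limits, and breaches trigger
# a remediation workflow. All names and values here are assumed.

THRESHOLDS = {"Completeness": 0.98, "Accuracy": 0.95}  # assumed limits

def evaluate(datamart_metrics):
    """Return an alert for every metric that falls below its limit."""
    alerts = []
    for metric in datamart_metrics:
        limit = THRESHOLDS.get(metric["dimension"])
        if limit is not None and metric["value"] < limit:
            alerts.append({
                "kde": metric["kde"],
                "dimension": metric["dimension"],
                "value": metric["value"],
                "limit": limit,
            })
    return alerts

def trigger_remediation(alert):
    # Placeholder for a workflow call; a real implementation would open
    # a ticket or start an exception-handling process for the breach.
    return f"Remediation opened for {alert['kde']} ({alert['dimension']})"

metrics = [
    {"kde": "Client", "dimension": "Completeness", "value": 0.91},  # breach
    {"kde": "Trade", "dimension": "Accuracy", "value": 0.99},       # healthy
]
actions = [trigger_remediation(a) for a in evaluate(metrics)]
```

The same evaluation can feed scorecards and dashboards directly, since the alert records carry the KDE, the quality dimension, the measured value and the limit it was compared against.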

What next?

As more organisations examine how to simplify the delivery of Data Governance, whether generically or for a specialisation such as BCBS239, the more it is recognised that the optimal way forward is a top-down approach that links logical and physical metadata, automates the generation of business lineage insights, and is based on a platform that drives reuse.


I would like to thank Giuseppe Mura for his contribution to this blog post.