I once told the head of the Enterprise Data Warehouse at a large bank, “you don’t have a data warehouse, you have 50,000 tables.” The issue was that the bank had built the EDW without the necessary fundamentals in place. It wasn’t for lack of money; in fact the EDW was one of the biggest “money sinks” in the bank. The problem was that it was sitting on a sinking foundation.
One version of the truth isn’t achieved by putting all your data in one big system or one big database – that’s impossible. An enterprise data warehouse is indeed part of the solution, but it needs to be built on a solid foundation. What does a solid foundation look like? Here are five pillars for one version of the truth.
- Establish a Metadata Management Office (MMO) that is responsible for the usability and performance of the metadata system. This involves a disciplined approach to deriving value from a federated collection of repositories about data at rest, data in motion and data changes. Check out my prior post for more on this topic. The first pillar of the foundation is to have a permanent organizational group that is the equivalent of the accounting department for financial assets. Just as accountants don’t make the investment or financing decisions, the MMO doesn’t define business terms or establish security policies, but it can tell you where the data is and how it’s changing.
- Implement a Business Glossary or Data Dictionary based on logical models for business domains, with data stewards to maintain the definitions. Note that I didn’t say “an enterprise logical model.” While enterprise reference models are indeed valuable for strategic alignment and business transformation planning, logical data models are most often successful at the business domain level, since they generally die under the weight of complexity at the enterprise level. Furthermore, note the term “maintain” – definitions are not static and need owners to keep them relevant as the world turns and the business and IT evolve. This is the second pillar for one version of the truth. Oftentimes, multiple versions of the truth are a definitional problem in the eyes of the beholder – it’s not the data’s fault.
- Measure data quality on an ongoing basis. It goes without saying that “you can’t manage what you don’t measure,” yet many organizations still try to manage data without metrics, which results in a piecemeal, ad-hoc and fire-fighting approach to improving data integrity. The third pillar is to get data clean and keep it clean by establishing data quality standards, measuring against them regularly and communicating the results to both management and frontline staff. In short, shine a bright light on data quality.
- Establish clear accountability for business systems and integration systems such as the enterprise data warehouse or MDM system. Multiple versions of the truth are often a result of the same information being captured and stored in slightly different ways by different systems. Before you can resolve the differences and identify the system-of-record or source-of-record, you need to know who to talk to. Each system should have a business owner who drives the investment and prioritization decisions, an IT owner who leads the planning and change management activities, and an operations owner who is responsible for maintaining service levels and resolving production incidents. The fourth pillar therefore is a definitive list of applications (yes, all of them in the organization) and clear accountability for each of them.
- Establish a Data Governance program to resolve disputes, define policies for data access, set priorities for data improvement initiatives and maintain a data risk register. For the data governance council to be effective, it needs input from the other four pillars, and it needs the other four pillars to implement and execute its policies and decisions. Data governance without DQ metrics, MMO data lineage or change impact reports, clear system accountability, or a business glossary to communicate the results is like trying to govern by tribal knowledge. The fifth pillar therefore cannot stand on its own, but it is an essential stabilizer for the other four pillars. For more on this topic, check out Rob Karel’s recent post on Informatica’s Data Governance Framework.
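To make the third pillar a little more concrete, here is a minimal sketch of what "measuring data quality against a standard" can look like in Python. Everything in it is an assumption for illustration: the `QualityRule` type, the `measure` and `below_threshold` helpers, the sample customer records, and the thresholds are hypothetical and are not taken from any particular data quality product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Mapping, Optional

# A record is just a mapping of column name to value (hypothetical shape).
Record = Mapping[str, Optional[str]]


@dataclass(frozen=True)
class QualityRule:
    """One data quality standard: a named check plus a minimum pass rate."""
    name: str
    check: Callable[[Record], bool]
    threshold: float  # minimum acceptable share of records that pass


def measure(records: Iterable[Record], rules: List[QualityRule]) -> Dict[str, float]:
    """Return the pass rate (0.0 to 1.0) of every rule over the records."""
    rows = list(records)
    scores: Dict[str, float] = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule.check(row))
        scores[rule.name] = passed / len(rows) if rows else 0.0
    return scores


def below_threshold(scores: Dict[str, float], rules: List[QualityRule]) -> List[str]:
    """Names of the rules whose measured pass rate misses the standard."""
    return [rule.name for rule in rules if scores[rule.name] < rule.threshold]


# Hypothetical sample data and standards, for illustration only.
customers = [
    {"id": "1", "email": "a@example.com", "country": "US"},
    {"id": "2", "email": None, "country": "US"},
    {"id": "3", "email": "c@example.com", "country": "XX"},
    {"id": "4", "email": "d@example.com", "country": "CA"},
]
rules = [
    QualityRule("email_present", lambda r: bool(r.get("email")), 0.95),
    QualityRule("country_valid", lambda r: r.get("country") in {"US", "CA"}, 0.70),
]

scores = measure(customers, rules)
for name, rate in scores.items():
    print(f"{name}: {rate:.0%}")
print("Below standard:", below_threshold(scores, rules))
```

Run on a schedule and published to both management and frontline staff, even a report this simple turns data quality from a matter of opinion into a tracked number. The list of rules that miss their standard is also the kind of input a governance council could use to set priorities.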
Before you are scared off by thinking, “Wow, this is a tall order. We could never achieve all these disciplines in our organization,” you should know that it takes time to build these capabilities and you don’t need to be “perfect” with all of them from day one. It takes most organizations from two to five years to build mature capabilities in each area. The point nonetheless remains – achieving one version of the truth is not simply a technical problem – it is an agreement problem which requires discipline in all five pillars.