The data warehouse’s goal is timely delivery of trusted data to support decision-enabling insights. However, it’s difficult to get insights out of an environment that’s hard to see into. This is why, as far as data privacy requirements allow, a data warehouse should be turned into a glass house, allowing us to see data quality and business intelligence challenges as they truly are.
Trusted data is not perfect data. Trusted data is transparent data, honest about its imperfections, and realistic about the practical trade-offs between delivery and quality. You can’t fix what you can’t see, but even more important, concealing or ignoring known data quality issues will only decrease business users’ trust in the data warehouse. Perfect data is impossible, but the more control is enforced wherever data originates, and the more monitoring is performed wherever data flows, the better overall data quality will be in the warehouse.
Trusted decisions are also transparent, revealing exactly what data was used to make each decision, and why. In a constantly changing business world, we often need good-enough data for fast-enough decisions. But what exactly constitutes good-enough and fast-enough isn’t something that can be articulated with general business rules, which is why decision makers must define specific data requirements for each business decision and document the preparation for, and execution of, every decision.
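As one way to picture what documenting a decision could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `DataRequirement` and `DecisionRecord`, the completeness and age thresholds, and the sample dataset names are hypothetical, not part of any particular warehouse or tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataRequirement:
    """One dataset a decision depends on, with its agreed thresholds (hypothetical)."""
    dataset: str
    min_completeness: float  # share of required fields populated, 0.0 to 1.0
    max_age_days: int        # how stale the data is allowed to be


@dataclass
class DecisionRecord:
    """What was decided, why, and which data was considered good enough."""
    decision: str
    rationale: str
    decided_on: date
    requirements: list[DataRequirement] = field(default_factory=list)
    known_issues: list[str] = field(default_factory=list)

    def unmet(self, observed: dict[str, tuple[float, int]]) -> list[str]:
        """Return datasets whose observed (completeness, age_days) miss the requirement.

        A dataset with no observation at all is treated as unmet.
        """
        failures = []
        for req in self.requirements:
            completeness, age_days = observed.get(
                req.dataset, (0.0, req.max_age_days + 1)
            )
            if completeness < req.min_completeness or age_days > req.max_age_days:
                failures.append(req.dataset)
        return failures


# Illustrative values only; none of these come from a real system.
record = DecisionRecord(
    decision="Launch the spring retention campaign",
    rationale="Churn rose in the last quarter",
    decided_on=date(2024, 3, 1),
    requirements=[
        DataRequirement("customer_contacts", min_completeness=0.95, max_age_days=7),
        DataRequirement("churn_scores", min_completeness=0.90, max_age_days=30),
    ],
    known_issues=["Some contact records lack a postal code"],
)
observed = {"customer_contacts": (0.97, 2), "churn_scores": (0.85, 10)}
print(record.unmet(observed))  # ['churn_scores']
```

The point of the sketch is that good-enough and fast-enough are stated per decision rather than as general rules, and that known issues travel with the record instead of being hidden.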
Without this visibility, communication breakdowns prevent business problems and related data challenges from being well understood by the collaborative team trying to solve them. The data warehouse needs to provide a clear view of the terminology, both business and technical, surrounding its data and its processes, which are likewise both business and technical.
Turning the data warehouse into a glass house instills trust through transparency: a warehouse that is easier to see into is easier to understand and easier to use, and therefore easier to trust.
You can listen to my conversation about the need for transparency in data warehousing with Reuben Vandeventer from CNO Insurance and Sean Crowley from Informatica.
Blogger-in-Chief, Obsessive Compulsive Data Quality
Jim Harris is a recognized thought leader with over 20 years of enterprise data management experience, specializing in data quality and data governance. As Blogger-in-Chief at Obsessive Compulsive Data Quality, Jim offers an independent, vendor-neutral perspective, and hosts the popular audio podcast OCDQ Radio, syndicated on iTunes and Stitcher SmartRadio. Jim is an independent consultant and freelance writer for hire, as well as a regular contributor to Information-Management.com and DataRoundtable.com.