More than 30 years ago, I walked the halls of more than one Fortune 500 company and saw four-foot-high stacks of boxes filled with computer printouts in the hallways outside managers’ offices. In fact, it was not uncommon to see pallet-loads of computer printouts in some companies. When I asked one manager what the reports were and why they had so many, he said, “We don’t look at the reports any more, but we don’t know how to get the data center to stop sending them.”
Fast forward to 2012. Reports aren’t sent out on paper anymore thanks to ubiquitous networks, BI tools and desktop PCs, but it seems that many IT organizations still have not addressed the 30-year-old challenge: how do you know when business users no longer need a given report? As a result, we see enterprise data warehouses with rapidly growing multi-terabyte or petabyte capacities. The amount of IT budget consumed by data that no one uses is staggering. It’s not just storage for the data warehouse; it is also the network and server resources needed to extract, transform and move the data. Experts who frequently work on DW clean-up projects say that 60% to 70% waste is not uncommon. According to an article by Bill Inmon, unused data can be as high as 99%!
It doesn’t have to be this way. There are in fact simple, cost-effective ways to monitor data usage in a data warehouse, as Claudia Chandra recently wrote in “Optimize Data Warehouses with Data Usage Monitoring and Data Warehouse Archiving.” If no one is using the data, simply stop producing it, or at a minimum remove it from expensive DW infrastructure and store it in a highly compressed form on low-cost storage.
By applying simple measurement tools along with a few lean practices, you can achieve a Lean Data Warehouse. For example:
- Respond rapidly to new business analytics needs with lean and agile data self-service profiling and mapping tools for business analysts.
- Analyze usage of the production data warehouse to determine what end users are actually querying, and systematically remove data or reports that are no longer of use.
- Keep aggregates and frequently accessed transaction data in the data warehouse. Purge data and eliminate loading information that is not being used at all and that has no value for compliance purposes.
- For infrequently used data that has some long-term analytic value, or where retention is required for compliance purposes, move it to lower-cost storage.
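To make the keep, archive and purge decisions above concrete, here is a minimal Python sketch. It assumes you already have per-table query counts from some usage-monitoring tool; the `TableUsage` fields, the 90-day window, the threshold of 100 queries and the "review" bucket are all hypothetical choices made for illustration, not something prescribed by any particular product.

```python
from dataclasses import dataclass


@dataclass
class TableUsage:
    """Usage facts for one warehouse table (all fields are illustrative)."""
    name: str
    queries_last_90_days: int   # taken from whatever usage monitoring you run
    compliance_hold: bool       # retention required for compliance
    long_term_value: bool       # judged to have long-term analytic value


def classify(table: TableUsage, frequent_threshold: int = 100) -> str:
    """Return 'keep', 'archive', 'purge' or 'review' for a table.

    The threshold is an assumed cut-off; tune it to your own workload.
    """
    # Frequently accessed data stays in the warehouse.
    if table.queries_last_90_days >= frequent_threshold:
        return "keep"
    # Infrequently used (or unused) data that must be retained, or that
    # still has analytic value, moves to lower-cost storage.
    if table.compliance_hold or table.long_term_value:
        return "archive"
    # Data nobody queries and nobody needs to retain: stop loading it.
    if table.queries_last_90_days == 0:
        return "purge"
    # Rarely used data with no stated value needs a human decision.
    return "review"


if __name__ == "__main__":
    tables = [
        TableUsage("daily_sales_agg", 4200, False, True),
        TableUsage("orders_2005", 3, True, False),
        TableUsage("legacy_report_feed", 0, False, False),
        TableUsage("campaign_scratch", 7, False, False),
    ]
    for t in tables:
        print(f"{t.name}: {classify(t)}")
```

The "review" bucket is a deliberate design choice: the rules above do not say what to do with rarely used data that has no stated retention or analytic value, so the sketch hands those cases to a person rather than guessing.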
This is not rocket science – it just needs a bit of focus and discipline and the financial payback is worthwhile! To learn more, check out Lean Data Management. Or better yet, come to Informatica World 2012 in May. Check it out at www.informaticaworld.com.