Lean manufacturing, as defined by Wikipedia, is “a production practice that considers the expenditure of resources for any goal other than the creation of value for the end customer to be wasteful…Essentially, lean is centered on preserving value with less work” and is a management philosophy derived mostly from the Toyota Production System (TPS). I’ve been having discussions with Jim Harris from OCDQ Blog, Reuben Vandeventer, Director of Data Governance for CNO Financial, and others about best practices for data quality management and the applicability of lean management practices to data warehousing. Click here to hear Jim, Reuben and me discuss three critical areas of data quality to focus on for building data warehouses that people actually use and trust.
One of these conversations introduced me to the NUMMI plant. Now, hailing from Boston and more of a Nissan owner than Toyota or GM, I had never heard of New United Motor Manufacturing, Inc. (NUMMI). But I was quickly fascinated. Located in Fremont, CA, the NUMMI plant opened in 1984 as a bold joint venture between General Motors and Toyota. It had been a GM plant between 1962 and 1982 but suffered from such poor quality, productivity and worker safety, and such high absenteeism, that it had to be shut down. The initial thought was that GM would get some technology and insight into the Toyota Production System and Toyota would get a taste of manufacturing in the US. What happened in this joint venture has been the subject of both positive and negative business case studies: on the success of TPS’s adherence to “quality over quantity” as a manufacturing principle, and on the failure of organizations to develop and deploy innovative practices because of organizational structures, entrenched cultural practices and lack of collaboration.
What appealed most to me about the NUMMI story is that by retraining its auto workers to focus on quality throughout the manufacturing process, GM was able to turn around a completely dysfunctional and unprofitable plant. At its core, the Toyota Production System focuses on quality, not quantity. By applying these principles to its manufacturing line at NUMMI, GM saw dramatic results in the efficiency of its manufacturing process: the defect rate of cars coming off the line dropped substantially, resulting in less overtime spent fixing problems and higher initial satisfaction among car owners. As a result, the profitability of the cars produced at the NUMMI plant increased, and the attitudes of its workforce there took a 180-degree turn for the better. By focusing on quality throughout the manufacturing process – even if it meant stopping the line (traditionally a verboten action in GM plants) to fix a problem that a worker identified – GM was able to streamline its manufacturing process and produce higher quality cars more cost effectively.
So, when I read that integration costs for data warehouses typically run between $250K and $1M, or that only 3 in 10 organizations view the data used in their analysis as always accurate, or that on average, it takes 7.8 weeks to add a new data source to an existing data warehouse, I think that there is definitely room to improve the processes in our data warehouse factories. If you want to build a data warehouse that people actually use and trust, then you need to focus on data quality early and often during the manufacturing process:
- Understand your data – profile your data sources, get comfortable with the content, quality and structure of your data as it resides across your organization. Armed with this insight, visibility and knowledge, you can avoid costly re-work later on in the testing phases of the project.
- Cleanse your data – you need to parse, standardize, match and cleanse data so that it addresses all dimensions of data quality and becomes a foundation for data governance.
- Define & Document your data and data flows – ensure that you have complete visibility into business, technical and operational metadata. When you facilitate collaboration between business users and IT teams, you’ll increase the trust and confidence in the data. If you provide tooling so that business users can help identify key data domains and sources of data, you’ll accelerate end-to-end lineage of data.
These are just three areas where focused efforts can increase the efficiency of your data integration processes, accelerate the time to value for your data warehouse, and dramatically increase the trust, confidence and utilization of a newly deployed data warehouse.
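To make the first two areas a little more concrete, here is a minimal Python sketch of profiling a column and then re-profiling it after a simple standardization pass. Everything in it is an assumption made for illustration: the sample `customers` records, the `profile_column` and `standardize_state` helpers, and the choice of metrics are hypothetical and not taken from any particular data quality tool.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize completeness and distinctness for one column in a list of dicts."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing values.
    non_empty = [v for v in values if v is not None and str(v).strip() != ""]
    total = len(values)
    return {
        "column": column,
        "rows": total,
        "missing": total - len(non_empty),
        "completeness": len(non_empty) / total if total else 0.0,
        "distinct": len(set(non_empty)),
        "top_values": Counter(non_empty).most_common(3),
    }


def standardize_state(value):
    """Trim and upper-case a state code; return None for empty input."""
    if value is None:
        return None
    cleaned = str(value).strip().upper()
    return cleaned or None


# Hypothetical sample records standing in for a real source table.
customers = [
    {"id": 1, "state": "MA"},
    {"id": 2, "state": " ma "},
    {"id": 3, "state": ""},
    {"id": 4, "state": "CA"},
]

# Profile first, then cleanse, then profile again to see what changed.
before = profile_column(customers, "state")
cleansed = [{**row, "state": standardize_state(row["state"])} for row in customers]
after = profile_column(cleansed, "state")

print("before:", before)
print("after: ", after)
```

In this toy data, profiling before cleansing shows inconsistent spellings of the same state inflating the distinct count; after standardization they collapse into one value, while the missing value remains visible as something to chase back to the source rather than silently patch.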
How have you addressed the delivery of trusted data from your data warehouse factory? Do your efforts align with increased utilization of the data warehouse? If not, why do you think that is? I’d love to hear your thoughts.

Good article. A funny thing happens when we shift the focus to quality – everything speeds up. The reality is that many quality issues arise from incomplete information being passed from team to team, which results in delays and rework. So when we focus on quality, speed increases and costs go down.
You might want to check out a blog post I wrote last year about the Lean Data Warehouse http://blogs.informatica.com/perspectives/2012/01/26/lean-data-warehouse/ or check out the Lean Integration book at http://www.integrationfactory.com.
John