You Want the Truth? You Can’t Handle the Truth!

So, I’m a complete sucker for the courtroom scene in the Rob Reiner film “A Few Good Men” for a number of reasons: it was written by Aaron Sorkin (loved The West Wing and Sports Night), it’s a classic Jack Nicholson scene, it is one of Tom Cruise’s roles where he actually does some acting, and it’s a great Six Degrees of Kevin Bacon movie (I mean, it’s got everyone from Demi Moore to Cuba Gooding Jr. to Kiefer Sutherland to Noah Wyle – think of the connections you could make with just those actors). I get sucked into this scene whenever I click by it on the television.

For those unfamiliar with the movie, Tom Cruise plays a lawyer tasked with defending two Marines on trial for murder, and Jack Nicholson stands in the way of him winning the acquittal of his clients. In the culminating scene of the movie, Tom Cruise goes on the attack against an agitated Colonel Nathan Jessep, played by Jack Nicholson. It’s a powerful exchange:

Jack Nicholson: “You want answers?”

Tom Cruise: “I think I’m entitled.”

Jack Nicholson: “You want answers?”

Tom Cruise: “I want the truth!”

Jack Nicholson: “You can’t handle the truth!”

After at least 30-40 viewings, this scene still gives me the chills.

I caught this scene just the other day, and it struck me that in the context of data warehousing, this could be a conversation overheard between business users and the data warehouse delivery team (albeit without all of the drama and murder implications). The ability to deliver accurate, timely and complete data – in essence, “the truth” – is pretty much the definition of a successful data warehouse. Unfortunately, most data warehouses fail to achieve this definition of success. Most data warehouse deployments today cost too much, take too long to deliver, and don’t scale effectively, and end users don’t trust the data they get from them.

The hard-to-handle truth is that integrating data is inherently a dirty business – data is scattered across the organization in different formats and different applications, it often has no metadata to provide any context for what it is, and much of the time only partial data is available. The IT landscapes of today’s corporations can be so complex that asking simple questions often brings back multiple (and conflicting) answers. Combine these factors with limited budgets, decreased staffing and still only 24 hours in the day, and it’s no wonder that data warehouse projects are so challenging for organizations.

End users want nothing more than to have confidence in their decision making, and that starts with how much they actually trust the information on which they base those decisions. If the data warehouse can’t deliver trusted data, then decision makers won’t use the warehouse as a source of information. All too often, people cut corners when building the warehouse: they don’t take the time to profile the data before it gets loaded into the data warehouse, they don’t discover all of the relevant data domains to begin with, and they fail to implement data cleansing logic to ensure the accuracy and completeness of the data. So, in essence, what happens to the data warehouse is:







To truly deliver value to the organization, data warehousing and data integration teams need to see data quality as an essential step in the successful deployment of any next generation data warehouse. For example, HealthNow New York recently deployed a next generation data warehouse that delivers trusted data wherever and whenever it is needed. Sometimes it gets ugly, and sometimes it takes a lot of time to discover, profile and cleanse data – but the time spent on those tasks up front will generate greater trust in the data, ensure higher utilization rates of your data warehouse and ultimately deliver the “truth” that decision makers desire.
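
To make the “profile and cleanse before you load” idea a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the sample records, the column names (member_id, state, plan) and the helper names profile and cleanse are all hypothetical, and a real warehouse team would more likely use a dedicated profiling and data quality tool.

```python
from collections import Counter

def profile(rows, columns):
    """Return per-column counts of total, missing and distinct values."""
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and the empty string as "missing".
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(1),
        }
    return report

def cleanse(rows, required):
    """Trim whitespace on string values, then split rows into loadable and rejected."""
    clean, rejected = [], []
    for row in rows:
        fixed = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        # A row is loadable only if every required column has a value.
        if all(fixed.get(col) not in (None, "") for col in required):
            clean.append(fixed)
        else:
            rejected.append(fixed)
    return clean, rejected

# Hypothetical source records, invented purely for illustration.
members = [
    {"member_id": "1001", "state": "NY ", "plan": "Gold"},
    {"member_id": "1002", "state": "", "plan": "Silver"},
    {"member_id": "1003", "state": "ny", "plan": None},
]

report = profile(members, ["member_id", "state", "plan"])
clean, rejected = cleanse(members, required=["member_id", "state"])
```

Even on three rows, the profile shows a missing state and a missing plan, plus two spellings of what is probably the same state. Those are exactly the kinds of problems that are cheaper to find before the load than after a decision maker has stopped trusting the numbers.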


This entry was posted in Data Integration.

3 Responses to You Want the Truth? You Can’t Handle the Truth!

  1. Jim Harris says:

    Excellent post, Sean.

    I too am a fan of that amazing scene in A Few Good Men, and I agree that it makes for a good analogy for the hard-to-handle truth about the dirty data quality business of building a trustworthy data warehouse that can deliver accurate, timely and complete data.

    I like to re-imagine Jack Nicholson as Data Warehouse Director Nathan Jessup, defending the efforts of the data warehouse team to executive management in a movie about A Few Good Knowledge Workers:

    “You want the truth? You can’t handle the truth! We live in a world that has data and the quality of those data needs to be guarded by workers with knowledge. Who’s gonna do it? You? I have a greater responsibility than you can possibly fathom. You have the luxury of not knowing what I know.

    You don’t want the truth because deep down in places you don’t talk about at board meetings, you want me on that data, you need me on that data!

    The data warehouse team uses words like completeness, consistency, accuracy, timeliness. We use them as the backbone of a career spent trying to defend the data that this company depends on to make business-critical decisions. You use them as bullet points on a presentation slide.

    I suggest that you pick up a pen and sign the authorization for the deployment for our next generation data warehouse!”

    • Sean Crowley says:


      Thanks for taking this to the next level.

      I think that data warehouse directors and managers alike can certainly agree that all too often the business and IT teams feel like they are pitted against one another, rather than on the same team.

      It is important to note, though, that the relationship between the business users and the data warehousing team doesn’t have to be so stand-offish. Self-service capabilities are emerging as an excellent way to more effectively engage the business users in solving the complexity problem: to help define what is important to them, to accelerate the development efforts of the data warehousing team, and to aid in the delivery of the vision and insight they desperately seek.

      So I’ll leave this reply with a follow-up question:
      What tactics have you (or anyone else on this thread) seen as being particularly effective in engaging the business users in data warehouse development? Where have you seen the biggest gains in time to value: business glossaries development? data profiling? virtualization as a tool for prototyping?

      • Jim Harris says:


        I would say that the key to effectively bringing together the business and technical teams involved in data warehouse development comes down to increasing mutual understanding through transparency, particularly by providing a clear picture of the terminology (both business and technical) and the data.

        Metadata management (e.g., the business glossary development you mentioned) and data profiling are vital, ongoing tasks (not once and done), but they are often overlooked or oversimplified, and this leads to a lot of dangerous assumptions that the business problems and related data challenges are well understood by the collaborative team trying to solve them.

        As much as is possible (i.e., given the necessities of data privacy), a data warehouse should be a glass house, so that everyone can clearly see what they are trying to accomplish, because it’s hard to generate insights from an environment that’s hard to see inside of.

        Best Regards,

