DI without DQ is Just ETL

Many organisations today are responding to the widely heralded need for timely, trusted and accurate data to drive their business initiatives, be it operational efficiency, compliance or growth through M&A. Underpinning this drive will invariably be a series of data integration projects that address the specific needs of these initiatives.

However, far too often these initiatives, even when undertaken in an environment where the importance of data quality is understood and routinely discussed, become detached from the practice of actually employing data quality as an implicit component of data integration. While significant benefits can be derived by applying data quality as an explicit process, organisations will struggle to realise the full benefits unless they “bake it in” from the start.

By implicit I mean defining what makes the underlying data “fit for purpose” at the same time as defining the process for sourcing, transforming and delivering that data across the enterprise. As John Schmidt observed in a blog post last year, “The business operates at the speed of data”, and if we rely on explicit, separate processes to ensure that data is timely, trusted and relevant, there is a real risk that this speed cannot be maintained. This still holds today.

The adoption of such an approach will typically benefit users in a number of ways. First, it establishes a “business as usual” mentality towards how data quality is addressed, while encouraging and fostering effective IT-business collaboration. Second, it allows for a potentially more cost-effective solution than retro-fitting would. Finally, it directs focus onto the data that drives the key business processes underpinning the initiatives called out above.

So how do I approach data quality implicitly?

  • Utilise technology that enables IT and the business to collaborate as early and as effectively as possible.
  • Define data quality rules once and implement them as an integral part of the integration flows as many times as required.
  • Leverage a metadata-driven approach that makes it easy to map key data to the key business processes it supports.
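The second bullet, "define once, implement many times", can be sketched in code. This is a minimal illustration only, assuming a simple in-house rule registry; the names (`Rule`, `apply_rules`, the example customer rules) are hypothetical and not taken from any particular data quality product:

```python
# Sketch of "define once, apply many times": each data quality rule is a
# named predicate, and any integration flow can run the same shared rule
# set against its records. All names here are illustrative assumptions.

from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes

def apply_rules(record: dict, rules: list[Rule]) -> list[str]:
    """Return the names of the rules this record violates."""
    return [r.name for r in rules if not r.check(record)]

# Rules are defined once, centrally...
CUSTOMER_RULES = [
    Rule("email_present", lambda rec: bool(rec.get("email"))),
    Rule("country_is_iso2", lambda rec: len(rec.get("country", "")) == 2),
]

# ...and reused unchanged by any integration flow that moves customer data.
record = {"email": "a@example.com", "country": "GBR"}
print(apply_rules(record, CUSTOMER_RULES))  # → ['country_is_iso2']
```

Because every flow calls the same rule definitions, a rule changed in one place takes effect everywhere, which is the "baked in" behaviour the article argues for.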

What approach do you use? Am I missing any steps?


3 Responses to DI without DQ is Just ETL

  1. @Neil, I fully support the idea that DQ must be part of any DI process – whether it is “plain old” ETL, or whether it’s a more involved integration system. And I believe both our companies provide integrated solutions that promote this concept.
    However, I would argue that the “T” in ETL can include “quality type” transformations. Some people refer to this as EDQL, or ETDQL (any combination of these letters will work, too!).
    The key here is to provide the right methodology, the right mindset, and of course the right technology to support these.

    • Neil Gow says:

      I totally agree that “T” can equally cover DQ transforms. In the context of the article, though, the point I was striving for is the one you made: the right mindset. While where data quality is actually implemented is an important question, I believe it is the consideration of data quality at the time of integration design that is key. Such a mindset, coupled with the right technology that can scale easily from a tactical initiative to an enterprise-wide solution, creates a very powerful foundation in the drive for “fit for purpose” data.

  2. Pingback: Link Roundup – July 29, 2012 | Enterprise Information Management in the 21st Century
