
Best Practices Archives

May 09, 2008

Data Quality Maturity Model – How Does Your Organization Rate?

Posted by Chris Cingrani in: Data Quality > Best Practices

Recently I spoke at a User Group Meeting on the topic “Align for Success: The critical part Data Quality plays in complex Business and IT Initiatives.” I began the discussion by polling the group to find out how many of the organizations represented had a data quality solution in place. The response to the question was mixed, with approximately half the audience indicating they either had a solution or were considering one, while the other half indicated they weren’t currently considering data quality (or the person was unaware of any data quality initiatives). Although this was a very unscientific survey, it set the tone for my presentation, as I attempted to explain the concept of a data quality maturity model. By understanding where an organization is today from the standpoint of the model, management can begin to develop plans as to where they want to end up both in the short and long term.

Continue reading "Data Quality Maturity Model – How Does Your Organization Rate?" »

February 01, 2008

You can’t have CDI without Data Quality

Posted by Tom Golden in: Data Quality > Benefits ; Data Quality > Best Practices ; Data Quality ; Data Quality > Technology

Looking at Webopedia.com recently, I came across a definition for CDI. Yes, webopedia.com - it bills itself as the #1 online encyclopaedia dedicated to computer technology. You might wonder what I was doing surfing this font of knowledge – well, I had time on my hands between delayed flights coming back to Europe from the US. You know what they say: “time to spare, travel by air.”

The Webopedia.com CDI definition went: “Short for Customer Data Integration, it is the combination of the technology, processes, and services needed to create and maintain an accurate, timely and complete view of the customer across multiple channels, business lines, and, potentially, enterprises, where there are multiple sources of customer data in multiple application systems and databases.”

A bit long-winded perhaps, but the three words that shone out at me through the glare of the fluorescent lights in San Francisco airport were “accurate, timely and complete”; all data quality issues. Despite this, few if any of the Customer Data Integration (CDI) vendors in the market today have truly addressed the data quality issues in their CDI solutions. And anyone who has gone down the route of developing their own custom-built CDI application will be all too familiar with the data quality demands involved.

Continue reading "You can’t have CDI without Data Quality" »

December 20, 2007

Better management through measuring data quality

Posted by Ivan Chong in: Data Quality > Best Practices ; Data Quality ; Data Quality > Monitoring > Metrics ; Data Quality > Monitoring

I recently asked a customer of ours why they invested so much in monitoring and publishing key performance indicators for their data quality. “Believe it or not, the biggest reason we measure data quality is not to correct bad data,” came the reply. “The reason we monitor data quality is to detect problems with our business processes.”

Indeed, as I mentioned in my last blog post, business users look to investments in people and processes in addition to technology in order to address poor data quality. For example, if a bank branch manager received a report showing that customer data originating from his branch office had a much higher incidence of duplicate entries and was putting the entire bank at risk of massive regulatory fines, he would not throw technology at the problem. His response might be mandatory training for tellers or better hiring practices to screen for adequate computer skills.

Experts in quality control methodology refer to this as addressing “root cause.” Common starting points of measurement involve completeness, accuracy, consistency, conformity, duplication, and integrity. Eventually, as the business culture matures its data quality practices, timeliness and data lineage (origination) are used to evaluate quality of data. Of course, software technology that automates the process of parsing, standardizing, matching and consolidating data is of immense value and is an absolute requirement in any data integration project. However, the issue of data quality goes beyond these IT projects. Ongoing measurement and monitoring of data quality provides value directly to the business because it helps them to better manage their people and processes.
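
To make a few of these measures concrete, here is a minimal sketch in Python of how completeness, conformity and duplication might be scored, and how a duplication score could be broken out by branch as in the bank example above. The record layout, field names, sample values and the loose email pattern are all assumptions made up for illustration; this is not how any particular data quality product computes its metrics.

```python
import re
from collections import Counter

# Hypothetical customer records; field names and values are invented for illustration.
records = [
    {"name": "Ann Lee", "email": "ann.lee@example.com", "branch": "North"},
    {"name": "ann lee ", "email": "ann.lee@example.com", "branch": "North"},
    {"name": "Bob Roy", "email": "", "branch": "South"},
    {"name": "Cy Dunn", "email": "cy.dunn(at)example.com", "branch": "South"},
]

# Deliberately loose format check (assumption): something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-blank."""
    filled = sum(1 for r in rows if str(r.get(field) or "").strip())
    return filled / len(rows)


def conformity(rows, field, pattern):
    """Share of non-blank values that match the expected format."""
    values = [str(r.get(field) or "").strip() for r in rows]
    values = [v for v in values if v]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def duplication(rows, key_fields):
    """Share of rows whose normalized key is shared with at least one other row."""
    keys = [
        tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        for r in rows
    ]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] > 1) / len(rows)


def duplication_by(rows, group_field, key_fields):
    """Duplication score computed separately within each group (e.g. per branch)."""
    groups = {}
    for r in rows:
        groups.setdefault(r[group_field], []).append(r)
    return {g: duplication(rs, key_fields) for g, rs in groups.items()}


if __name__ == "__main__":
    print("email completeness:", completeness(records, "email"))
    print("email conformity:", conformity(records, "email", EMAIL_PATTERN))
    print("duplication:", duplication(records, ["name", "email"]))
    print("duplication by branch:", duplication_by(records, "branch", ["name", "email"]))
```

Publishing a per-branch score like the last one is what lets a manager trace a data problem back to a people or process problem rather than to the data alone.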


December 06, 2007

Start small with monitoring, but always think big to achieve data quality goals

Posted by Tom Golden in: Data Quality > Best Practices ; Data Quality ; Data Quality > Management ; Data Quality > Monitoring > Metrics ; Data Quality > Monitoring ; Data Quality > Monitoring > Scorecards

I attended my first parent-teacher meeting the other day for my five-year-old daughter. Another one of those “life stage” events done and dusted – I remember dreading the annual meeting when I was a kid. The notion of my parents and my teacher comparing notes on my behaviour was too much to bear – somebody was eventually going to put two and two together and find out I was up to no good.

It all got me thinking about a recent blog post by my esteemed colleague Garry Moroney. His post Mobilizing the Data Quality Army outlined the level of effort, thought and planning that the US Department of Education is putting into data quality.

As Garry points out, dealing with data quality in a large, disconnected organization such as the US school system is not a trivial exercise. But if you were to read only that one post, you might be overwhelmed by the potential size of the data quality task in front of you.

Continue reading "Start small with monitoring, but always think big to achieve data quality goals" »

December 05, 2007

Business and IT Collaboration is Essential for Data Quality

Posted by Ivan Chong in: Data Quality > Best Practices ; Data Quality ; Data Quality > Management ; Data Quality > Technology

A recent InformationWeek article* described the growth in IT employment across the US as a result of a shift in skills. Rather than focusing on pure IT proficiency, organizations are looking for talent with “a more hybrid mix of technology skills, along with an understanding of the business and its customers.”

IT departments are highly motivated to increase the level of collaboration with their counterparts in the business. Nowhere is this more critical than in the area of data quality, and the trend is causing a shift in the way companies are looking to solve their data quality issues. First-generation data quality tools had a natural focus on technology instead of business. Here are some of the differences between technology-focused data quality solutions and business-focused data quality solutions.

Tools vs. Process
Technology-focused data quality solutions provide tools that automate data processing. Evidence of this type of focus can be seen in the way that vendors will tout the sophistication and type of their algorithms over and above their ability to support ongoing data quality management processes. While technology is extremely important, its relevance cannot eclipse the overall data quality management process. Even if your data quality tool can automate the correction of 95 percent of the data, if the remaining five percent cannot be managed properly, you will continue to suffer from poor data quality.

Continue reading "Business and IT Collaboration is Essential for Data Quality" »

November 28, 2007

Building the Business Case for Data Quality

Posted by Chris Cingrani in: Data Quality > Best Practices

As a new contributor to the data quality blog site, I wanted to start by introducing myself and highlighting the types of topics I plan to discuss on a semi-frequent basis. I am a Principal Consultant with Informatica Professional Services and have spent the past 6 years in the data quality space in a variety of sales and post-sales roles. During this time I have seen the data quality market continue to evolve and mature. Thus, I would like to use this column to reflect on the types of use cases I have seen and continue to see when meeting with organizations faced with data quality problems. I hope these posts can start an active dialogue, regardless of whether your company is trying to tackle its first data quality initiative or looking to build out a formal center of excellence around data quality.

To start, I wanted to pose a common question I am often asked by clients and prospects – how do I build a business case for data quality? Although an organization may think (or even know) there is a problem, the need to justify the cost of procuring a data quality solution often exists. This justification requirement often comes from the idea that data quality issues aren’t necessarily a core business issue (how wrong this is!) or are something that can be handled through manual intervention (this is true – if you have unlimited time and money, but even then your results will be limited). Thus, the following points are meant to help start an organization down the path to building the internal business case through a Data Quality Audit. Note - if you have access to Informatica’s Velocity Methodology, I go into these steps in further detail in the best practice document, “Developing the Data Quality Business Case.”

Continue reading "Building the Business Case for Data Quality" »

June 28, 2007

Information Quality & Management Transformation

Posted by Larry English in: Data Quality > Benefits ; Data Quality > Best Practices ; Data Quality ; Data Quality > Management ; Data Quality > Monitoring

I recently received an email from one of my early clients. After having worked in four different companies in four different industries, she came to a sad conclusion, writing:

“The thing that they all have in common is a desire to cut corners and deal with quality later. It takes a lot of energy to be the information quality cheerleader, and I find it discouraging and overwhelming at times. Keep writing your articles and books to encourage all the people like me who are dealing with these issues every day.” P. G.

The discovery that P. G. has experienced is, unfortunately, the norm—not the exception. There are two critical elements in this experience.

Continue reading "Information Quality & Management Transformation" »

June 22, 2007

Alice in “Qualityland”

Posted by Neil Gow in: Data Quality > Best Practices ; Data Quality ; Data Quality > Governance / Stewardship ; Data Quality > Management

Alice: Would you tell me, please, which way I ought to go from here?
The Cheshire Cat: That depends a good deal on where you want to get to.
Alice: I don't much care where.
The Cheshire Cat: Then it doesn't much matter which way you go.
– Lewis Carroll, Alice's Adventures in Wonderland

Chris McCauley

When confronted with the problem of how to address their data quality issues, many organisations face a dilemma similar to the one that confronted Alice during her travels in Wonderland: “I know that I need to do something, but I don’t know where to start”. Knowing where to start and, equally importantly, the size of the problem as well as where an organisation needs to go are critical factors in ensuring that their data quality journey takes them where they need to be at the price they are prepared to pay.

When planning their “journey”, organisations need to address the issue of data quality holistically by considering each of the three DQ pillars in turn: firstly “People”, then “Ideas” and finally “Technology”. Many DQ initiatives have failed because the primary focus has been on delivering a technical solution. However, without the right framework in place and operated by the right people, this approach will never deliver the results that organisations need. Time and time again within the IT industry it has been proved that the pure application of technology will never solve business issues. Technology in itself will never win the “war”; it is always the right people with the right ideas who use the technology in the right way.

Continue reading "Alice in “Qualityland”" »

June 05, 2007

Mobilizing the Data Quality Army

Posted by Garry Moroney in: Data Quality > Best Practices ; Data Quality ; Data Quality > Governance / Stewardship ; Data Quality > Management

I’ve just been reading a US Department of Education briefing document on improving data quality in education performance data. The report stresses the impact that low quality data can have on measuring the success of education programs. It discusses, for example, the numerous data quality problems identified in the “No Child Left Behind” program established in 2001. The problems are typical – non-standardized data definitions, inconsistent data from different sources, data entry errors, lack of timeliness.

The briefing document outlines a broad set of data quality guidelines to be implemented right across the education system in the US – at State level, in Local Education Agencies (LEAs) and in schools themselves. The three foundation stones of the data quality framework outlined are:

• suitable technical infrastructure
• a comprehensive dictionary of data definitions
• staff ownership, organization and training

Continue reading "Mobilizing the Data Quality Army" »

May 21, 2007

IQ in Internet and e-Business Information

Posted by Larry English in: Data Quality > Best Practices ; Data Quality ; Data Quality > Monitoring

“In e-Business, the Information IS the Business”

Having just completed writing a chapter on “IQ in the Internet and e-Business Environments” in my forthcoming book, Information Quality Applied: Best Practices for Improving Business Information, Processes and Systems (John Wiley & Sons), I wanted to share a few excerpts from this chapter. This is one of ten chapters focused on applying sound quality principles to the unique quality issues in various information value “circles”, such as “Prospect to Satisfied Customer” and the “Order to Cash” supply chain.

There are three categories of information in the Internet environment to which quality principles must be applied:

• Web-Based Documents and Web Content
• Data “Shared” by Internal Processes and Internet Processes
• Information Collected or Created in e-Commerce and e-Business value chains, including third party business partners

The major problem with IQ in the Internet is that business is conducted in “cyberspace” with no person “minding the store” or monitoring the e-Business transactions.

Here I will address some problems and improvements in the first category.

Continue reading "IQ in Internet and e-Business Information" »

February 19, 2007

Data Quality Metadata; a lot more than just "data about data"

Posted by Chris McCauley in: Data Quality > Best Practices ; Data Quality ; Data Quality > Technology ; Data Quality > Vertical Solutions

On reflection, dipping into details of matching technologies in my last blog entry wasn't that much of a detour from the subject of metadata. It broached the idea that one technology was better than another because of its ability to better handle the context in which it was used. There are a number of themes that run through my work on data quality at Informatica and one of them is "metadata as context".

By way of explanation, let's pretend that you are the newly hired Head of Engineering in a software company bringing a product to market (maybe a new operating system). The quality assurance team has completed its testing and you've just been told that it found no "show-stopper" defects, some usability "gotchas" and a smattering of documentation problems. The pressure is on; Sales has been promising these features to customers for weeks and Marketing announced this stuff months ago: to ship or not to ship, that is the question.

The QA stats are on the surface very reassuring, but we need to know some more about how the team arrived at those numbers before we start pressing CDs. If you found out that the team had been added late in a very long development process and was unfamiliar with the product, would you still be sitting comfortably?

You can see how the words "context" and "metadata" could be used interchangeably when thinking about the scenario above. Harking back to the discussion about probabilistic matching systems, there's value in understanding the context in which you are operating. Metadata can be used to capture some aspects of context, hence "metadata as context".

Continue reading "Data Quality Metadata; a lot more than just "data about data"" »

February 02, 2007

Valuing Data Quality

Posted by Garry Moroney in: Data Quality > Benefits ; Data Quality > Best Practices ; Data Quality ; Data Quality > Management

Determining the aggregated return on investment for a data quality management initiative is notoriously difficult. Typically a minimum or partial ROI can be estimated by reference to the impact of low quality data on one or two key projects or processes. For example, in a CRM project, data quality ROI can be tied to reductions in customer contact failures and increased sales due to high quality segmentation. But given that the same set of master data will be used more than once in most organizations (e.g. customer master data will also be used in the billing system, the supply chain system and so on) and will add value (or destroy value!) in all of these processes, basing your ROI calculations on a single system or process will always underestimate the true returns.

For an organization trying to estimate the total returns across the enterprise from a data quality initiative, there are two difficult questions that must be addressed:

• How valuable is this dataset to the enterprise - assuming 100% data quality?
• How does its value decrease as quality erodes?

While these questions might at first seem unanswerable, it is worth noting that these are not unusual questions for a business to ask. In fact, businesses need to be able to answer questions of worth and depreciation for all their tangible assets - property, stock etc.

Unfortunately, data is one of those intangible assets where normal valuation approaches like recorded cost or replacement value are ineffective. But there are other intangible assets, such as IPR, work-in-progress, and customer and partner relationships (goodwill), where significant research has been done to develop effective valuation methodologies. It just might be possible to leverage these methodologies to value your data. For example, the value of customer data is directly related to the value of the customers themselves, and so "customer lifetime value" methodologies should be applicable in estimating the value of customer data and the extent to which this value varies with data quality.
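
As a rough illustration of how a "customer lifetime value" approach might be turned on the two questions above, here is a minimal Python sketch. It uses one common simplified CLV formula (annual margin × retention rate ÷ (1 + discount rate − retention rate)) and then makes a strong assumption of my own: that the value of the dataset shrinks in direct proportion to the fraction of customer records that are still usable. The function names, the sample customers, the 10% discount rate and the linear erosion model are all hypothetical and would need to be replaced by whatever valuation method your finance team actually accepts.

```python
def customer_lifetime_value(annual_margin, retention_rate, discount_rate):
    """Simplified infinite-horizon CLV: margin * r / (1 + d - r)."""
    return annual_margin * retention_rate / (1 + discount_rate - retention_rate)


def dataset_value(customers, usable_fraction=1.0, discount_rate=0.10):
    """Value of a customer dataset as the sum of customer lifetime values.

    usable_fraction models data quality erosion (assumption): 1.0 means every
    record is accurate enough to act on, 0.9 means one record in ten is not.
    """
    total = sum(
        customer_lifetime_value(c["annual_margin"], c["retention_rate"], discount_rate)
        for c in customers
    )
    return total * usable_fraction


# Hypothetical customers, invented for illustration only.
customers = [
    {"annual_margin": 200.0, "retention_rate": 0.8},
    {"annual_margin": 500.0, "retention_rate": 0.9},
]

if __name__ == "__main__":
    at_full_quality = dataset_value(customers)
    at_ninety_percent = dataset_value(customers, usable_fraction=0.9)
    print("value at 100% quality:", round(at_full_quality, 2))
    print("value lost at 90% quality:", round(at_full_quality - at_ninety_percent, 2))
```

In practice the relationship between quality and value is unlikely to be linear, which is exactly why the second question is the harder of the two.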

Have any of you out there attempted to put a real value on your company data in this way? Perhaps you'd be willing to share your experiences with us.

For more information on building a business case for data quality and calculating potential return on investment, see the Informatica white papers "Data Quality Profiling Calculating ROI for Data Migration and Data Integration Projects" and "The Data Quality Business Case—Projecting Return on Investment".

December 15, 2006

Data Quality and Data Integration; Bread and Butter or Chalk and Cheese?

Posted by Garry Moroney in: Data Quality > Best Practices ; Data Quality

There is no doubt that data quality and data integration are intrinsically symbiotic activities, but where does one end and the other begin? Is data quality just part of data integration? Are they separate but linked, or one and the same? Some industry observers are convinced that data quality is purely a component of data integration, and that it only makes sense to deploy data quality as an integral part of the data integration process.

As head of Similarity Systems, a company that provided pure data quality solutions prior to its acquisition by Informatica, I would have disagreed strongly with this point of view. For me data quality and data integration were linked but separate disciplines. Data quality technology is required in many different parts of an organization and is often controlled by different owners than those who require data integration.

Continue reading "Data Quality and Data Integration; Bread and Butter or Chalk and Cheese?" »