Understanding the Impact of Data Issues

Financial institutions need to be able to express the impact of data issues to fully understand how big an issue they really are. Let me explain.

What do we mean by ‘Impact’?

A recent engagement I had with a financial institution made me realise that many organisations have issues with their data but lack the ability to express these issues in terms that expose the real scale of the problem.

Without being able to express the impact of a problem how can an organisation know:

  1. Whether a perceived problem is real or not?
  2. If it is a real problem, how big is it?
  3. And if it is a big problem, is it big enough to need to spend time and money fixing it?
  4. Will fixing the problem cost more time and money than the problem itself?

To put this into a real context, a senior executive of a financial institution once said to his team:

  • “If you tell me you have a data problem, and it doesn’t have a $ sign in front of it, then you don’t have a real problem”

I think what he was saying was:

  • Can you define the problem?
  • Can you quantify the size of the problem?
  • Do you understand the real impact of the problem?
  • Do you understand what’s required to fix the problem?
  • Can you fix all of it and if not, how much can you fix?

As a result of this conversation, his team went away and did some analysis on the problems they felt they had. When they returned, the number of problems had reduced and, of the ones remaining, they could explain the impact to the business. These remaining ones were now real problems as their impact was measurable. The senior executive gave these problems much more attention.

Back to my recent engagement. I had asked the organisation to share data challenges from across their business and, for each challenge, express the impact it had on their business. Everybody was able to share their data challenges although very few were able to explain the impact upon their organisation. An example of a dialogue went something like this:


Andy: What is your data challenge?
Customer: We have a critical data field missing in the source data for a key report
Andy: Which data field?
Customer: BusinessType indicator
Andy: And what is the impact of not having this data?
Customer: Err – what do you mean?
Andy: Well, to start with, what does the missing data mean you need to do to provide it?
Customer: We have to go and look at some other reports to work it out
Andy: Work it out?
Customer: We know it’s derived from a combination and interpretation of a number of other fields
Andy: Okay, you know how to work it out yourselves or is there a standard method for doing that?
Customer: We know how to work it out so we do it ourselves
Andy: Hmm, and then what do you do?
Customer: We update the original data and then use it in our reports. But we’re good corporate citizens as we also update the source systems with this newly derived data
Andy: So let me summarise this using a series of ‘impact statements’:

  • it takes you time to spot the data is missing
  • it takes you time to find the data in the other reports and derive the values
  • you’ve created a risk by deriving the new value yourselves and hoping you’ve gotten it right
  • you’ve probably caused somebody else a whole load of work to fix the data governance problem you just created by deriving the value yourself
  • if this value is used for any kind of regulatory reporting, you’ve created a risk that your compliance reports might not be correct
  • you’ve broken the lineage of your data governance processes, causing somebody else a whole load of work to recreate it manually
  • and you’ve bypassed any source system business rules to derive this new data, so if it’s incorrect, others will be using it without knowing it might not be right, causing lots of manual rework

Did I understand that correctly?

Customer: Err….


Now this wasn’t the end of the conversation but what it did do was make the Customer realise that something as simple as some missing data has some very direct, and quantifiable, consequences and these can be expressed through the ‘impact statements’. What we did was explore the ways these impact statements manifest themselves and what the organisation had to do to remediate them.

Exploring ‘Impact’

Once we’d started to explore each of the impact statements in more detail, the customer began to realise that by asking the right questions of other team members they could start to get some visibility into the scale of the impact. A follow-on dialogue went something like this:


Andy: For the first impact statement, how long does it take you to spot that the BusinessType data is missing?
Customer: Once a report fails to run because the data is missing, we get a member of staff to physically look at it and check to see if the data is really missing or whether the report stopped for some other reason
Andy: And how long does that process take and for how many people?
Customer: Err – I think it takes around 30 minutes for one person to do this
Andy: And how often does this happen?
Customer: Err – 3 or 4 times a day
Andy: So you’re spending between 1.5 and 2 hours per day for one person to do this?
Customer: Err – I guess so
Andy: Do you know the fully loaded rate for the staff members who do this?
Customer: Fully loaded rate?
Andy: The total cost of a member of staff including salary, any bonuses, any benefits, their portion of heating & lighting, the cost of paid annual leave etc.
Customer: I think I can go to HR and get that information
Andy: Great – I think we can calculate the impact of having to spot that the BusinessType data is missing as:

  • (amount of time taken to spot the issue) * (number of times the issue occurs) * (number of people required) * (fully loaded rate for a staff member)

So if I can fix that problem, we’ll know exactly how big a problem it really is and whether it’s worth spending the time and money to fix it.

Did I understand that correctly?

Customer: Err – I guess so…
Andy: Great – now let’s look at the impact statement around how long it takes you to find the data in the other reports and derive the values….
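
As a rough sketch, the calculation from this dialogue could be written out in Python like this. The 30 minutes, the 3 or 4 occurrences a day and the one person all come from the conversation. The hourly rate of $80 is purely a placeholder I've assumed for illustration (the real fully loaded rate would come from HR), and the function name is my own.

```python
def daily_detection_cost(minutes_per_check, checks_per_day, people, hourly_rate):
    """Daily cost of spotting the missing data, following the formula above."""
    hours_per_check = minutes_per_check / 60
    return hours_per_check * checks_per_day * people * hourly_rate


# ASSUMPTION: placeholder fully loaded rate in dollars per hour, not a real figure.
HOURLY_RATE = 80.0

# Figures from the dialogue: 30 minutes, 3 to 4 times a day, one person.
low = daily_detection_cost(30, 3, 1, HOURLY_RATE)
high = daily_detection_cost(30, 4, 1, HOURLY_RATE)

print(f"Daily cost: between ${low:,.2f} and ${high:,.2f}")
```

With that placeholder rate, 1.5 to 2 hours a day works out at between $120 and $160 a day. Swap in the real fully loaded rate and the impact statement has its $ sign.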

Developing ‘Impact’

Where impact statements get really interesting is when the consequences of the impact mean a financial provision needs to be made to offset the risk associated with the impact. A follow-on dialogue went something like this:


Andy: For the impact statement regarding the risk associated with your deriving of the new value and hoping you’ve gotten it right – what are the consequences of getting it wrong?
Customer: Well, for this report we allow an 8% variance on the output numbers rather than the 4% we’d have normally
Andy: 8% variance – what does that mean?
Customer: To cope with more variance we set aside more capital to cover the situation
Andy: How much more capital?
Customer: Twice as much as normal
Andy: And how much do you set aside normally?
Customer: $25,000
Andy: So you’re setting aside an additional $25,000 to cover the fact that the data you derived might be wrong? So if I could fix that problem, you could avoid setting aside an additional $25,000 and use the money for something else?

Did I understand that correctly?

Customer: YES!

In this example we’ve made a link between a specific data issue and an amount of money to cover the risk it generates. Often it’s not so easy to draw these links, as most financial institutions have very complex environments, which makes drawing a direct line between cause and effect difficult.
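
The arithmetic behind that conversation is simple enough to sketch in a few lines of Python. The $25,000 and the "twice as much as normal" come straight from the dialogue; the variable names are just my own labels.

```python
# Capital normally set aside for this report (from the dialogue).
NORMAL_PROVISION = 25_000

# "Twice as much as normal" is set aside when the derived data is in play.
PROVISION_MULTIPLIER = 2

provision_with_issue = NORMAL_PROVISION * PROVISION_MULTIPLIER
additional_capital = provision_with_issue - NORMAL_PROVISION

print(f"Capital set aside with the data issue: ${provision_with_issue:,}")
print(f"Additional capital tied up by the issue: ${additional_capital:,}")
```

The second figure is the one that matters: it is the money that fixing the data issue would free up.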

Linking ‘Impact’

This is where we now need to start creating clear linkage between:

  • How the data got created
  • What happened to that data during its lifecycle
  • How the data got consumed
  • The consequences of any issues related to any part of this flow

This linkage is important as often the cause of an issue may be very far removed from the consequences of it. Modelling the data flow across its lifecycle is a useful technique to expose what’s really going on and to generate visibility into causes and effects.

Whilst not necessarily a simple or easy piece of work, this approach has the benefit of creating visibility and transparency around causes and effects, meaning it’s apparent to everybody what’s happening and what impact is being caused.

Organisations that have embarked on enterprise Data Governance programmes have a potential head start in this area as they may well have already modelled this flow and thus reduced the work effort considerably. For organisations not yet that advanced around Data Governance, this work creates some useful assets to help kick-start any governance initiative. Either way, there is some work effort in modelling the flow but the visibility this creates is always very valuable.
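
To give a feel for what a modelled flow makes possible, here is a minimal Python sketch. It assumes the flow has been captured as a simple map of "what feeds what"; the node names are hypothetical, loosely based on the BusinessType example, and a real model would be far richer. Given a point where an issue is introduced, it walks the flow to list everything downstream that could feel the consequences.

```python
from collections import deque

# HYPOTHETICAL data flow: each key feeds the items listed against it.
DATA_FLOW = {
    "manual derivation of BusinessType": ["source system"],
    "source system": ["key report", "regulatory report"],
    "key report": ["capital provision"],
    "regulatory report": [],
    "capital provision": [],
}


def downstream(flow, start):
    """Return everything reachable from `start`, in breadth-first order."""
    seen = []
    queue = deque(flow.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:  # skip anything already visited
            seen.append(node)
            queue.extend(flow.get(node, []))
    return seen


affected = downstream(DATA_FLOW, "manual derivation of BusinessType")
for node in affected:
    print(node)
```

Even in a toy flow like this one, the cause (a manual derivation) sits several steps away from the consequence (a capital provision), which is exactly why the linkage is worth making explicit.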

Types of Impact

You’ll have noticed that I’m only really scratching the surface of impact statements and that some are easier to quantify than others:

  • Impact statements that relate to simple efficiency improvements are often a percentage increase/decrease which is relatively simple to quantify
    • Example: doing something 20% faster or requiring 30% less time
  • Impact statements that relate to changing a process are often measured as a change in process time driving an increase or decrease in staff and/or cycle time
    • Example: automating a data quality process reduces the cycle time of a process by a specific amount, meaning more processes can be executed in the same overall time or the same number of process executions can be completed with fewer resources
  • Impact statements that relate to revenue or margin measures are often expressing an issue in terms that drive revenue upwards or downwards
    • Example: Household composition data is missing which is causing a lack of net new revenue due to lack of cross-selling opportunity identification
  • Impact statements that relate to risk oriented measures are often expressing an issue in terms of whether it increases or decreases a specific risk and the impact of the increase or decrease
    • Example: incorrect data in a regulatory report increases the risk of a fine from the regulator, which means provision needs to be made for the fine amount as the likelihood of it occurring has gone up
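
One way to keep track of these different types is to record each impact statement alongside its type and the measure used to quantify it. The Python sketch below is only an illustration: the class and field names are my own, and the entries are paraphrased from the examples in this article rather than taken from any real register.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ImpactType(Enum):
    EFFICIENCY = "efficiency"
    PROCESS = "process change"
    REVENUE = "revenue or margin"
    RISK = "risk"


@dataclass
class ImpactStatement:
    issue: str
    impact_type: ImpactType
    measure: str  # how the impact is quantified


# ILLUSTRATIVE entries only, paraphrased from the examples above.
statements = [
    ImpactStatement("Manual check for missing BusinessType data",
                    ImpactType.EFFICIENCY, "staff hours per day"),
    ImpactStatement("Manual data quality process",
                    ImpactType.PROCESS, "cycle time per execution"),
    ImpactStatement("Missing household composition data",
                    ImpactType.REVENUE, "net new revenue not captured"),
    ImpactStatement("Incorrect data in a regulatory report",
                    ImpactType.RISK, "provision for a possible fine"),
    ImpactStatement("Derived BusinessType value may be wrong",
                    ImpactType.RISK, "additional capital set aside"),
]


def group_by_type(items):
    """Group impact statements by type so like is compared with like."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.impact_type].append(item)
    return dict(grouped)


grouped = group_by_type(statements)
for impact_type, items in grouped.items():
    print(impact_type.value)
    for item in items:
        print(f"  {item.issue}: {item.measure}")
```

Grouping rather than summing is deliberate here, as a cost per day and a one-off capital provision are not the same kind of number and shouldn't be added together without care.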

So an ‘Impact’ is the ‘Consequence’ of something?

In broad terms, ‘impact’ does mean the consequence of something. It’s drawing a link between an identified issue and what is required to overcome that issue, expressed in quantifiable, numeric terms. From the examples, we can see that impact statements come in all shapes and sizes. Thinking about this in a joined-up way and looking at how the data flows means we now have a useful technique to quantify the size of a data problem whilst creating a clear understanding of what is needed to overcome the issue.

Armed with this information, organisations can now begin to quantify the size of the data challenges they face and use this insight to help inform others whether there is a need to invest in any remediation. It takes away the guesswork about whether a problem is real or not and, if it is real, how big a problem it really is. This provides a useful context to discuss the value of remediation techniques compared with the size of the original problem.
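
One simple way to frame that comparison is to ask how long a fix would take to pay for itself. The sketch below is my own illustration in Python: the function name and both figures are hypothetical, and a real business case would consider more than this single ratio.

```python
def payback_years(remediation_cost, annual_impact):
    """Years for a fix to pay for itself, given the yearly cost of the problem."""
    if annual_impact <= 0:
        raise ValueError("annual_impact must be positive")
    return remediation_cost / annual_impact


# HYPOTHETICAL figures, for illustration only.
REMEDIATION_COST = 60_000  # one-off cost of fixing the data issue
ANNUAL_IMPACT = 40_000     # quantified yearly impact of leaving it unfixed

years = payback_years(REMEDIATION_COST, ANNUAL_IMPACT)
print(f"The fix pays for itself in {years:.1f} years")
```

If the answer comes back in decades rather than months, that is a strong hint the remediation is bigger than the problem.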

This is another example of why Financial Services organisations need to consider data as a strategic asset and begin to describe data, and data problems, in terms everybody can clearly understand. This approach creates visibility and transparency around data and its associated value. These attributes will become ever more important as Financial Services organisations begin to use their data assets for more defensive and offensive approaches to growing their businesses, whilst staying compliant.