Understanding the Impact of Data Issues
Financial institutions need to be able to express the impact of data issues to fully understand how big an issue they really are. Let me explain.
What do we mean by ‘Impact’?
A recent engagement I had with a financial institution made me realise that many organisations have issues with their data but lack the ability to express these issues in terms that expose the real scale of the problem.
Without being able to express the impact of a problem how can an organisation know:
- Whether a perceived problem is real or not?
- If it is a real problem, how big is it?
- And if it is a big problem, is it big enough to need to spend time and money fixing it?
- Will fixing the problem cost more time and money than the problem itself costs?
To put this into a real context, a senior executive of a financial institution once said to his team:
- “If you tell me you have a data problem, and it doesn’t have a $ sign in front of it, then you don’t have a real problem”
I think what he was saying was:
- Can you define the problem?
- Can you quantify the size of the problem?
- Do you understand the real impact of the problem?
- Do you understand what’s required to fix the problem?
- Can you fix all of it and if not, how much can you fix?
As a result of this conversation, his team went away and analysed the problems they felt they had. When they returned, the number of problems had reduced and, for those remaining, they could explain the impact to the business. These remaining ones were now real problems because their impact was measurable, and the senior executive gave them much more attention.
Back to my recent engagement. I had asked the organisation to share data challenges from across their business and, for each challenge, express the impact it had on their business. Everybody was able to share their data challenges although very few were able to explain the impact upon their organisation. An example of a dialogue went something like this:
|Andy||What is your data challenge?|
|Customer||We have a critical data field missing in the source data for a key report|
|Andy||Which data field?|
|Andy||And what is the impact of not having this data?|
|Customer||Err – what do you mean?|
|Andy||Well, to start with, what do you need to do to provide the missing data?|
|Customer||We have to go and look at some other reports to work it out|
|Andy||Work it out?|
|Customer||We know it’s derived from a combination and interpretation of a number of other fields|
|Andy||Okay, you know how to work it out yourselves or is there a standard method for doing that?|
|Customer||We know how to work it out so we do it ourselves|
|Andy||Hmm, and then what do you do?|
|Customer||We update the original data and then use it in our reports. But we’re good corporate citizens as we also update the source systems with this newly derived data|
|Andy||So let me summarise this using a series of ‘impact statements’:
Did I understand that correctly?
Now this wasn’t the end of the conversation but what it did do was make the Customer realise that something as simple as some missing data has some very direct, and quantifiable, consequences and these can be expressed through the ‘impact statements’. What we did was explore the ways these impact statements manifest themselves and what the organisation had to do to remediate them.
Once we’d started to explore each of the impact statements in more detail, the customer began to realise that, by asking the right questions of other team members, they could start to get some visibility into the scale of the impact. A follow-on dialogue went something like this:
|Andy||For the first impact statement, how long does it take you to spot that the BusinessType data is missing?|
|Customer||Once a report fails to run because the data is missing, we get a member of staff to physically look at it and check to see if the data is really missing or whether the report stopped for some other reason|
|Andy||And how long does that process take and for how many people?|
|Customer||Err – I think it takes around 30 minutes for one person to do this|
|Andy||And how often does this happen?|
|Customer||Err – 3 or 4 times a day|
|Andy||So you’re spending between 1.5 and 2 hours per day for one person to do this?|
|Customer||Err – I guess so|
|Andy||Do you know the fully loaded rate for the staff members who do this?|
|Customer||Fully loaded rate?|
|Andy||The total cost of a member of staff including salary, any bonuses, any benefits, their portion of heating & lighting, the cost of paid annual leave etc.|
|Customer||I think I can go to HR and get that information|
|Andy||Great – I think we can calculate the impact of having to spot that the BusinessType data is missing as:
So if I can fix that problem, we’ll know exactly how big a problem it really is and whether it’s worth spending the time and money to fix it.
Did I understand that correctly?
|Customer||Err – I guess so…|
|Andy||Great – now let’s look at the impact statement around how long it takes you to find the data in the other reports and derive the values….|
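The arithmetic from the dialogue above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hourly rate is a hypothetical placeholder (the customer still had to get the real fully loaded rate from HR), and the function name is my own rather than anything the organisation used.

```python
def daily_detection_cost(minutes_per_check, checks_per_day, hourly_rate):
    """Daily staff cost of confirming that the BusinessType data is missing."""
    hours_per_day = minutes_per_check * checks_per_day / 60
    return hours_per_day * hourly_rate


# Hypothetical placeholder: the real fully loaded rate would come from HR.
HYPOTHETICAL_HOURLY_RATE = 80.0

# From the dialogue: around 30 minutes per check, 3 or 4 times a day.
low = daily_detection_cost(30, 3, HYPOTHETICAL_HOURLY_RATE)
high = daily_detection_cost(30, 4, HYPOTHETICAL_HOURLY_RATE)

print(f"Daily cost of spotting the missing data: {low:.2f} to {high:.2f}")
```

Multiplying the daily range by the number of working days in a year would give an annual figure that can be compared directly with the cost of a fix.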
Where impact statements get really interesting is when the consequences of the impact mean a financial provision needs to be made to offset the risk associated with the impact. A follow-on dialogue went something like this:
|Andy||For the impact statement regarding the risk associated with your deriving of the new value and hoping you’ve gotten it right – what are the consequences of getting it wrong?|
|Customer||Well, for this report we allow an 8% variance on the output numbers rather than the 4% we’d have normally|
|Andy||8% variance – what does that mean?|
|Customer||To cope with more variance we set aside more capital to cover the situation|
|Andy||How much more capital?|
|Customer||Twice as much as normal|
|Andy||And how much do you set aside normally?|
|Andy||So you’re setting aside an additional $25,000 to cover the fact that the data you derived might be wrong? So if I could fix that problem, you could avoid setting aside an additional $25,000 and use the money for something else?
Did I understand that correctly?
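As a minimal sketch, the capital calculation in this dialogue can be written out as below. The normal set-aside is not stated in the text above, so the figure used here is an inference: holding "twice as much as normal" and setting aside an additional $25,000 only fit together if the normal amount is $25,000. The function name is hypothetical.

```python
def additional_capital(normal_set_aside, multiplier):
    """Extra capital held because of the wider variance allowance."""
    return normal_set_aside * (multiplier - 1)


# Inferred, not stated in the dialogue: "twice as much as normal" together
# with an additional $25,000 implies a normal set-aside of $25,000.
INFERRED_NORMAL_SET_ASIDE = 25_000

extra = additional_capital(INFERRED_NORMAL_SET_ASIDE, multiplier=2)
print(f"Additional capital tied up by the data issue: ${extra:,}")
```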
In this example we’ve made a link between a specific data issue and an amount of money to cover the risk it generates. Often it’s not so easy to draw these links, as most financial institutions have very complex environments, which makes tracing a direct cause and effect difficult.
This is where we now need to start creating clear linkage between:
- How the data got created?
- What happened to that data during its lifecycle?
- How the data got consumed?
- The consequences of any issues related to any part of this flow
This linkage is important as often the cause of an issue may be very far removed from the consequences of it. Modelling the data flow across its lifecycle is a useful technique to expose what’s really going on and to generate visibility into causes and effects.
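One simple way to picture this kind of modelling is as a directed graph, where each node is a place the data lives and each edge is a flow into the next consumer. The sketch below is purely illustrative: the node names are hypothetical, loosely based on the earlier example, and a real lineage model would carry far more detail.

```python
from collections import deque

# Hypothetical lineage: each key feeds the nodes listed against it.
LINEAGE = {
    "source_system.BusinessType": ["staging.customer"],
    "staging.customer": ["report.key_report"],
    "report.key_report": ["provision.capital_set_aside"],
    "provision.capital_set_aside": [],
}


def downstream(node, lineage):
    """Return every node reachable from `node`, in breadth-first order."""
    seen = {node}
    order = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for nxt in lineage.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


affected = downstream("source_system.BusinessType", LINEAGE)
print(affected)
```

Walking the graph from the point where an issue originates lists everything it can affect, which is how a cause far upstream gets connected to a consequence far downstream.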
Whilst not necessarily a simple or easy piece of work, this approach has the benefit of creating visibility and transparency around causes and effects, meaning it’s apparent to everybody what’s happening and what impact is being caused.
Organisations that have embarked on enterprise Data Governance programmes have a potential head start in this area as they may well have already modelled this flow and thus reduced the work effort considerably. For organisations not yet that advanced around Data Governance, this work creates some useful assets to help kick-start any governance initiative. Either way, there is some work effort in modelling the flow but the visibility this creates is always very valuable.
Types of Impact
You’ll have noticed that I’m only really scratching the surface of impact statements and that some are easier to quantify than others:
- Impact statements that relate to simple efficiency improvements are often a percentage increase/decrease which is relatively simple to quantify
  - Example: doing something 20% faster or requiring 30% less time
- Impact statements that relate to changing a process are often measured as a change in process time driving an increase or decrease in staff and/or cycle time
  - Example: automating a data quality process reduces the cycle time of a process by a specific amount, meaning more processes can be executed in the same overall time or the same number of process executions can be achieved with fewer resources
- Impact statements that relate to revenue or margin measures often express an issue in terms of whether it drives revenue upwards or downwards
  - Example: household composition data is missing, which is causing a lack of net new revenue because cross-selling opportunities are not being identified
- Impact statements that relate to risk-oriented measures often express an issue in terms of whether it increases or decreases a specific risk and the impact of the increase or decrease
  - Example: incorrect data in a regulatory report increases the risk of a fine from the regulator, which means provision needs to be made for the fine amount as the likelihood of it occurring has gone up
So an ‘Impact’ is the ‘Consequence’ of something?
In broad terms, ‘impact’ does mean the consequence of something. It’s drawing a link between an identified issue and what is required to overcome that issue, expressed in quantifiable, numeric terms. From the examples, we can see that impact statements come in all shapes and sizes. Thinking about this in a joined-up way and looking at how the data flows means we now have a useful technique to quantify the size of a data problem whilst creating a clear understanding of what is needed to overcome the issue.
Armed with this information, organisations can now begin to quantify the size of the data challenges they face and use this insight to help inform others whether there is a need to invest in any remediation. It takes away the guesswork about whether a problem is real or not and, if it is real, how big a problem it really is. This provides a useful context to discuss the value of remediation techniques as compared to the size of the original problem.
This is another example of why Financial Services organisations need to consider data as a strategic asset and begin to describe data, and data problems, in terms everybody can clearly understand. This approach creates visibility and transparency around data and its associated value. These attributes will become ever more important as Financial Services organisations begin to use their data assets for more defensive and offensive approaches to growing their businesses, whilst staying compliant.