Q: What was the driver for this project?
A: The project grew out of a procure-to-pay (P2P) initiative. We engaged a consulting firm to help centralize Accounts Payable operations. One required deliverable was an executive P2P dashboard. This dashboard would provide enterprise insights by relying on the enterprise data warehousing and business intelligence platform.
Q: What did the dashboard illustrate?
A: The dashboard integrated data from many sources to provide a single view of information about all of our suppliers. By visualizing this information in one place, we were able to rapidly gain operational insights. There are approximately 30,000 suppliers in the supplier master who manufacture or distribute (or both) more than 150,000 unique products.
Q: From which sources is Informatica consuming data to power the P2P dashboard?
A: There are 8 sources of data:
3 ERP Systems:
- HBOC STAR
5 Enrichment Sources:
- Dun & Bradstreet – for associating suppliers together from disparate sources.
- GDSN – Global Data Pool for helping to cleanse healthcare products.
- McKesson Pharmacy Spend – spend file from our third-party pharmaceutical distributor. Helps capture the detailed pharmacy spend we procure through this third party.
- Office Depot Spend – spend file from third-party office supply distributor. Helps capture detailed office supply spend.
- MedAssets – third party group purchasing organization (GPO) who provides detailed contract pricing.
Q: Did you tackle clinical scenarios first?
A: No. While we certainly have many clinical scenarios we want to explore, like cost per procedure per patient, we knew that we should establish a few quick, operational wins to gain traction and credibility.
Q: Great idea – capturing quick wins is certainly the way we are seeing customers have the most success in these transformative projects. Where did you start?
A: We started with supply chain cost containment; increasing pressures on healthcare organizations to reduce cost made this low hanging fruit the right place to start. There may be as much as 20% waste to be eliminated through strategic and actionable analytics.
Q: What did you discover?
A: Through the P2P dashboard, insights were gained into days to pay on invoices as well as early payment discounts and late payment penalties. With the visualization we quickly saw that we were paying a large amount of late fees. With this awareness, we dug into why the late fees were so high. What we discovered is that, with one large supplier, the original payment terms were net 30, but in later negotiations the terms were changed to 20 days, so late fees were accruing after 20 days. Through this complete view we were able to rapidly home in on the issue and change operations — avoiding costly late fees.
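The days-to-pay check behind that discovery is simple to express in code. Below is a minimal, hypothetical sketch (supplier names, dates, and terms are invented for illustration) of flagging invoices that were paid after their negotiated terms expired:

```python
from datetime import date

# Hypothetical invoice records: (supplier, invoice_date, paid_date, terms_days)
invoices = [
    ("Acme Medical", date(2024, 1, 2), date(2024, 1, 30), 20),  # terms renegotiated to net 20
    ("Acme Medical", date(2024, 2, 1), date(2024, 2, 18), 20),
    ("Other Supply", date(2024, 1, 5), date(2024, 2, 1), 30),
]

def late_invoices(records):
    """Flag invoices paid after their negotiated payment terms expired."""
    flagged = []
    for supplier, invoiced, paid, terms in records:
        days_to_pay = (paid - invoiced).days
        if days_to_pay > terms:
            flagged.append((supplier, days_to_pay, terms))
    return flagged

print(late_invoices(invoices))
```

Here the first Acme invoice took 28 days against net-20 terms, so it surfaces as a late-fee risk; a dashboard does the same comparison across every supplier at once.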
Q: That’s a great example of straightforward analytics powered by an integrated view of data, thank you. What’s a more complex use case you plan to tackle?
A: Now that we have the systems in place along with data stewardship, we will start to focus on clinical supply chain scenarios like cost per procedure per patient. We have all of the data in one data warehouse to answer questions like – which procedures are costing the most, do procedure costs vary by clinician? By location? By supply? – and what is the outcome of each of these procedures? We always want to take the right and best action for the patient.
We were also able to identify where negotiated payment discounts were not being taken advantage of or where there were opportunities to negotiate discounts.
These insights were revealed through the dashboard and immediate value was realized the first day.
Fueling knowledge with data is helping procurement negotiate the right discounts, i.e., they can seek discounts on the most-used supplies rather than on supplies that are rarely used. Think of it this way: you don’t want a discount on OJ if you are buying milk.
Q: Excellent example and metaphor. Let’s talk more about stewardship. You have a data governance organization within IT that is governing supply chain?
A: No, we have a data governance team within supply chain. Supply chain staff who used to be called “content managers” are now “data stewards.” They were doing the stewardship work of defining data, its use, its source, and its quality before, but it wasn’t a formally recognized part of their jobs; now it is. Armed with Informatica Data Director, they are managing the quality of supply chain data across four domains: suppliers/vendors, locations, contracts, and items. Data from each of these domains resides in our EMR, our ERP applications, and in our ambulatory EMR/Practice Management application, creating redundancy and manual reconciliation effort.
By adding Master Data Management (MDM) to the architecture, we were able to centralize management of master data about suppliers/vendors, items, contracts and locations, augment this data with enrichment data like that from D&B, reduce redundancy and reduce manual effort.
MDM shares this complete and accurate information with the enterprise data warehouse and we can use it to run analytics against. Having a confident, complete view of master data allows us to trust analytical insights revealed through the P2P dashboard.
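To illustrate the MDM idea at toy scale, here is a hypothetical sketch of survivorship: merging duplicate supplier records from two ERPs, matched on a shared D&B DUNS number, into one golden record. The field names and values are invented, and real MDM tools apply far richer match and survivorship rules; this just shows the shape of the technique:

```python
# Hypothetical duplicate supplier records from two ERPs, keyed by a shared D&B DUNS number
records = [
    {"source": "ERP_A", "duns": "123456789", "name": "ACME MED SUPPLY", "phone": None},
    {"source": "ERP_B", "duns": "123456789", "name": "Acme Medical Supply, Inc.", "phone": "555-0100"},
]

def golden_record(dups):
    """Merge duplicates into one master record: take the first non-null value per field."""
    merged = {}
    for field in ("duns", "name", "phone"):
        merged[field] = next((r[field] for r in dups if r.get(field)), None)
    return merged

master = golden_record(records)
```

The merged record fills the missing phone number from the second source, so downstream analytics see one complete supplier instead of two partial ones.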
Q: What lessons learned would you offer?
A: Having recognized operational value, I’d encourage health systems to focus on data driven supply chain because there are savings opportunities through easier identification of unmanaged spend.
I really enjoyed learning more about this project with valuable, tangible and nearly immediate results. I will keep you posted as the customer moves on to the next phase. If you have comments or questions, leave them here.
1. You already have data stewards.
Commonly, health systems think they can’t staff data governance as UPMC has because of a lack of funding. In reality, people are already doing data governance everywhere across your organization! You don’t have to secure headcount; you locate these people within the business, formalize data governance as part of their jobs, and provide them tools to improve and manage their efforts.
2. Multiple types of data stewards ensure all governance needs are being met.
Three types of data stewards were identified and tasked across the enterprise:
I. Data Steward. Create and maintain data/business definitions. Assist with defining data and mappings along with rule definition and data integrity improvement.
II. Application Steward. One steward is named per application sourcing enterprise analytics. Populate and maintain inventory, assist with data definition and prioritize data integrity issues.
III. Analytics Steward. Named for each team providing analytics. Populate and maintain inventory, reduce duplication and define rules and self-service guidelines.
3. Establish IT as an enabler.
IT, instead of taking action on data governance or being the data governor, has become an enabler of data governance by investing in and administering tools that support metadata definition and master data management.
4. Form a governance council.
UPMC formed a governance council of 29 executives—yes, that’s a big number, but UPMC is a big organization. The council is clinically led. It is co-chaired by two CMIOs and includes Marketing, Strategic Planning, Finance, Human Resources, the Health Plan, and Research. The council signs off on and prioritizes policies. Decision-making authority has to come from somewhere.
5. Avoid slowing progress with process.
In these still-early days, only 15 minutes of monthly council meetings are spent on policy and guidelines; discussion and direction take priority. For example, a recent agenda item was “Length of Stay.” The council agreed a single owner would coordinate across Finance, Quality and Care Management to define and document an enterprise definition for “Length of Stay.”
6. Use examples.
Struggling to get buy-in from the business about the importance of data governance? An example everyone can relate to is “Test Patient.” For years, in her business intelligence role, Terri worked with “Test Patient.” Investigation revealed that these fake patients end up in places they should not. There was no standard for creation or removal of test patients, which meant that test patients and their costs, outcomes, etc., were included in analysis and reporting that drove decisions inside and external to UPMC. The governance program created a policy for testing in production should the need arise.
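As a toy illustration of why a test-patient policy matters, the sketch below (entirely hypothetical data, and a naive name-based marker convention I am assuming for illustration) shows how unfiltered test patients quietly distort cost analysis:

```python
# Hypothetical patient rows; a naming convention is one common way test patients leak in
patients = [
    {"id": 1, "name": "ZZTEST, PATIENT", "cost": 500.0},
    {"id": 2, "name": "Smith, Jane", "cost": 1200.0},
    {"id": 3, "name": "TEST, TEST", "cost": 300.0},
]

def exclude_test_patients(rows, markers=("TEST",)):
    """Drop rows whose name contains a known test-patient marker before analysis."""
    return [r for r in rows if not any(m in r["name"].upper() for m in markers)]

real = exclude_test_patients(patients)
total_cost = sum(r["cost"] for r in real)
```

Without the filter, reported costs include 800.0 of fake spend; with it, only the real patient remains. A governance policy makes the marker convention itself standard, so the filter can be trusted.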
7. Make governance personal through marketing.
Terri holds monthly round tables with business and clinical constituents. These have been a game changer: Once a month, for two hours, ten business invitees meet and talk about the program. Each attendee shares a data challenge, and Terri educates them on the program and illustrates how the program will address each challenge.
8. Deliver self-service.
Providing self-service empowers your users to gain access and control to the data they need to improve their processes. The only way to deliver self-service business intelligence is to make metadata, master data, and data quality transparent and accessible across the enterprise.
9. IT can’t do it alone.
Initially, IT was resistant to giving up control, but now the team understands that it doesn’t have the knowledge or the time to effectively do data governance alone.
10. Don’t quit!
Governance can be complicated, and it may seem like little progress is being made. Terri keeps spirits high by reminding folks that the only failure is quitting.
Getting started? Assess the data governance maturity of your organization here: http://governyourdata.com/
The title of this article may seem counterintuitive, but the reality is that the business doesn’t care about data. They care about their business processes and outcomes that generate real value for the organization. All IT professionals know there is huge value in quality data and in having it integrated and consistent across the enterprise. The challenge is how to prove the business value of data if the business doesn’t care about it.
Is this how you think about IT? Or do you think of IT in terms of the technology it deploys instead? I recently interviewed a CIO at a Fortune 50 company about the changing role of CIOs. When I asked him which technology issues were most important, this CIO said something that surprised me.
He said, “IT is all about data. Think about it. What we do in IT is all about the intake of data, the processing of data, the storing of data, and the analyzing of data. And we need, from data, to increasingly provide the intelligence to make better decisions.”
How many view the function of the IT organization with such clarity?
This was the question I had after hearing this CIO. How many IT organizations view IT as, at its core, a data system? Not very many, it seems. Jeanne Ross of MIT CISR contends in her book that company data, “one of its most important assets, is patchy, error-prone, and not up-to-date” (Enterprise Architecture as Strategy, Jeanne Ross, page 7). She also contends that companies with a data-centric view “have higher profitability, experience faster time to market, and get more value from their IT investments” (Enterprise Architecture as Strategy, Jeanne Ross, page 2).
What then do you need to do to get your data house in order?
What then should IT organizations do to move their data from something that is “patchy, error-prone, and not up-to-date” to something that is trustworthy and timely? I would contend our CIO friend had it right. We need to manage all four elements of our data process better.
1. Input Data Correctly
You need to start by making sure that the data you produce is captured consistently and correctly. I liken the need here to a problem I had with my electronic bill pay a few years ago. When my bank changed bill-payment service providers, it started sending my payments to a set of out-of-date payee addresses. This caused me to incur late fees and my credit score to go down. The same kind of thing can happen to a business when there are duplicate customers or customer addresses are entered incorrectly. So much of marketing today is about increasing customer intimacy. It is hard to improve customer intimacy when you bug the same customer too often, or never connect with a customer because you had a bad address for them.
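A tiny, hypothetical sketch of the duplicate-entry problem: two differently formatted entries for the same customer address look distinct until the strings are normalized. Real matching engines use fuzzy logic and reference data, but the principle is the same:

```python
def normalize(addr):
    """Canonicalize an address string so trivially different entries compare equal."""
    return " ".join(addr.lower().replace(".", "").replace(",", "").split())

# Hypothetical customer rows: (customer_id, address as entered)
customers = [
    ("C001", "42 Oak St., Pittsburgh, PA"),
    ("C002", "42 oak st pittsburgh pa"),   # same customer entered twice
    ("C003", "7 Elm Ave, Erie, PA"),
]

def find_duplicates(rows):
    """Report pairs of customer IDs whose addresses normalize to the same key."""
    seen, dups = {}, []
    for cust_id, addr in rows:
        key = normalize(addr)
        if key in dups or key in seen:
            if key in seen:
                dups.append((seen[key], cust_id))
        else:
            seen[key] = cust_id
    return dups

print(find_duplicates(customers))
```

The two Oak St. entries collapse to one key, revealing that C001 and C002 are the same customer entered twice.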
2. Process Data to Produce Meaningful Results
You need to collect and manipulate data to derive meaningful information. This is largely about processing data so that the results support meaningful analysis. To do this well, you need to remove the data quality issues from the data that is produced. We want, in this step, to make data “trustworthy” to business users.
With this, data can be consolidated into a single view of customer, financial account, etc. A CFO explained the importance of this step by saying the following:
“We often have redundancies in each system, and within the chart of accounts the names and numbers can differ from system to system. And as you establish a bigger and bigger set of systems, you need, in accounting parlance, to roll up the chart of accounts.”
Once data is consistently put together, then you need to consolidate it so that it can be used by business users. This means that aggregates need to be created for business analysis. These should support dimensional analysis so that business users can truly answer why something happened. For finance organizations, timely aggregated data with supporting dimensional analysis enables them to establish themselves as “a business person versus a bean counting historically oriented CPA”. Having this data answers questions like the following:
- Why are sales not being achieved? Which regions or products are falling short?
- Or why is the projected income statement not in conformance with plan? Which expense categories should we cut to ensure the income statement is in line with business expectations?
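Questions like these are answered by rolling aggregates up along dimensions. A minimal sketch with invented sales data (regions, products, actual vs. plan) shows the idea; a real warehouse would do this in SQL or an OLAP engine:

```python
from collections import defaultdict

# Hypothetical sales fact rows: (region, product, actual, plan)
sales = [
    ("East", "Gloves", 90, 100),
    ("East", "Syringes", 120, 110),
    ("West", "Gloves", 60, 100),
    ("West", "Syringes", 95, 100),
]

def rollup(rows, dim):
    """Aggregate actual vs. plan along one dimension (0 = region, 1 = product)."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        totals[row[dim]][0] += row[2]  # actual
        totals[row[dim]][1] += row[3]  # plan
    return {k: {"actual": a, "plan": p, "variance": a - p} for k, (a, p) in totals.items()}

by_region = rollup(sales, 0)   # reveals which region is behind plan
by_product = rollup(sales, 1)  # reveals which product is the shortfall
```

Rolling up by region shows West is behind plan; rolling up by product shows Gloves are the shortfall. Dimensional analysis is exactly this kind of pivot, repeated across every dimension the business cares about.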
3. Store Data Where it is Most Appropriate
Data can be stored today in many ways: in applications, a data warehouse, or even a Hadoop cluster. You need an overriding data architecture that considers the entire lifecycle of data. A key element of doing this well involves archiving data as it becomes inactive and protecting data across its entire lifecycle. The former can also involve disposing of information. The latter requires the ability to audit, block, and dynamically mask sensitive production data to prevent unauthorized access.
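Dynamic masking can be sketched very simply. The example below is hypothetical and not a substitute for a real data-security product; it just shows the shape of the idea: unauthorized readers see masked values, authorized readers see the raw record:

```python
def mask(value, visible=4):
    """Mask all but the last few characters of a sensitive field."""
    s = str(value)
    return "*" * max(len(s) - visible, 0) + s[-visible:]

def read_record(record, authorized, sensitive=("ssn", "account")):
    """Return a copy of the record, masking sensitive fields for unauthorized readers."""
    if authorized:
        return dict(record)
    return {k: (mask(v) if k in sensitive else v) for k, v in record.items()}

# Hypothetical production row
row = {"name": "Jane Smith", "ssn": "123-45-6789", "account": "9988776655"}
print(read_record(row, authorized=False))
```

The key design point is that masking happens at read time against production data, so the underlying store is never altered and authorized processes keep working unchanged.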
4. Enable analysis including the discovery, testing, and putting of data together
Analysis today is not just about the analysis tools. It is about enabling users to discover, test, and put data together. CFOs that we have talked to say they want analysis to expose potential business problems earlier. They want, for example, to know about metrics like average selling price and gross margins by account or by product. They also want to see when they have seasonality effects.
Increasingly, CFOs need to use this information to help predict what the future of their business will look like. CFOs say that, at the same time, they want to help their businesses make better decisions from data. Limiting them today is an enterprise hodgepodge of disparate systems that cannot talk to each other. And yet, to report, CFOs need to traverse from front-office to back-office systems.
One CIO said to us that the end goal of any analysis layer should be the ability to trust data and make dependable business decisions. And once dependable data exists, business users say that they want “access to data when they need it. They want to get data when and where they need it.” One CIO likened what is needed here to orchestration when he said:
“Users want to be able to self-serve. They want to be able to assemble data and put it together from different sources at different times. I want them to be able to have no preconceived process. I want them to be able to discover data across all sources.”
So as we said at the beginning of this post, IT is all about the data. And with mobile systems of engagement, IT’s customers increasingly want their data at their fingertips. This means that business users need to be able to trust that the data they use for business analysis is timely and accurate. This demands that IT organizations get better at managing their core function—data.
A few months ago, while addressing a room full of IT and business professionals at an Information Governance conference, a CFO said, “…if we designed our systems today from scratch, they would look nothing like the environment we own.” He went on to elaborate that they arrived there by layering thousands of good and valid decisions on top of one another.
Similarly, Information Governance has evolved out of the good work done by those who preceded us, into something only a few could have envisioned. Along the way, technology changed the way we interact with data to manage our daily tasks. What started as good engineering practices for mainframes gave way to data management.
Then, with technological advances, we encountered new problems, introduced new tasks and disciplines, and created Information Governance in the process. We were standing on the shoulders of data management, armed with new solutions to new problems. Now we face the four Vs of big data: velocity, volume, veracity, and variety. Each of these characteristics introduces a new set of challenges, driving the need for Big Data Information Governance in response.
Before I answer this question, I must ask you “How comprehensive is the framework you are using today and how well does it scale to address the new challenges?”
There are several frameworks in the marketplace to choose from. In this blog, I will tell you what questions to ask yourself before replacing your old framework with a new one:
Q. Is it nimble?
The focus of data governance practices must allow for nimble responses to changes in technology, customer needs, and internal processes. The organization must be able to respond to emergent technology.
Q. Will it enable you to apply policies and regulations to data brought into the organization by a person or process?
- Public company: Meet the obligation to protect the investment of the shareholders and manage risk while creating value.
- Private company: Meet privacy laws even if financial regulations are not applicable.
- Fulfill the obligations of external regulations from international, national, regional, and local governments.
Q. How does it manage quality?
For big data, the data must be fit for purpose; context might need to be hypothesized for evaluation. Quality does not imply cleansing activities, which might mask the results.
Q. Does it understand your complete business and information flow?
Attribution and lineage are very important in big data. Knowing the source and the destination is crucial in validating analytics results as fit for purpose.
Q. Does it understand the language that you use, and can the framework manage it actively to reduce ambiguity, redundancy, and inconsistency?
Big data might not have a logical data model, so any structured data should be mapped to the enterprise model. Big data still has context and thus modeling becomes increasingly important to creating knowledge and understanding. The definitions evolve over time and the enterprise must plan to manage the shifting meaning.
Q. Does it manage classification?
It is critical for the business/steward to classify the overall source and its contents as soon as it is brought in by its owner, in support of information lifecycle management, access control, and regulatory compliance.
Q. How does it protect data quality and access?
Your information protection must not be compromised for the sake of expediency, convenience, or deadlines. Protect not just what you bring in, but what you join/link it to, and what you derive. Your customers will fault you for failing to protect them from malicious links. The enterprise must formulate the strategy to deal with more data, longer retention periods, more data subject to experimentation, and less process around it, all while trying to derive more value over longer periods.
Q. Does it foster stewardship?
Ensuring the appropriate use and reuse of data requires the action of an employee. This role cannot be automated; it requires the active involvement of a member of the business organization to serve as the steward over the data element or source.
Q. Does it manage long-term requirements?
Policies and standards are the mechanism by which management communicates their long-range business requirements. They are essential to an effective governance program.
Q. How does it manage feedback?
As a companion to policies and standards, an escalation and exception process enables communication throughout the organization when policies and standards conflict with new business requirements. It forms the core process to drive improvements to the policy and standard documents.
Q. Does it foster innovation?
Governance must not squelch innovation. Governance can and should make accommodations for new ideas and growth. This is managed through management of the infrastructure environments as part of the architecture.
Q. How does it control third-party content?
Third-party data plays an expanding role in big data. There are three types, and governance controls must be adequate for the circumstances. They must consider applicable regulations for the operating geographic regions; therefore, you must understand and manage those obligations.
Do We Really Need Another Information Framework?
The EIM Consortium is a group of nine companies that formed this year with the mission to:
“Promote the adoption of Enterprise Information Management as a business function by establishing an open industry reference architecture in order to protect and optimize the business value derived from data assets.”
That sounds nice, but do we really need another framework for EIM or Data Governance? Yes we do, and here’s why.
Recently, I had the opportunity to talk to a number of CFOs about their technology priorities. These discussions represent an opportunity for CIOs to hear what their most critical stakeholder considers important. The CFOs did not hesitate or need to think much about this question. They said three things make their priority list: better financial system reliability, better application integration, and better data security and governance. The top two match well with a recent KPMG study, which found that the biggest improvement finance executives want to see—cited by 91% of survey respondents—is “in the quality of financial and performance insight obtained from the data they produce, followed closely by the finance and accounting organization’s ability to proactively analyze that information before it is stale or out of date.”
CFOs want to know that their systems work and are reliable. They want the data collected from their systems to be analyzed in a timely fashion. Importantly, CFOs say they are worried not only about the timeliness of accounting and financial data. This is because they increasingly need to manage upward with information. For this reason, they want timely, accurate information produced for financial and business decision makers. Their goal is to drive better enterprise decision making.
In manufacturing, for example, CFOs say they want data to span from the manufacturing systems to the distribution system. They want to be able to push a button and get a report. These CFOs complain today about the need to manually massage and integrate data from system after system before they get what they and their business decision makers want and need.
CFOs really feel the pain of systems not talking to each other. CFOs know firsthand that they have “disparate systems” and that too much manual integration is going on. They see firsthand the difficulties in connecting data from frontend to backend systems. They personally feel the large number of manual steps required to pull data. They want their consolidation of account information to be less manual and more timely. One CFO said he wants the right systems integrated to provide the right information, so that they can manage and make decisions at the right time.
Data Security and Governance
CFOs, at the same time, say they have become more worried about data security and governance. Even though CFOs believe that security is the job of the CIO and their CISO, they have an important role to play in data governance. CFOs say they are really worried about getting hacked. One CFO told me that he needs to know that systems are always working properly. Security of data matters today to CFOs for two reasons. First, breaches have a clear material impact; just look at the out-of-pocket and revenue losses coming from the breach at Target. Second, CFOs, who were already being audited for technology and system compliance, feel that their audit firms will be obligated to extend their coverage of security and governance as part of regular compliance audits. One CFO put it this way: “This is a whole new direction for us. Target scared a lot of folks and will in many respects be a watershed event for CFOs.”
So the message here is that CFOs prioritize three technology objectives for their CIOs: better financial system reliability, better application integration, and improved data security and governance. Each of these represents an opportunity to make the CFO’s life easier and, more importantly, to enable them to take on a more strategic role. The CFOs we talked to want to become one of the top three decision makers in the enterprise. Fixing these things for CFOs will enable CIOs to build closer CFO and business relationships.
In my last blog I promised I would report back on my experience using Informatica Data Quality, a software tool that helps automate the hectic, tedious data-plumbing tasks that routinely consume more than 80% of an analyst’s time. Today, I am happy to share what I’ve learned in the past couple of months.
But first, let me confess something. The reason it took me so long to get here is that I dreaded trying the software. Never a savvy computer programmer, I was convinced that I would not be technical enough to master the tool and that it would turn into a lengthy learning experience. That mental barrier dragged me down for a couple of months before I finally bit the bullet and got my hands on the software. I am happy to report that my fear was truly unnecessary: it took me half a day to get a good handle on most features in the Analyst Tool, the component of Data Quality designed for analysts and business users. I then spent three days figuring out how to maneuver the Developer Tool, another key piece of the Data Quality offering used mostly by (you guessed it) developers and technical users. I have to admit that I am no master of the Developer Tool after three days of wrestling with it, but I got the basics. More importantly, my hands-on interaction with the software helped me understand the logic behind the overall design, and see for myself how analysts and business users can easily collaborate with their IT counterparts within the Data Quality environment.
To break it all down, first comes profiling. As analysts, we understand too well the importance of profiling, as it provides an anatomy of the raw data we collect. In many cases, it is a must-have first step in data preparation (especially when the raw data comes from different places and can carry different formats). As a heavy user of Excel, I used to rely on all the tricks available in the spreadsheet to gain visibility into my data. I would filter, sort, build pivot tables, and make charts to learn what was in my raw data. Depending on how many columns were in my data set, it could take hours, sometimes days, just to figure out whether the data I received was any good at all, and how good it was.
Switching to the Analyst Tool in Data Quality, understanding my raw data becomes a task of a few clicks (six at most, if I am picky about how I want it done). Basically, I load my data, click on a couple of options, and let the software do the rest. A few seconds later, I can visualize the statistics of the data fields I choose to examine, and I can also measure the quality of the raw data using the Scorecard feature. No more fiddling with spreadsheets and staring at busy rows and columns. Take a look at the screenshots above and let me know your preference.
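For readers curious what a basic column profile computes under the hood, here is a stripped-down, hypothetical sketch of my own (not how Informatica implements it): null rates and distinct-value counts, which immediately surface inconsistencies like "PA" vs. "pa":

```python
# Hypothetical raw supplier rows with messy values
rows = [
    {"supplier": "Acme", "state": "PA"},
    {"supplier": "Acme", "state": "pa"},
    {"supplier": None,   "state": "OH"},
    {"supplier": "Bolt", "state": ""},
]

def profile(records, column):
    """Basic column profile: row count, null/blank rate, and distinct-value counts."""
    values = [r.get(column) for r in records]
    nulls = sum(1 for v in values if v in (None, ""))
    distinct = {}
    for v in values:
        if v not in (None, ""):
            distinct[v] = distinct.get(v, 0) + 1
    return {"rows": len(values), "nulls": nulls, "distinct": distinct}

print(profile(rows, "state"))
```

Even this toy profile reveals one blank state and the "PA"/"pa" casing inconsistency in seconds, which is exactly the kind of insight that used to take hours of Excel filtering.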
Once I decide after profiling that my raw data is adequate to use, I still need to clean up the nonsense in it before performing any analysis work; otherwise bad things can happen (we call it garbage in, garbage out). Again, to clean and standardize my data, Excel came to the rescue in the past. I would play with different functions and learn new ones, write macros, or simply do it by hand. It was tedious, but it worked as long as I was working on a static data set. The problem, however, came when I needed to incorporate new data sources in a different format: many of the previously built formulas would break and become inapplicable, and I would have to start all over again. Spreadsheet tricks simply don’t scale in those situations.
With the Data Quality Analyst Tool, I can use the Rule Builder to create a set of logical rules in a hierarchical manner based on my objectives, and test those rules to see immediate results. The nice thing is that those rules are not tied to data format, location, or size, so I can reuse them when new data comes in. Profiling can be done at any time, so I can re-examine my data after applying the rules, as many times as I like. Once I am satisfied with the rules, they are passed on to my peers in IT so they can create executable rules based on the logic I created and run them automatically in production. No more worrying about differences in format, volume, or other discrepancies in the data sets; all that complexity is handled by the software, and all I need to do is build meaningful rules to transform the data into the appropriate condition so I have good-quality data to work with for my analysis. The best part? I can do all of the above without hassling IT. Feeling empowered is awesome!
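The reuse idea behind rule building can be sketched in a few lines: a cleansing rule is a named transformation, independent of any one file's layout, applied in order to each value. This is my own simplified illustration of the concept, not Informatica's implementation:

```python
# A reusable cleansing rule is just a named transformation, independent of file layout
rules = [
    ("trim_whitespace", lambda v: v.strip()),
    ("uppercase_state", lambda v: v.upper()),
]

def apply_rules(value, rule_set):
    """Run each cleansing rule in order; the same rules work on any new data set."""
    for _name, fn in rule_set:
        value = fn(value)
    return value

cleaned = apply_rules("  pa ", rules)
```

Because the rules operate on values rather than on a specific spreadsheet's cells, a new file in a new format just gets fed through the same pipeline, which is why they survive format changes that break spreadsheet formulas.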
Using the right tool for the right job will improve our results, save us time, and make our jobs much more enjoyable. For me, there is no more Excel for data cleansing after trying our Data Quality software, because now I can get more done in less time, and I am no longer stressed out by the lengthy process.
I encourage my analyst friends to try Informatica Data Quality, or at least the Analyst Tool within it. If you are like me, wary of a steep learning curve, fear no more. Besides, if Data Quality can cut your data cleansing time in half (mind you, our customers have reported higher numbers), think how many more predictive models you could build, how much more you would learn, and how much faster you could build your reports in Tableau, with more confidence.