Category Archives: Data Governance
Data Governance, the art of being Regulation Ready, is about a lot of things, but one thing is clear: it's NOT just about the technology. Have you ever been in one of those meetings, probably more than a few, where committees and virtual teams discuss the latest corporate initiatives? You know, the kind of meeting that makes you want to dip your face in lava and run into the ocean? Because at the end of the meeting, everyone goes back to their day jobs and nothing changes.
Now comes a new law or regulation from the governing body du jour. There are common threads running through each and every regulation related to data. Laws like HIPAA even had entire sections dedicated to the types of filing cabinets required in the office to protect healthcare data. The same is true of regulations like BCBS 239, CCAR reporting and Solvency II. The laws ask: what are you reporting, how did you get that data, where has it been, what does this data mean, and who has touched it? Virtually all of the regulations dealing with data share those elements.
So it behooves an organization to be Regulation Ready. This means those committees and virtual teams need to be driving cultural and process change. It's not just about the technology; it's as much about people and processes. Every role in the organization, from the developer to the business executive, should embed the concepts of data governance in their daily work. From the time a developer or architect builds a new system, they need to document and define everything, and every piece of data. It reminds me of my days writing code and remembering to comment each code block. The business executive, likewise, shares business rules and definitions from the top so they can be integrated into the systems that eventually have to report on them.
Finally, the processes that support a data governance program are augmented by the technology. It may seem sufficient to document systems in spreadsheets and documents, but those are increasingly error prone and, in the end, not reliable in an audit.
Informatica is the market leader in data management infrastructure for being Regulation Ready. This spans everything from data movement and quality to definitions and security. Because at the end of the day, once you have the people culturally integrated and the processes supporting the data workload, a centralized, high-performance and feature-rich technology needs to be in place to complete the trifecta. Informatica is pleased to offer the industry this leading technology as part of a comprehensive data governance foundation.
Informatica will be sharing this vision at the upcoming Annual FIMA 2015 Conference in Boston from March 30 to April 1. Come and visit Informatica at FIMA 2015 in Booth #3.
I recently got to talk to several senior IT leaders about their views on information governance and analytics. Participating were a telecom company, a government transportation entity, a consulting company, and a major retailer. Each shared openly in what was a free flow of ideas.
The CEO and Corporate Culture is critical to driving a fact based culture
I started this discussion by sharing the COBIT Information Life Cycle. Everyone agreed that the starting point for information governance needs to be business strategy and business processes. However, this led to an extremely interesting discussion about enterprise analytics readiness. Most said that they are in the midst of leading the proverbial horse to water; in this case, the horse is the business. The CIO in the group said that he personally is all about the data and making factual decisions, but his business is not really there yet. I asked everyone at this point about the importance of culture and the CEO. Everyone agreed that the CEO is incredibly important in driving a fact-based culture. Apparently, people like the new CEO of Target are in the vanguard and not the mainstream yet.
KPIs need to be business drivers
The above CIO said that too many of his managers are operationally, day-to-day focused and don't understand the value of analytics or of predictive analytics. This CIO said that he needs to teach the business to think analytically, to understand how analytics can help drive the business, and to use Key Performance Indicators (KPIs). The enterprise architect in the group shared at this point that he had previously worked for a major healthcare organization. When the organization was asked to determine a list of KPIs, it came back with 168 of them. Obviously, this could not work, so he explained to the business that an effective KPI must be a "driver of performance". He stressed to the healthcare organization's leadership the importance of having fewer KPIs, and of building the ones that do get produced around business capabilities and performance drivers.
IT increasingly needs to understand its customers' business models
I shared at this point that I visited a major Italian bank a few years ago. The key leadership had high-definition displays that would roll past an analytic every five minutes. Everyone laughed at the absurdity of having so many KPIs. With this said, everyone felt that they needed business buy-in, because only the business can derive value from acting upon the data. According to this group of IT leaders, this is pushing them more and more to understand their customers' business models.
Others said that they were trying to create an omni-channel view of customers. The retailer wanted to get more predictive. Theodore Levitt said the job of marketing is to create and keep a customer; this retailer is focused on keeping customers and bringing them back more often. They want to give customers offers that use customer data to increase sales, much like what I recently described happening at 58.com, eBay, and Facebook.
Most say they have limited governance maturity
We talked about where people are in their governance maturity. Even though I wanted to gloss over this topic, the group wanted to spend time here and compare notes with each other. Most said that they were at stage 2 or 3 in a five-stage governance maturity process. One CIO said, gee, does anyone ever get to level 5? Like analytics, governance was being pushed forward by IT rather than the business. Nevertheless, everyone said that they are working to get data stewards defined for each business function. At this point, I asked about the elements that COBIT 5 suggests go into good governance. I shared that it should include the following four elements: 1) clear information ownership; 2) timely, correct information; 3) clear enterprise architecture and efficiency; and 4) compliance and security. Everyone felt the definition was fine but wanted specifics for each element. I referred them, and you, to my recent article in COBIT Focus.
CIO says they are the custodians of data only
At this point, one of the CIOs said something incredibly insightful: we are not data stewards. Stewardship has to be done by the business; IT is the custodian of the data. More specifically, we should not manage data, but we should make sure what the business needs done gets done with data. Everyone agreed with this point and even reused the term "data custodians" several times during the next few minutes. Debbie Lew of COBIT said the same thing just last week. According to her, "IT does not own the data. They facilitate the data". From here, the discussion moved to security and data privacy. The retailer in the group was extremely concerned about privacy and felt that they needed masking and other data-level technologies to ensure a breach minimally impacts their customers. At this point, another IT leader in the group said that it is the job of IT leadership to make sure the business does the right things in security and compliance. I shared here that one of my CIO friends had said that "the CIOs at the retailers with breaches weren't stupid; it is just hard to sell the business impact". The CIO in the group said we need to do risk assessments (also a big thing for COBIT 5) that get the business to say we have to invest to protect. "It is IT's job to adequately explain the business risk".
Is mobility a driver of better governance and analytics?
Several shared towards the end of the evening that mobility is an increasing impetus for better information governance and analytics. Mobility is driving business users and business customers to demand better information and thereby, better governance of information. Many said that a starting point for providing better information is data mastering. These attendees felt as well that data governance involves helping the business determine its relevant business capabilities and business processes. It seems that these should come naturally, but once again, IT for these organizations seems to be pushing the business across the finish line.
Blogs and Articles:
Retail friends – sorry to say it was not a surprise that reinventing the store and making it more digital impacted the Retail Business Technology Expo (RBTE) in London this week. I saw a similar trend at the National Retail Federation Big Show back in January, which I discussed in this blog post.
With that being said, I was not shy about finding the fantastic five that thrilled me in the Olympia Hammersmith center hall.
Here they are:
Engaging Spaces: Their booth made the most noise, with interactive, touchable wooden walls that emphasized interaction through sound and lights. No booth inspired more people to take pictures than this one. I took the liberty of recording this short clip.
Engaging Spaces was surrounded by lots of fancy digital signage vendors displaying products in-store. Some demos did not work, others did not come with comprehensive product details, and they are still not personalized.
Panel: Optimizing the supply chain and the omnichannel experience are twins. The panel, moderated by Spencer Izard and completed by Craig Sears-Black from Manhattan Associates and Tom Enright from Gartner, showed that the lines between retailers and CPG companies are blurring. Retailers become eTailers, and brands act like retailers.
We learned that consumers don't care where they buy from, but they always expect trust! The experts see co-existence, overlap and changing partnerships between vendors and retailers. The analysts said that retail organizations are still siloed in their internal structure, which prevents omnichannel execution. We expect that a new balance of power will emerge between brands and retailers.
Orderella: Let the phone do the queuing. This app is perfect for people like me who hate waiting in line for lunch. The app connects with PayPal, so I was able to order my snack and drink from my phone and have them brought to my table. The order was delivered in 1 minute, and I was able to monitor the process within the app. In addition, they delivered to each booth by locating your phone, and offered a 6-buck voucher for each new deal. A great combination of location, real-time, product and customer data.
Red Ant: The seamless in-store experience. The app sits on top of ecommerce tools like Demandware, hybris, Intershop, Magento, Oxid, Oracle ATG or IBM WebSphere Commerce, which are used by many of our customers, to support barcode scanning and flexibility in the checkout process. It also supports the in-store assistant in completing the transaction. Red Ant is very easy to use for our eCommerce clients, who already fuel their commerce with perfect product information.
Iconeme: Again, a digital in-store experience. The app uses iBeacon to help users see where a product is in the store, share it, view looks (product bundles), use a virtual dressing room and, of course, check out and pay. Definitely something to take a look at.
I was recently talking to an executive responsible for the IT infrastructure of a mid-sized bank, and he mentioned that he has 17 core applications critical to the business, plus over 200 ancillary applications. Historically, each core application sees one significant change per quarter and two small changes every week. That is about 70 big changes and 1,700 small changes every year, just for the core applications that are critical to the business!
How does the business ensure that these perturbations to the existing systems do not break existing capabilities within the system? Worse, how does a business ensure that changes to these systems do not break other downstream applications that rely on data being captured in these systems? How do testing teams pick the right test data to test all possible permutations and combinations that these changes introduce?
Test data management solutions solve these problems by subsetting a small slice of production data to be used for testing, and by masking that data so that sensitive values are not exposed to a large set of users. Test data generation allows data to be augmented for features that are not yet in production, as well as for introducing bad data into the environment.
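The masking step can be illustrated with a minimal sketch. This is not Informatica's implementation, just an assumed record layout and a deterministic hash-based masking rule: the same input always produces the same token, so joins between masked tables still line up.

```python
import hashlib

def mask_value(value: str, salt: str = "test-env") -> str:
    """Deterministically mask a sensitive value. Identical inputs
    yield identical tokens, preserving referential integrity."""
    digest = hashlib.sha256((salt + value).encode()).hexdigest()
    return "MASK_" + digest[:10]

def mask_records(records, sensitive_fields):
    """Return a copy of each record with sensitive fields replaced."""
    return [
        {k: mask_value(v) if k in sensitive_fields else v
         for k, v in rec.items()}
        for rec in records
    ]

# A hypothetical subset pulled from production:
customers = [
    {"id": "1", "name": "Alice Smith", "ssn": "123-45-6789", "tier": "gold"},
    {"id": "2", "name": "Bob Jones", "ssn": "987-65-4321", "tier": "silver"},
]
safe = mask_records(customers, sensitive_fields={"name", "ssn"})
```

Non-sensitive attributes such as `tier` survive untouched, so the masked subset still exercises the same code paths as production data.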
As the number of applications increases many fold and development strategies move from the waterfall model to agile and continuous integration, there is a need not only to provision test data, but to provision it in such a way that it can be automated and reused repeatedly. That requires a warehouse of test data that is categorized and tagged by test cases and test areas. This test data warehouse should:
- Allow testers to review test data in the warehouse, tag interesting data and data sets, and search on those tags
- Allow testers to augment test data when the data they need is missing
- Visualize the test data, identify white spaces in it, and generate test data to fill those white spaces
Once this warehouse of test datasets is available, testers should be able to reserve the test records they are using for their testing. These records can be used to restore the test environments from the test data warehouse. Testers can then run their test cases without impacting other testers who share the same test environment.
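The tagging, search and reservation workflow described above can be sketched as a small in-memory class. The class name and record shapes are illustrative assumptions, not any vendor's API; a real test data warehouse would back this with persistent storage.

```python
class TestDataWarehouse:
    """Minimal sketch: tagged test records with exclusive reservation."""

    def __init__(self):
        self.records = {}   # record_id -> record dict
        self.tags = {}      # tag -> set of record_ids
        self.reserved = {}  # record_id -> tester who holds it

    def add(self, record_id, record, tags=()):
        self.records[record_id] = record
        for tag in tags:
            self.tags.setdefault(tag, set()).add(record_id)

    def search(self, tag):
        """Return ids matching a tag that are not reserved by anyone."""
        return sorted(r for r in self.tags.get(tag, ())
                      if r not in self.reserved)

    def reserve(self, record_id, tester):
        """Claim a record so concurrent testers cannot disturb it."""
        owner = self.reserved.get(record_id)
        if owner is not None and owner != tester:
            raise RuntimeError(f"{record_id} already reserved by {owner}")
        self.reserved[record_id] = tester
        return self.records[record_id]

    def release(self, record_id, tester):
        if self.reserved.get(record_id) == tester:
            del self.reserved[record_id]

wh = TestDataWarehouse()
wh.add("cust-001", {"segment": "gold"}, tags=["checkout", "gold-tier"])
wh.add("cust-002", {"segment": "gold"}, tags=["gold-tier"])
wh.reserve("cust-001", "tester-a")
```

After the reservation, a search for `gold-tier` returns only `cust-002`, which is exactly the isolation property that lets testers share one environment.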
This still leaves open the question of how organizations test a business process that spans multiple systems. How can organizations create test data in these systems that allows a business process to be executed from end to end? Once organizations master this capability, they will reach their potential in automating their test processes and reduce the risk that each perturbation in the environment inevitably brings.
It has been almost a year since I started looking into the Identification of Medicinal Products (IDMP) ISO standard and the challenging EMA IDMP implementation deadline of July 1st, 2016. Together with HighPoint Solutions, we have proposed that using a Master Data Management (MDM) system as a product data integration layer is the best way to ensure the product data required by IDMP is consolidated and delivered to the regulator. This message has been well received by almost all the pharmaceutical companies we have talked to.
During the past few months, support for using MDM as a key part of the IDMP solution stack has been growing. Supporters now include people who have been looking into the IDMP challenge and its solutions for far longer than I have: independent consultants, representatives of pharma companies with active projects, and leading analysts have all expressed their support. At the IDMP Compliance Challenge and Regulatory Information Management conference held in Berlin last month, using MDM within the solution stack was a common theme, with a large percentage of presentations referencing the technology.
However, at this conference an objection to the use of MDM was circulating during the coffee break. Namely:
Do we really have time to implement MDM, and achieve compliance before July 2016?
First, let’s revisit why MDM is a favoured approach for IDMP. The data required for compliance is typically dispersed across 10+ internal and external data silos, so building a data integration layer to prepare data for submission is the current favoured approach, with popular support from both industry insiders and independent consultants. The data integration layer is seen as a good technical way to overcome a number of challenges which IDMP poses in its initial draft guidance, namely:
- Organisational: It has to make it easy for data owners to contribute to the quality of data
- Technical: It needs to integrate data from multiple systems, cleaning and resolving attributes using as much automation as possible
- Co-ordination: The layer must ensure data submitted is consistent across regulations, and also within internal transactional systems
- Timing: Projects must begin now, and pose low technical risk in order to meet the deadline.
MDM technology is an excellent fit to address these challenges (for a high level summary, see here).
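The consolidation step at the heart of such a data integration layer can be sketched in a few lines. The field names, sources and survivorship rule below are illustrative assumptions (real MDM platforms offer far richer match and merge logic); the sketch merges product records from multiple silos into one golden record per product, letting the most recently updated non-empty value win for each attribute.

```python
def consolidate(records):
    """Merge product records from multiple source silos into one
    golden record per product. Survivorship rule (an assumption for
    this sketch): the most recently updated non-empty value wins."""
    golden = {}
    for rec in sorted(records, key=lambda r: r["updated"]):
        entry = golden.setdefault(rec["product_id"], {})
        for field, value in rec.items():
            if field in ("product_id", "updated", "source"):
                continue  # bookkeeping fields, not product attributes
            if value:  # later non-empty values overwrite earlier ones
                entry[field] = value
    return golden

# Hypothetical extracts from two internal silos:
records = [
    {"product_id": "P1", "source": "regulatory", "updated": "2015-01-10",
     "dose_form": "tablet", "strength": ""},
    {"product_id": "P1", "source": "manufacturing", "updated": "2015-02-01",
     "dose_form": "", "strength": "50 mg"},
]
golden = consolidate(records)
```

Neither silo holds the complete picture, but the golden record for `P1` carries both the dose form and the strength, which is precisely the consolidation IDMP submission requires.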
So, back to the time objection. It seems a bit out of place if you follow the logic:
- In order to comply with IDMP, you need to collect, cleanse, resolve and relate a diverse set of product data
- A data integration layer is the best technical architecture
- MDM is a proven fit for the data integration layer
So why would you not have time to implement MDM, if this is the best (and available) technical solution for the data collection and consolidation necessary to achieve compliance?
My feeling is the objection comes down to the definition of MDM, which is (correctly) seen as something more than technology. Expert and analyst definitions variously include the words ‘discipline’, ‘method’, ‘process’ or ‘practice’. Current consensus is that the underlying technology merely enables the processes which allow an organisation to collect and curate a single, trusted source of master data. In this light, MDM implies the need for senior level sponsorship and organisational change.
The truth is that IDMP compliance needs senior-level sponsorship and organisational change. In all the discussions I have had, these points come out clearly. Many pharma insiders who understand the challenge are grappling with how to get from their executives the attention that IDMP needs in order to achieve compliance. July 1, 2016 is not only a deadline; it is the start of a new discipline in managing a broad range of pharma product data. This new discipline will require organisational change in order to ensure that high-quality data can be produced on a continuous basis.
So the definitions of MDM actually make the case for using MDM as part of the technology stack stronger. MDM will not only provide technology for the data integration layer, but also a support structure for the new product data processes that will be required for sustained compliance. Without new product data management processes as part of the IDMP submission process, there will be few guarantees around data quality or lineage.
I fear that many of those with the objection that they don’t have time for MDM are really saying they don’t have enough time to implement IDMP as a new discipline, process or practice. This will expose them to the risk of non-compliance fines of up to 5% of revenue, and recurring fines of 2.5% of revenue.
In my mind, the challenge is not ‘do we have time to implement MDM?’, but rather ‘can we be successful both by and beyond July 2016, without implementing MDM?’ By MDM I am referring to the technology and the organisational aspects of creating and delivering complete and accurate data.
In honor of International Women’s Day 2015, Informatica is celebrating female leadership in a blog series. Every day this week, we will showcase a new female leader at Informatica, who will share their perspective on what it’s like to be a woman in the tech industry.
Simone Fernandes Orlandi
Latin America Marketing Director
As a woman in a leadership role, there’s a great opportunity to stand out. Exhibit your skills and knowledge in a big way! Leaders are constantly multitasking and juggling different projects at once. It’s important to learn how to do this effectively.
Advice for other women:
The same I would give to any professional: Have a focus and fight for it.
Thoughts about Informatica’s culture:
I love to work at Informatica. The products are great, the customers are happy, and the people are very friendly. We have our challenges and there’s a lot of work, but we also have the freedom to execute in a good atmosphere.
Let’s face it, building a Data Governance program is no overnight task. As one CDO puts it: “data governance is a marathon, not a sprint”. Why? Because data governance is a complex business function that encompasses technology, people and process, all of which have to work together effectively to ensure the success of the initiative. Because of the scope of the program, data governance often calls for participants from different business units within an organization, and it can be disruptive at first.
Why bother, then, given that data governance is complex, disruptive, and could potentially introduce additional cost to a company? Well, the drivers for data governance vary across organizations. Let’s take a close look at some of the motivations behind a data governance program.
For companies in heavily regulated industries, establishing a formal data governance program is a mandate. When a company is not compliant, the consequences can be severe. Penalties could include hefty fines, brand damage, loss of revenue, and even potential jail time for the person held accountable for the noncompliance. To meet ongoing regulatory requirements and adhere to data security policies and standards, companies need to rely on clean, connected and trusted data to enable transparency and auditability in their reporting, meet mandatory requirements, and answer critical questions from auditors. Without a dedicated data governance program in place, the compliance initiative can become an ongoing nightmare for companies in regulated industries.
A data governance program can also be established to support a customer centricity initiative. To make effective cross-sells and up-sells to your customers and grow your business, you need clear visibility into customer purchasing behaviors across multiple shopping channels and touch points. Customers’ shopping behaviors and attributes are captured in the data; therefore, to gain a thorough understanding of your customers and boost your sales, a holistic data governance program is essential.
Other reasons for companies to start a data governance program include improving efficiency, reducing operational cost, supporting better analytics and driving more innovation. As long as it’s a business-critical area, data is at the core of the process, and the business case is sound, there is a compelling reason for launching a data governance program.
Now that we have identified the drivers for data governance, how do we start? This rather loaded question really gets into the details of the implementation. A few critical elements come into consideration, including: identifying and establishing the various task forces, such as the steering committee, the data governance team and business sponsors; defining roles and responsibilities for the stakeholders involved in the program; and defining metrics for tracking results. And soon you will find that, on top of everything, communication, communication and more communication is probably the most important tactic of all for driving the initial success of the program.
A rule of thumb? Start small, take one step at a time and focus on producing something tangible.
Sounds easy, right? Well, let’s hear what the real-world practitioners have to say. Join us at this Informatica webinar to hear Michael Wodzinski, Director of Information Architecture, Lisa Bemis, Director of Master Data, Fabian Torres, Director of Project Management from Houghton Mifflin Harcourt, global leader in publishing, as well as David Lyle, VP of product strategy from Informatica to discuss how to implement a successful data governance practice that brings business impact to an enterprise organization.
If you are currently kicking the tires on setting up a data governance practice in your organization, I’d like to invite you to visit a member-only website dedicated to Data Governance: http://governyourdata.com/. This site currently has over 1,000 members and is designed to foster open communication on everything data governance. There you will find conversations on best practices, methodologies, frameworks, tools and metrics. I would also encourage you to take a data governance maturity assessment to see where you currently stand on the data governance maturity curve, and to compare the result against an industry benchmark. More than 200 members have taken the assessment to gain a better understanding of their current data governance program, so why not give it a shot?
Data Governance is a journey, and likely a never-ending one. We wish you the best of luck on this effort and a joyful ride! We would love to hear your stories.
I’ve spent most of my career working with new technology, most recently helping companies make sense of mountains of incoming data. This means, as I like to tell people, that I have the sexiest job in the 21st century.
Harvard Business Review put the data scientist into the national spotlight with its article Data Scientist: The Sexiest Job of the 21st Century. Job trends data from Indeed.com confirms the rise in popularity of the position, showing that the number of job postings for data scientist roles has increased by 15,000%.
In the meantime, the role of data scientist has changed dramatically. Data used to reside on the fringes of the operation. It was usually important but seldom vital – a dreary task reserved for the geekiest of the geeks. It supported every function but never seemed to lead them. Even the executives who respected it never quite absorbed it.
For every Big Data problem, the solution often rests on the shoulders of a data scientist. The role of the data scientist is similar in responsibility to the Wall Street “quants” of the 80s and 90s; now, these data experts are tasked with managing databases previously thought too hard to handle and too unstructured to derive any value from.
So, is it the sexiest job of the 21st Century?
Think of a data scientist as a business analyst-plus: part mathematician, part business strategist. These statistical savants are able to apply their background in mathematics to help companies tame their data dragons. But these individuals aren’t just math geeks, per se.
A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a renaissance individual who really wants to learn and bring change to an organization.
If this sounds like you, the good news is demand for data scientists is far outstripping supply. Nonetheless, with the rising popularity of the data scientist – not to mention the companies that are hiring for these positions – you have to be at the top of your field to get the jobs.
Companies look to build teams around data scientists that ask the most questions about:
- How the business works
- How it collects its data
- How it intends to use this data
- What it hopes to achieve from these analyses
These questions are important because data scientists often unearth information that can “reshape an entire company.” Obtaining a better understanding of the business’s underpinnings not only directs the data scientist’s research, but also helps them present their findings to, and communicate with, the less analytical executives within the organization.
While it’s important to understand your own business, learning about the successes of other corporations will help a data scientist in their current job–and the next.
As we head into Strata + Hadoop World San Jose, Pivotal has made some interesting announcements that are sure to be the talk of the show. Pivotal’s move to open-source some of their advanced products (and to form a new organization to foster Hadoop community cooperation) are signs of the dynamism and momentum of the Big Data market.
Informatica applauds these initiatives by Pivotal and we hope that they will contribute to the accelerating maturity of Hadoop and its expansion beyond early adopters into mainstream industry adoption. By contributing HAWQ, GemFire and the Greenplum Database to the open source community, Pivotal creates further open options in the evolving Hadoop data infrastructure technology. We expect this to be well received by the open source community.
As Informatica has long served as the industry’s neutral data connector for more than 5,500 customers and has developed a rich set of capabilities for Hadoop, we are also excited to see efforts to reduce fragmentation in the Hadoop community.
Even before the new company Pivotal was formed, Informatica had a long history of working with the Greenplum team to ensure that joint customers could confidently use Informatica tools to include the Greenplum Database in their enterprise data pipelines. Informatica has mature, high-performance native connectivity to load data in and out of Greenplum reliably using Informatica’s codeless, visual data pipelining tools. In 2014, Informatica expanded our Hadoop support to include Pivotal HD Hadoop, and we have joint customers using Informatica to do data profiling, transformation, parsing and cleansing using Informatica Big Data Edition running on Pivotal HD Hadoop.
We expect these innovative developments driven by Pivotal in the Big Data technology landscape to help to move the industry forward and contribute to Pivotal’s market progress. We look forward to continuing to support Pivotal technology and to an ever increasing number of successful joint customers. Please reach out to us if you have any questions about how Informatica and Pivotal can help your organization to put Big Data into production. We want to ensure that we can help you answer the question … Are you Big Data Ready?
In our house, when we paint a room, my husband does the big rolling of the walls or ceiling, and I do the cut-in work. I am good at prepping the room, taping all the trim and deliberately painting the corners. However, I am thrifty and constantly concerned that we won’t have enough paint to finish a room. My husband isn’t afraid to use enough paint and is extremely efficient at painting a wall in a single even coat. As a result, I don’t do the big rolling and he doesn’t do the cutting in. It took us a while to figure this out, and a few rooms had to be repainted while we were figuring it out. Now we know what we are good at, and what we need help with.
Payers’ roles are changing. Payers were previously focused on risk assessment, setting and collecting premiums, analyzing claims and making payments, all while optimizing revenues. Payers are pretty good at selling to employers, figuring out the cost/benefit ratio from an employer’s perspective and ensuring a good, profitable product. With the advent of the Affordable Care Act, along with a much more transient insured population, payers must now focus more on the individual insured and be able to communicate with individuals in a more nimble manner than in the past.
Individual members will shop for insurance based on consumer feedback and price. They are interested in ease of enrollment and the ability to submit and substantiate claims quickly and intuitively. Payers are discovering that they need to help manage population health at an individual member level. And population health management requires less of a business-data analytics approach and more social media and gaming-style logic to understand patients. In this way, payers can help develop interventions to sustain behavioral changes for better health.
When designing such analytics, payers should consider the following key design steps:
- Extend data warehouses to an analytics appliance
- Invest in a big data platform to absorb patients’ social data
- Build predictive analytics for patient behavior
- Bridge collaborative and behavioral analytics with claims to build revenue and profitability
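The third step, predictive analytics for patient behavior, can be sketched with a toy scoring model. The features and weights below are purely hypothetical assumptions for illustration; in practice they would be fitted on historical claims and engagement data rather than hand-picked.

```python
import math

# Hypothetical, hand-picked weights for illustration only.
# Positive weights raise intervention priority; negative ones lower it.
WEIGHTS = {
    "er_visits": 0.8,            # emergency room visits this year
    "missed_refills": 0.6,       # prescription refills missed
    "app_logins_per_week": -0.3, # engagement with the wellness app
}
BIAS = -1.0

def intervention_score(member):
    """Logistic score in (0, 1): how likely this member is to benefit
    from a behavioral-health intervention."""
    z = BIAS + sum(WEIGHTS[f] * member.get(f, 0.0) for f in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

high_risk = intervention_score(
    {"er_visits": 3, "missed_refills": 2, "app_logins_per_week": 0})
low_risk = intervention_score(
    {"er_visits": 0, "missed_refills": 0, "app_logins_per_week": 5})
```

Bridging this behavioral score with claims data (step four) would then let a payer target interventions at the members where they are most likely to change outcomes and costs.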
Due to payers’ mature predictive analytics competencies, they will have a much easier time with the next generation of population behavior analytics than their provider counterparts. Because clinical content is often unstructured, compared to claims data, payers need to pay extra attention to context and semantics when deciphering clinical content submitted by providers. Payers can enlist vendors that can help them understand unstructured data about individual members, and then use that data to create powerful predictive analytic solutions.