Category Archives: Governance, Risk and Compliance
After a careful review by Informatica, the recent Ghost buffer overflow vulnerability (CVE-2015-0235) does not require any Informatica patches for our on-premise products. All Informatica cloud-hosted services were patched by Jan 30.
What you need to know
Ghost is a buffer overflow vulnerability found in glibc (GNU C Library), most commonly found on Linux systems. All distributions of Linux are potentially affected. The most common attack vectors involve Linux servers that are hosting web apps, email servers, and other such services that accept requests over the open Internet; hackers can embed malicious code therein. Fixed versions of glibc are now already available from their respective Linux vendors, including:
- Red Hat: https://access.redhat.com/articles/1332213
What you need to do
Because many of our products link to glibc.zip, we recommend customers apply the appropriate OS patch from their Linux vendor. After applying this OS patch, customers should restart Informatica services running on that machine to ensure our software is linking to the up-to-date glibc library. To ensure all other resources on a system are patched, a full system reboot may also be necessary.
Bill Burns, VP & Chief Information Security Officer
I’ve spent most of my career working with new technology, most recently helping companies make sense of mountains of incoming data. This means, as I like to tell people, that I have the sexiest job in the 21st century.
Harvard Business Review put the data scientist into the national spotlight in their publication Data Scientist: The Sexiest Job of the 21st Century. Job trends data from Indeed.com confirms the rise in popularity for the position, showing that the number of job postings for data scientist positions increased by 15,000%.
In the meantime, the role of data scientist has changed dramatically. Data used to reside on the fringes of the operation. It was usually important but seldom vital – a dreary task reserved for the geekiest of the geeks. It supported every function but never seemed to lead them. Even the executives who respected it never quite absorbed it.
For every Big Data problem, the solution often rests on the shoulders of a data scientist. The role of the data scientist is similar in responsibility to the Wall Street “quants” of the 80s and 90s – now, these data experienced are tasked with the management of databases previously thought too hard to handle, and too unstructured to derive any value.
So, is it the sexiest job of the 21st Century?
Think of a data scientist more like the business analyst-plus, part mathematician, part business strategist, these statistical savants are able to apply their background in mathematics to help companies tame their data dragons. But these individuals aren’t just math geeks, per se.
A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a renaissance individual who really wants to learn and bring change to an organization.
If this sounds like you, the good news is demand for data scientists is far outstripping supply. Nonetheless, with the rising popularity of the data scientist – not to mention the companies that are hiring for these positions – you have to be at the top of your field to get the jobs.
Companies look to build teams around data scientists that ask the most questions about:
- How the business works
- How it collects its data
- How it intends to use this data
- What it hopes to achieve from these analyses
These questions were important because data scientists will often unearth information that can “reshape an entire company.” Obtaining a better understanding of the business’ underpinnings not only directs the data scientist’s research, but helps them present the findings and communicate with the less-analytical executives within the organization.
While it’s important to understand your own business, learning about the successes of other corporations will help a data scientist in their current job–and the next.
In our house when we paint a room, my husband does the big rolling of the walls or ceiling, I do the cut-in work. I am good at prepping the room, taping all the trim and deliberately painting the corners. However, I am thrifty and constantly concerned that we won’t have enough paint to finish a room. My husband isn’t afraid to use enough paint and is extremely efficient at painting a wall in a single even coat. As a result, I don’t do the big rolling and he doesn’t do the cutting in. It took us awhile to figure this out, and a few rooms had to be repainted while we were figuring it out. Now we know what we are good at, and what we need help with.
Payers roles are changing. Payers were previously focused on risk assessment, setting and collecting premiums, analyzing claims and making payments – all while optimizing revenues. Payers are pretty good at selling to employers, figuring out the cost/benefit ratio from an employers perspective and ensuring a good, profitable product. With the advent of the Affordable Healthcare Act along with a much more transient insured population, payers now must focus more on the individual insured and be able to communicate with the individuals in a more nimble manner than in the past.
Individual members will shop for insurance based on consumer feedback and price. They are interested in ease of enrollment and the ability to submit and substantiate claims quickly and intuitively. Payers are discovering that they need to help manage population health at a individual member level. And population health management requires less of a business-data analytics approach and more social media and gaming-style logic to understand patients. In this way, payers can help develop interventions to sustain behavioral changes for better health.
When designing such analytics, payers should consider the following key design steps:
- Extend data warehouses to an analytics appliance
- Invest in a big data platform to absorb patients’ social data
- Build predictive analytics for patient behavior
- Bridge collaborative and behavioral analytics with claims to build revenue and profitability
Due to payers’ mature predictive analytics competencies, they will have a much easier time in the next generation of population behavior compared to their provider counterparts. As clinical content is often unstructured compared to the claims data, payers need to pay extra attention to context and semantics when deciphering clinical content submitted by providers. Payers can use help from vendors that can help them understand unstructured data, individual members. They can then use that data to create fantastic predictive analytic solutions.
This week, another reputable organization, Anthem Inc, reported it was ‘the target of a very sophisticated external cyber attack’. But rather than be upset at Anthem, I respect their responsible data breach reporting.
In this post from Joseph R. Swedish, President and CEO, Anthem, Inc., does something that I believe all CEO’s should do in this situation. He is straight up about what happened, what information was breached, actions they took to plug the security hole, and services available to those impacted.
When it comes to a data breach, the worst thing you can do is ignore it or hope it will go away. This was not the case with Anthem. Mr Swedish did the right thing and I appreciate it.
You only have one corporate reputation – and it is typically aligned with the CEO’s reputation. When the CEO talks about the details of a data breach and empathizes with those impacted, he establishes a dialogue based on transparency and accountability.
Research that tells us 44% of healthcare and pharmaceutical organizations experienced a breach in 2014. And we know that when personal information when combined with health information is worth more on the black market because the data can be used for insurance fraud. I expect more healthcare providers will be on the defensive this year and only hope that they follow Mr Swedish’s example when facing the music.
To level set, let’s make sure you understand my definition of dark data. I prefer using visualizations when I can so, picture this: the end of the first Indiana Jones movie, Raiders of the Lost Ark. In this scene, we see the Ark of the Covenant, stored in a generic container, being moved down the aisle in a massive warehouse full of other generic containers. What’s in all those containers? It’s pretty much anyone’s guess. There may be a record somewhere, but, for all intents and purposes, the materials stored in those boxes are useless.
Applying this to data, once a piece of data gets shoved into some generic container and is stored away, just like the Arc, the data becomes essentially worthless. This is dark data.
Opening up a government agency to all its dark data can have significant impacts, both positive and negative. Here are couple initial tips to get you thinking in the right direction:
- Begin with the end in mind – identify quantitative business benefits of exposing certain dark data.
- Determine what’s truly available – perform a discovery project – seek out data hidden in the corners of your agency – databases, documents, operational systems, live streams, logs, etc.
- Create an extraction plan – determine how you will get access to the data, how often does the data update, how will handle varied formats?
- Ingest the data – transform the data if needed, integrate if needed, capture as much metadata as possible (never assume you won’t need a metadata field, that’s just about the time you will be proven wrong).
- Govern the data – establish standards for quality, access controls, security protections, semantic consistency, etc. – don’t skimp here, the impact of bad data can never really be quantified.
- Store it – it’s interesting how often agencies think this is the first step
- Get the data ready to be useful to people, tools and applications – think about how to minimalize the need for users to manipulate data – reformatting, parsing, filtering, etc. – to better enable self-service.
- Make it available – at this point, the data should be easily accessible, easily discoverable, easily used by people, tools and applications.
Clearly, there’s more to shining the light on dark data than I can offer in this post. If you’d like to take the next step to learning what is possible, I suggest you download the eBook, The Dark Data Imperative.
I live in a very small town in Maine. I don’t spend a lot of time thinking about my privacy. Some would say that by living in a small town, you give up your right to privacy because everyone knows what everyone else is doing. Living here is a choice – for me to improve my family’s quality of life. Sharing all of the details of my life – not so much.
When I go to my doctor (who also happens to be a parent from my daughter’s school), I fully expect that any sort of information that I share with him, or that he obtains as a result of lab tests or interviews, or care that he provides is not available for anyone to view. On the flip side, I want researchers to be able to take my lab information combined with my health history in order to do research on the effectiveness of certain medications or treatment plans.
As a result of this dichotomy, Congress (in 1996) started to address governance regarding the transmission of this type of data. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is a Federal law that sets national standards for how health care plans, health care clearinghouses, and most health care providers protect the privacy of a patient’s health information. With certain exceptions, the Privacy Rule protects a subset of individually identifiable health information, known as protected health information or PHI, that is held or maintained by covered entities or their business associates acting for the covered entity. PHI is any information held by a covered entity which concerns health status, provision of health care, or payment for health care that can be linked to an individual.
Many payers have this type of data in their systems (perhaps in a Claims Administration system), and have the need to share data between organizational entities. Do you know if PHI data is being shared outside of the originating system? Do you know if PHI is available to resources that have no necessity to access this information? Do you know if PHI data is being shared outside your organization?
If you can answer yes to each of these questions – fantastic. You are well ahead of the curve. If not – you need to start considering solutions that can
- Identify PHI in all of your data streams
- Monitor and track the flow of this data throughout your organization and
- Mask this data if it is being shared with resources that don’t need to be able to identify the individual.
I want to researchers to have access to medically relevant data so they can find the cures to some horrific diseases. I want to feel comfortable sharing health information with my doctor. I want to feel comfortable that my health insurance company is respecting my privacy. Now to get my kids to stop oversharing.
A few years ago the former eBay’s CISO, Dave Cullinane, led a sobering coaching discussion on how to articulate and communicate the value of a security solution and its economics to a CISO’s CxO peers.
Why would I blog about such old news? Because it was a great and timeless idea. And in this age of the ‘Great Data Breach’, where CISOs need all the help they can get, I thought I would share it with y’all.
Dave began by describing how to communicate the impact of an attack from malware such as Aurora, spearfishing, stuxnet, hacktivision, and so on… versus the investment required to prevent the attack. If you are an online retailer and your web server goes down because of a major denial of service attack, what does that cost the business? How much revenue is lost every minute that site is offline? Enough to put you out of business? See the figure below that illustrates how to approach this conversation.
If the impact of a breach and the risk of losing business is high and the investment in implementing a solution is relatively low, the investment decision is an obvious one (represented by the yellow area in the upper left corner).
However, it isn’t always this easy, is it? When determining what your company’s brand and reputation worth, how do you develop a compelling case?
Another dimension Dave described is communicating the economics of a solution that could prevent an attack based on the probability that the attack would occur (see next figure below).
For example, consider an attack that could influence stock prices? This is a complex scenario that is probably less likely to occur on a frequent basis and would require a sophisticated multidimensional solution with an integrated security analytics solution to correlate multiple events back to a single source. This might place the discussion in the middle blue box, or the ‘negotiation zone’. This is where the CISO needs to know what the CxO’s risk tolerances are and articulate value in terms of the ‘coin of the realm’.
Finally, stay on top of what the business is cooking up for new initiatives that could expose or introduce new risks. For example, is marketing looking to spin up a data warehouse on Amazon Redshift? Anyone on the analytics team tinkering with Hadoop in the cloud? Is development planning to outsource application test and development activities to offshore systems integrators? If you are participating in any of these activities, make sure your CISO isn’t the last to know when a ‘Breach Happens’!
To learn more about ways you can mitigate risk and maintain data privacy compliance, check out the latest Gartner Data Masking Magic Quadrant.
As we renew or reinvent ourselves for 2015, I wanted to share a case of “imagine if” with you and combine it with the narrative of an American frontier town out West, trying to find a new Sheriff – a Wyatt Earp. In this case the town is a legacy European communications firm and Wyatt and his brothers are the new managers – the change agents.
Here is a positive word upfront. This operator has had some success in rolling outs broadband internet and IPTV products to residential and business clients to replace its dwindling copper install base. But they are behind the curve on the wireless penetration side due to the number of smaller, agile MVNOs and two other multi-national operators with a high density of brick-and-mortar stores, excellent brand recognition and support infrastructure. Having more than a handful of brands certainly did not make this any easier for our CSP. To make matters even more challenging, price pressure is increasingly squeezing all operators in this market. The ones able to offset the high-cost Capex for spectrum acquisitions and upgrades with lower-cost Opex for running the network and maximizing subscriber profitability, will set themselves up for success (see one of my earlier posts around the same phenomenon in banking).
Not only did they run every single brand on a separate CRM and billing application (including all the various operational and analytical packages), they also ran nearly every customer-facing-service (CFS) within a brand the same dysfunctional way. In the end, they had over 60 CRM and the same number of billing applications across all copper, fiber, IPTV, SIM-only, mobile residential and business brands. Granted, this may be a quite excessive example; but nevertheless, it is relevant for many other legacy operators.
As a consequence, their projections indicate they incur over €600,000 annually in maintaining duplicate customer records (ignoring duplicate base product/offer records for now) due to excessive hardware, software and IT operations. Moreover, they have to stomach about the same amount for ongoing data quality efforts in IT and the business areas across their broadband and multi-play service segments.
Here are some more consequences they projected:
- €18.3 million in call center productivity improvement
- €790,000 improvement in profit due to reduced churn
- €2.3 million reduction in customer acquisition cost
- And if you include the fixing of duplicate and conflicting product information, add another €7.3 million in profit via billing error and discount reduction (which is inline with our findings from a prior telco engagement)
Despite major business areas not having contributed to the investigation and improvements being often on the conservative side, they projected a 14:1 return ratio between overall benefit amount and total project cost.
Coming back to the “imagine if” aspect now, one would ask how this behemoth of an organization can be fixed. Well, it will take years but without management (in this case new managers busting through the door), this organization has the chance to become the next Rocky Mountain mining ghost town.
The good news is that this operator is seeing some management changes now. The new folks have a clear understanding that business-as-usual won’t do going forward and that centralization of customer insight (which includes some data elements) has its distinct advantages. They will tackle new customer analytics, order management, operational data integration (network) and next-best-action use cases incrementally. They know they are in the data, not just the communication business. They realize they have to show a rapid succession of quick wins rather than make the organization wait a year or more for first results. They have fairly humble initial requirements to get going as a result.
You can equate this to the new Sheriff not going after the whole organization of the three, corrupt cattle barons, but just the foreman of one of them for starters. With little cost involved, the Sheriff acquires some first-hand knowledge plus he sends a message, which will likely persuade others to be more cooperative going forward.
What do you think? Is new management the only way to implement drastic changes around customer experience, profitability or at least understanding?
Happy Holidays, Happy HoliData
In case you have missed our #HappyHoliData series on Twitter and LinkedIn, I decided to provide a short summary of best practices which are unleashing information potential. Simply scroll and click on the case study which is relevant for you and your business. The series touches on different industries and use cases. But all have one thing in common: All consider information quality as key value to their business to deliver the right services or products to the right customer.
Thanks a lot to all my great teammates, who made this series happen.
Happy Holidays, Happy HoliData.
It takes a village to build mainstream big data solutions. We often get so caught up in Hadoop use cases and customer successes that sometimes we don’t talk enough about the innovative partner technologies and integrations that enable our customers to put the enterprise data hub at the core of their data architecture and innovate with confidence. Cloudera and Informatica have been working together to integrate our products to enable new levels of productivity and lower deployment and production risk.
Going from Hadoop to an enterprise data hub, means a number of things. It means that you recognize the business value of capturing and leveraging all your data for exploration and analytics. It means you’re ready to make the move from Hadoop pilot project to production. And it means your data is important enough that it’s worth securing and making data pipelines visible. It’s the visibility layer, and in particular, the unique integration between Cloudera Navigator and Informatica that I want to focus on in this post.
The era of big data has ushered in increased regulations in a number of industries – banking, retail, healthcare, energy – most of which deal in how data is managed throughout its lifecycle. Cloudera Navigator is the only native end-to-end solution for governance in Hadoop. It provides visibility for analysts to explore data in Hadoop, and enables administrators and managers to maintain a full audit history for HDFS, HBase, Hive, Impala, Spark and Sentry then run reports on data access for auditing and compliance.The integration of Informatica Metadata Manager in the Big Data Edition and Cloudera Navigator extends this level of visibility and governance beyond the enterprise data hub.
Today, only Informatica and Cloudera provide end-to-end data lineage from source systems through Hadoop, and into BI/analytic and data warehouse systems. And you can view it from a single pane within Informatica.
This is important because Hadoop, and the enterprise data hub in particular, doesn’t function in a silo. It’s an integrated part of a larger enterprise-wide data management architecture. The better the insight into where data originated, where it traveled, who had access to it and what they did with it, the greater our ability to report and audit. No other combination of technologies provides this level of audit granularity.
But more so than that, the visibility Cloudera and Informatica provides our joint customers with the ability to confidently stand up an enterprise data hub as a part of their production enterprise infrastructure because they can verify the integrity of the data that undergirds their analytics. I encourage you to check out a demo of the Informatica-Cloudera Navigator integration at this link: http://infa.media/1uBpPbT
You can also check out a demo and learn a little more about Cloudera Navigator and the Informatica integration in the recorded TechTalk hosted by Informatica at this link: