On Saturday, I got a call from my broadband company on my mobile phone. The sales rep pitched a great limited-time offer for new customers. I asked him whether I could take advantage of this great offer as well, even though I am an existing customer. He was surprised. “Oh, you’re an existing customer,” he said, dismissively. “No, this offer doesn’t apply to you. It’s for new customers only. Sorry.” You can imagine my annoyance.
If this company had built a solid foundation of customer data, the sales rep would have had a customer profile rich with clean, consistent, and connected information as reference. If he had visibility into my total customer relationship with his company, he’d know that I’m a loyal customer with two current service subscriptions. He’d know that my husband and I have been customers for 10 years at our current address. On top of that, he’d know we both subscribed to their services while living at separate addresses before we were married.
Unfortunately, his company didn’t arm him with the great customer data he needs to be successful. If they had, he could have taken the opportunity to offer me one of the four services I currently don’t subscribe to—or even a bundle of services. And I could have shared a very different customer experience.
Every customer interaction counts
Executives at companies of all sizes talk about being customer-centric, but it’s difficult to execute on that vision if you don’t manage your customer data like a strategic asset. If delivering seamless, integrated, and consistent customer experiences across channels and touch points is one of your top priorities, every customer interaction counts. But without knowing exactly who your customers are, you cannot begin to deliver the types of experiences that retain existing customers, grow customer relationships and spend, and attract new customers.
How would you rate your current ability to identify your customers across lines of business, channels and touch points?
Many businesses, however, have anything but an integrated and connected customer-centric view—they have a siloed and fragmented channel-centric view. In fact, sales, marketing, and call center teams often identify siloed and fragmented customer data as key obstacles preventing them from delivering great customer experiences.
According to Retail Systems Research, creating a consistent customer experience remains the most valued capability for retailers, but 55% of those surveyed indicated their biggest inhibitor was not having a single view of the customer across channels.
Retailers are not alone. An SVP of marketing at a mortgage company admitted in an Argyle CMO Journal article that, now that his team needs to deliver consistent customer experiences across channels and touch points, they realize they are not as customer-centric as they thought they were.
Customer complexity knows no bounds
The fact is, businesses are complicated, with customer information fragmented across divisions, business units, channels, and functions.
Citrix, for instance, is bringing together valuable customer information from 4 systems. At Hyatt Hotels & Resorts, it’s about 25 systems. At MetLife, it’s 70 systems.
How many applications and systems would you estimate contain valuable customer information at your company?
Based on our experience working with customers across many industries, we know the total customer relationship allows:
- Marketing to boost response rates by better segmenting their database of contacts for personalized marketing offers.
- Sales to more efficiently and effectively cross-sell and up-sell the most relevant offers.
- Customer service teams to resolve customers’ issues immediately, instead of placing them on hold to hunt for information in a separate system.
If your marketing, sales, and customer service teams are struggling with inaccurate, inconsistent, and disconnected customer information, it is costing your company revenue, growth, and success.
Transforming customer data into total customer relationships
Informatica’s Total Customer Relationship Solution fuels business and analytical applications with clean, consistent and connected customer information, giving your marketing, sales, e-commerce and call center teams access to that elusive total customer relationship. It not only brings all the pieces of fragmented customer information together in one place where it’s centrally managed on an ongoing basis, but also:
- Reconciles customer data: Your customer information should be the same across systems, but often isn’t. Assess its accuracy, fixing and completing it as needed—for instance, in my case merging duplicate profiles under “Jakki” and “Jacqueline.”
- Reveals valuable relationships between customers: Map critical connections—Are individuals members of the same household or influencer network? Are two companies part of the same corporate hierarchy? Even link customers to personal shoppers or insurance brokers or to sales people or channel partners.
- Tracks thorough customer histories: Identify customers’ preferred locations; channels, such as stores, e-commerce, and catalogs; or channel partners.
- Validates contact information: Ensure email addresses, phone numbers, and physical addresses are complete and accurate so invoices, offers, or messages actually reach customers.
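To make the reconciliation step concrete, here is a minimal sketch of how duplicate profiles such as “Jakki” and “Jacqueline” might be merged. It is an illustration only, not Informatica’s implementation: the record layout, the example.com addresses, the placeholder surname, and the rule “same normalized email or phone means same customer” are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records as they might appear in three separate systems.
records = [
    {"source": "billing", "name": "Jakki Example",
     "email": "Jakki@Example.com ", "phone": "(555) 010-0001", "services": ["broadband"]},
    {"source": "crm", "name": "Jacqueline Example",
     "email": "jakki@example.com", "phone": "", "services": ["tv"]},
    {"source": "support", "name": "J. Example",
     "email": "", "phone": "555-010-0001", "services": []},
    {"source": "crm", "name": "Sam Other",
     "email": "sam@example.com", "phone": "555 010 0002", "services": ["phone"]},
]

def normalize_email(email):
    # Assumed rule: emails match regardless of case and surrounding spaces.
    return email.strip().lower()

def normalize_phone(phone):
    # Assumed rule: phone numbers match on their digits only.
    return "".join(ch for ch in phone if ch.isdigit())

def merge_profiles(records):
    """Group records that share a normalized email or phone into one profile."""
    parent = list(range(len(records)))  # union-find over record indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (kind, value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        keys = []
        if rec.get("email"):
            keys.append(("email", normalize_email(rec["email"])))
        if rec.get("phone"):
            keys.append(("phone", normalize_phone(rec["phone"])))
        for key in keys:
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[find(idx)].append(rec)

    return [
        {
            "names": sorted({m["name"] for m in members}),
            "sources": sorted({m["source"] for m in members}),
            "services": sorted({s for m in members for s in m["services"]}),
        }
        for members in groups.values()
    ]

profiles = merge_profiles(records)
```

With the sample data, the three “Example” records collapse into one profile that lists both subscribed services, which is the kind of combined view the sales rep in the opening story lacked. Real matching usually needs fuzzier rules (nicknames, address history, household links) than the exact-key match assumed here.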
This is just the beginning. From here, imagine enriching your customer profiles with third-party data. What types of information help you better understand, sell to, and serve your customers? What are your plans for incorporating social media insights into your customer profiles? What could you do with this additional customer information that you can’t do today?
We’ve helped hundreds of companies across numerous industries build a total customer relationship view. Merrill Lynch boosted marketing campaign effectiveness by 30 percent. Citrix boosted conversion rates by 20%. A $60 billion global manufacturer improved cross-sell and up-sell success by 5%. A hospitality company boosted cross-sell and up-sell success by 60%. And Logitech increased sales across channels, including their online site, retail stores, and distributors.
Informatica’s Total Customer Relationship Solution empowers your people with confidence, knowing that they have access to the kind of great customer data that allows them to surpass customer acquisition and retention goals by providing consistent, integrated, and seamless customer experiences across channels. The end result? Great experiences that customers are inspired to share with their family and friends at dinner parties and on social media.
Do you have a terrible or great customer experience to share? If so, please share it with us and other readers using the Comment option below.
Data Science should change how your businesses are run
The importance of data science is becoming clearer all the time. Marc Benioff says, “I think for every company, the revolution in data science will fundamentally change how we run our business.” He adds, “There’s just a huge amount more data than ever before, our greatest challenge is making sense of that data.” He goes on to say that “we need a new generation of executives who understand how to manage and lead through data. And we also need a new generation of employees who are able to help us organize and structure our business around data.” Benioff then says, “when I look at the next set of technologies that we have to build at Salesforce, it is all data science based technology.” Ram Charan, in his Fortune Magazine article, says that “to thrive, companies, and the execs who run them, must transform into math machines” (The Algorithmic CEO, Fortune Magazine, March 2015, page 45).
With such powerful endorsements for data science, the question you may be asking is when you should hire a data scientist or two. There is more than one answer. I liken data science to any business research: you need to do your upfront homework for the data scientists you hire to be effective.
Create a situation analysis before you start
You need to start by defining your problem: are you losing sales, finding it takes too long to manufacture something, or less profitable than you would like to be? The list goes on. Next, you should create a situational analysis. You want to arm your data scientists with as much information as possible to define what you want them to solve or change. Make sure that you are as concrete as possible here. Data scientists struggle when the business people that they work with are vague. As well, it is important that you indicate what kinds of business changes will be considered if the model and data deliver this result or that result.
Next you need to catalog the data that you already have which is relevant to the business problem. Without relevant data there is little that the data scientist can do to help you. With relevant data sources in hand, you need to define the range of actions that you can possibly take once a model has been created.
Be realistic about what is required
With these things in hand, it may be time to hire some data scientists. As you start your process, you need to be realistic about the difficulty of getting a top-flight data scientist. Many of my customers have complained about the difficulty of competing with Google and tech startups. Just as important, “there is a huge variance in the quality and ability of data scientists” (Data Science for Business, Foster Provost, O’Reilly, page 321). Once you have hired someone, you need to keep in mind that effective data science requires collaboration between the business and the data scientists. As well, please know that data scientists struggle when business people don’t appreciate the effort needed to get an appropriate training data set or model evaluation procedures.
Make sure internal or external data scientists give you an effective proposal
Once your data scientists are in place, you should realize that a data scientist worth their salt will create a proposal back to you. As we have said, it is important that you know what kinds of things will happen if the model and data deliver this result or that result. Data scientists in turn will be able to narrow things down to a dollar impact.
Their proposal should start by sharing their understanding of the business and the data which is available. What business problems are they trying to solve? Next, the data scientists may define things like whether supervised or unsupervised learning will be used. They should then openly discuss what efforts will be involved in data preparation, including the values for the target variable (the variable whose values will be predicted). Next, they should describe their modeling approach, whether more than one model will be evaluated, and how the models will be compared and the final model selected. And finally, they should discuss how the model will be evaluated and deployed. Are there evaluation and setup metrics? Data scientists can dedicate time and resources in their proposal to determining which drivers are real and which are merely expected.
To make all this work, it can be a good idea for data scientists to talk in their proposal about likelihood, because business people who have not been through a quantitative MBA often do not understand or remember statistics. It is important as well that data scientists, before they begin, ask business people the “so what” questions if the situation analysis is inadequate.
Leading an internal analytics team
In some cases, analytical teams will be built internally. Where this occurs, it is really important that the analytic leader have good people skills. They need as well to be able to set expectations that people will be making decisions from data and analysis. This includes having the ability to push back when someone comes to them with a recommendation based on gut feel.
The leader needs to hire smart analysts. To keep them, the leader needs to provide a stimulating and supportive work environment. Tom Davenport says analysts are motivated by interesting and challenging work that allows them to utilize their highly specialized skills. As with millennials, money is nice for analysts, but they are motivated more by exciting work and by having the opportunity to grow and stretch their skills. Please know that data scientists want to spend time refining analytical models rather than doing simple analyses and report generation. Most importantly, they want to do important work that makes a meaningful contribution. To do this, they want to feel supported and valued but have autonomy at work. This includes the freedom to organize their work. At the same time, analysts like to work together. And they like to be surrounded by other smart and capable colleagues. Make sure to treat your data scientists as a strategic resource. This means you need development plans, career plans, and performance management processes.
As we have discussed, make sure to do your homework before contracting with or hiring data scientists. Once you have done your homework, if you are an analytic leader, make sure that you create a stimulating environment. Additionally, prove the value of analytics by signing up for results that demonstrate data modeling efficacy. To do this, look for business problems where analytics will make a big difference. And finally, if you need an analytics leader to emulate, look no further than Brian Cornell, the new CEO of Target.
Myles on Twitter: @MylesSuer
Enterprise IT is in a state of constant evolution. As a result, business processes and technologies become increasingly difficult to change and more costly to keep up-to-date. The solution to this predicament is an Enterprise Architecture (EA) process that can provide a framework for an optimized IT portfolio. An IT Optimization strategy should be based on a comprehensive set of architectural principles which ensure consistency and make IT more responsive, efficient, and economical.
The rationalization, standardization, and consolidation process helps organizations understand their current EA maturity level and move forward on the appropriate roadmap. As they undertake the IT Optimization journey, the IT architecture matures through several stages, leveraging IT Optimization Architecture Principles to attain each level of maturity.
Level 1: The first step involves helping a company develop its architecture vision and operating model, with attention to cost, globalization, investiture, or whatever is driving the company strategically. Once that vision is in place, enterprise architects can guide the organization through an iterative process of rationalization, consolidation, and eventually shared-services and cloud computing.
Level 2: The rationalization exercise helps an organization identify what standards to move towards as they eliminate the complexities and silos they have built up over the years, along with the specific technologies that will help them get there.
Depending on the company, Rationalization could start with a technical discussion and be IT-driven; or it could start at a business level. For example, a company might have distributed operations across the globe and desire to consolidate and standardize its business processes. That could drive change in the IT portfolio. Or a company that has gone through mergers and acquisitions might have redundant business processes to rationalize.
Rationalizing involves understanding the current state of an organization’s IT portfolio and business processes, and then mapping business capabilities to IT capabilities. This is done by developing scoring criteria to analyze the current portfolio, and ultimately by deciding on the standards that will propel the organization forward. Standards are the outcome of a rationalization exercise.
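As a rough illustration of what “developing scoring criteria to analyze the current portfolio” can look like, here is a minimal sketch. The criteria names, weights, ratings, and application names are all hypothetical; a real rationalization exercise would agree on its own criteria with business and IT stakeholders.

```python
# Hypothetical scoring criteria and weights (assumptions for illustration only).
WEIGHTS = {"business_fit": 0.4, "technical_health": 0.3, "cost_efficiency": 0.3}

# Hypothetical portfolio: each application is mapped to the business
# capability it supports and rated 1 (poor) to 5 (strong) per criterion.
portfolio = [
    {"name": "CRM-A", "capability": "customer management",
     "ratings": {"business_fit": 4, "technical_health": 3, "cost_efficiency": 2}},
    {"name": "CRM-B", "capability": "customer management",
     "ratings": {"business_fit": 3, "technical_health": 5, "cost_efficiency": 4}},
    {"name": "Billing-X", "capability": "billing",
     "ratings": {"business_fit": 5, "technical_health": 2, "cost_efficiency": 3}},
]

def score(app):
    """Weighted sum of the application's ratings, rounded to two decimals."""
    return round(sum(WEIGHTS[c] * app["ratings"][c] for c in WEIGHTS), 2)

def pick_standards(apps):
    """For each business capability, keep the highest-scoring application."""
    best = {}
    for app in apps:
        cap = app["capability"]
        if cap not in best or score(app) > score(best[cap]):
            best[cap] = app
    return {cap: app["name"] for cap, app in best.items()}

standards = pick_standards(portfolio)
```

In this sketch the two overlapping CRM systems are compared on the same weighted scale, and the higher-scoring one becomes the candidate standard for that capability. The weights carry the real judgment, so they deserve more debate than the arithmetic.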
Standardized technology represents the second level of EA maturity. Organizations at this level have evolved beyond isolated independent silos. They have well-defined corporate governance and procurement policies, which yield measurable cost savings through reduced software licenses and the elimination of redundant systems and skill sets.
Level 3: Consolidation entails reducing the footprint of your IT portfolio. That could involve consolidating the number of database servers, application servers and storage devices, consolidating redundant security platforms, or adopting virtualization, grid computing, and related consolidation initiatives.
Consolidation may be a by-product of another technology transformation, or it may be the driver of these transformations. But whatever motivates the change, the key is to be in alignment with the overall business strategy. Enterprise architects understand where the business is going so they can pick the appropriate consolidation strategy.
Level 4: One of the key outcomes of a rationalization and consolidation exercise is the creation of a strategic roadmap that continually keeps IT in line with where the business is going.
Having a roadmap is especially important when you move down the path to shared services and cloud computing. For a company that has a very complex IT infrastructure and application portfolio, having a strategic roadmap helps the organization to move forward incrementally, minimizing risk, and giving the IT department every opportunity to deliver value to the business.
Speed is the top challenge facing IT today, and it’s reaching crisis proportions at many organizations. Specifically, IT needs to deliver business value at the speed that the business requires.
The challenge does not end there: this has to be accomplished without compromising cost or quality. Many people have argued that you only get two out of three on the Speed/Cost/Quality triangle, but I believe that achieving all three is the central challenge facing Enterprise Architects today. Many people I talk to are looking at agile technologies, and in particular Agile Data Integration.
There have been a lot of articles written about the challenges, but it’s not all doom and gloom. Here is something you can do right now to dramatically increase the speed of your project delivery while improving cost and quality at the same time: take a fresh look at your Agile Data Integration environment, and specifically at Data Virtualization. Data Virtualization offers the opportunity to simplify and speed up the data part of enterprise projects. And this is the place where more and more projects are spending 40% or more of their time. For more information and an industry perspective, you can download the latest Forrester Wave report for Data Virtualization, Q1 2015.
Here is a quick example of how you can use Data Virtualization technology for rapid prototyping to speed up business value delivery:
- Use data virtualization technology to present a common view of your data to your business-IT project teams.
- IT and business can collaborate in real time to access and manage data from a wide variety of very large data sources – eliminating the long, slow cycles of passing specifications back and forth between business and IT.
- Your teams can discover, profile, and manage data using a single virtual interface that hides the complexity of the underlying data.
- By working with a virtualization layer, you are assured that your teams are using the right data, and data that can be verified by linking it to a Business Glossary with clear terms, definitions, owners, and business context to reduce the chance of misunderstandings and errors.
- Leading offerings in this space include data quality and data masking tools in the interface, ensuring that you improve data quality in the process.
- Data virtualization means that your teams can be delivering in days rather than months and faster delivery means lower cost.
There has been a lot of interest in agile development, especially as it relates to data projects. Data Virtualization is a key tool to accelerate your team in this direction.
Informatica has a leading position in the Forrester report due to the productivity of the Agile Data Integration environment but also because of the integration with the rest of the Informatica platform. From an architect’s point of view it is critical to start standardizing on an enterprise data management platform. Continuing data and data tool fragmentation will only slow down future project delivery. The best way to deal with the growing complexity of both data and tools is to drive standardization within your organizations.
I recently got to talk to several senior IT leaders about their views on information governance and analytics. Participating were a telecom company, a government transportation entity, a consulting company, and a major retailer. Each shared openly in what was a free flow of ideas.
The CEO and corporate culture are critical to driving a fact-based culture
I started this discussion by sharing the COBIT Information Life Cycle. Everyone agreed that the starting point for information governance needs to be business strategy and business processes. However, this prompted an extremely interesting discussion about enterprise analytics readiness. Most said that they are in the midst of leading the proverbial horse to water; in this case, the horse is the business. The CIO in the group said that he personally is all about the data and making factual decisions, but his business is not really there yet. I asked everyone at this point about the importance of culture and the CEO. Everyone agreed that the CEO is incredibly important in driving a fact-based culture. Apparently, people like the new CEO of Target are in the vanguard and not yet the mainstream.
KPIs need to be business drivers
The above CIO said that too many of his managers are focused on day-to-day operations and don’t understand the value of analytics or of predictive analytics. This CIO said that he needs to teach the business to think analytically and to understand how analytics can help drive the business, as well as how to use Key Performance Indicators (KPIs). The enterprise architect in the group shared at this point that he had previously worked for a major healthcare organization. When the organization was asked to determine a list of KPIs, they came back with 168 KPIs. Obviously, this could not work, so he explained to the business that an effective KPI must be a “driver of performance”. He stressed to the healthcare organization’s leadership the importance of having fewer KPIs and of building those that remain around business capabilities and performance drivers.
IT increasingly needs to understand its customers’ business models
I shared at this point that I visited a major Italian bank a few years ago. The key leadership had high-definition displays that would roll by a new analytic every five minutes. Everyone laughed at the absurdity of having so many KPIs. But with this said, everyone felt that they needed to get business buy-in, because only the business can derive the value from acting upon the data. According to this group of IT leaders, this is causing them more and more to understand their customers’ business models.
Others said that they were trying to create an omni-channel view of customers. The retailer wanted to get more predictive. While Theodore Levitt said the job of marketing is to create and keep a customer, this retailer is focused on keeping customers and bringing them back more often. They want to use customer data to give customers offers that increase sales, much like what I described recently was happening at 58.com, eBay, and Facebook.
Most say they have limited governance maturity
We talked about where people are in their governance maturity. Even though I wanted to gloss over this topic, the group wanted to spend time here and compare notes with each other. Most said that they were at stage 2 or 3 in a five-stage governance maturity process. One CIO asked whether anyone ever gets to level 5. Like analytics, governance was being pushed forward by IT rather than the business. Nevertheless, everyone said that they are working to get data stewards defined for each business function. At this point, I asked about the elements that COBIT 5 suggests go into good governance. I shared that it should include the following four elements: 1) clear information ownership; 2) timely, correct information; 3) clear enterprise architecture and efficiency; and 4) compliance and security. Everyone felt the definition was fine but wanted specifics for each element. I referred them, and refer you, to my recent article in COBIT Focus.
CIO says they are the custodians of data only
At this point, one of the CIOs said something incredibly insightful: we are not data stewards. This has to be done by the business; IT is the custodian of the data. More specifically, we should not manage data, but we should make sure that what the business needs done gets done with data. Everyone agreed with this point and even reused the term “data custodians” several times during the next few minutes. Debbie Lew of COBIT said the same thing just last week. According to her, “IT does not own the data. They facilitate the data.” From here, the discussion moved to security and data privacy. The retailer in the group was extremely concerned about privacy and felt that they needed masking and other data-level technologies to ensure a breach minimally impacts their customers. At this point, another IT leader in the group said that it is the job of IT leadership to make sure the business does the right things in security and compliance. I shared here that one of my CIO friends had said that “the CIOs at the retailers with breaches weren’t stupid—it is just hard to sell the business impact”. The CIO in the group said that we need to do risk assessments (also a big thing for COBIT 5) that get the business to say we have to invest to protect. “It is IT’s job to adequately explain the business risk.”
Is mobility a driver of better governance and analytics?
Several shared towards the end of the evening that mobility is an increasing impetus for better information governance and analytics. Mobility is driving business users and business customers to demand better information and thereby, better governance of information. Many said that a starting point for providing better information is data mastering. These attendees felt as well that data governance involves helping the business determine its relevant business capabilities and business processes. It seems that these should come naturally, but once again, IT for these organizations seems to be pushing the business across the finish line.
This is an age of technology disruption and digitization. Winners will be those organizations that can adapt quickly and drive business transformation on an ongoing basis.
When I first met John Schmidt, Vice President of Global Integration Services at Informatica, he asked me to visualize Business Transformation as “A modern tool like the internet and Google Maps, with which planning a road trip from New York to San Francisco with a number of stops along the way to visit friends or see some sights takes just minutes. So you’re halfway through the trip and a friend calls to say he has suddenly been called out of town, you get on your mobile phone and within a few minutes, you have a new roadmap and a new plan.”
So, why is it that creating a roadmap for an enterprise initiative takes months or even years, and that once such a plan is developed, it is nearly impossible to change even when new information or external events invalidate it? A single transformation is useful, but what you really want is the ability to transform your business on an ongoing basis. You need to be agile in planning the transformation initiative itself. Is it even feasible to achieve a planning capability for complex enterprise initiatives that could approach the speed and agility of cross-country road-trip planning?
The short answer is YES; you can get much faster if you do three things:
First, throw out old notions of how planning in complex corporate environments is done, while keeping in mind that planning an enterprise transformation is fundamentally different than planning a focused departmental initiative.
Second, invest in tools equivalent to Google Maps for building the enterprise roadmap. Google Maps works because it leverages a database of information about roads, rules of the roads, related local services, and points of interest. In short, Google Map the enterprise, which is not as onerous as it sounds.
Third, develop a team of Enterprise Architects and planners with the skills and discipline to use the BOST™ Framework to maintain the underlying reference data about the business, its operations, the systems that support it, and the technologies that they are based on. This will provide the execution framework for your organization to deliver the data to fuel your business initiatives and digital strategy.
The result is a closer alignment of your business and IT organizations, fewer errors due to communication issues, and, because your business plans are linked directly to the underlying technical implementation, quicker delivery of business value.
This is not some “pie in the sky” theory or a futuristic dream. What you need is a tool like Google Maps for Business Transformation. The tool is the BOST™ Toolkit, which leverages the BOST™ Framework. Through models, elements, and associated relationships built around an underlying metamodel, the framework interprets enterprise processes using a four-dimensional view driven by business, operations, systems, and technology. Informatica, in collaboration with certified partners, built the BOST™ Framework. It provides an Architecture-led Planning approach to business transformation.
Benefits of Architecture-led Planning
The Architecture-led Planning approach is effective when applied with governance and oversight. The following four features describe the benefits:
Enablement of Business and IT Collaboration – Uses a common reference model to facilitate cross-functional business alignment, as well as alignment between business and IT. The model gets everyone on the same page, regardless of line of business, location, or IT function. This model explicitly and dynamically starts with business strategy and links from there to the technical implementation.
Data-driven Planning – Being able to capture data in a structured repository helps with rapid planning. A data-driven plan makes it dynamic and adaptable to changing circumstances. When the plan changes, rather than updating dozens of documents, simply apply the change to the relevant components in the enterprise model repository and all business and technical model views that reference that component update automatically.
Cross-Functional Decision Making – Cross-functional decision-making is facilitated in several ways. First, by showing interdependencies between functions, business operations, and systems, the holistic view helps each department or team to understand the big-picture and its role in the overall process. Second, the future state architectural models are based on a view of how business operations will change. This provides the foundation to determine the business value of the initiative, measure your progress, and ultimately report the achievement of the goals. Quantifiable metrics help decision makers look beyond the subjective perspectives and agree on fact-based success metrics.
Reduced Execution Risk – Reduced execution risk results from having a robust and holistic plan based on a rigorous analysis of all the dependent enterprise components in the business, operations, systems and technology view. Risk is reduced with an effective governance discipline both from a program management as well as from an architectural change perspective.
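The Data-driven Planning benefit described above rests on one structural idea: views reference components in a shared repository instead of copying them, so a single change shows up everywhere. The sketch below illustrates that idea only. The class, method names, component IDs, and attributes are hypothetical and are not taken from the BOST™ Toolkit.

```python
class ModelRepository:
    """Toy enterprise model repository: views hold references, not copies."""

    def __init__(self):
        self.components = {}  # component id -> attribute dict
        self.views = {}       # view name -> ordered list of component ids

    def add_component(self, cid, **attrs):
        self.components[cid] = dict(attrs)

    def add_view(self, name, component_ids):
        self.views[name] = list(component_ids)

    def update_component(self, cid, **changes):
        # One change in one place; no documents to hunt down and edit.
        self.components[cid].update(changes)

    def render(self, name):
        # Views are rendered from current component data on every call.
        lines = []
        for cid in self.views[name]:
            comp = self.components[cid]
            lines.append(f"{cid}: {comp['status']} (owner: {comp['owner']})")
        return lines


repo = ModelRepository()
repo.add_component("order-to-cash", status="planned", owner="Finance")
repo.add_component("billing-system", status="planned", owner="IT")
repo.add_view("business view", ["order-to-cash"])
repo.add_view("technical view", ["order-to-cash", "billing-system"])

# The plan changes: apply it once to the component, not to each view.
repo.update_component("order-to-cash", status="in progress")

business = repo.render("business view")
technical = repo.render("technical view")
```

Both the business view and the technical view now report the new status for the shared component without either view being edited, which is the behavior the paragraph on Data-driven Planning describes.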
Business Transformation with Informatica
Integrated Program Planning is for organizations that need large or complex Change Management assistance. Examples of candidates for Integrated Program Planning include:
Enterprise Initiatives: Large-scale mergers or acquisitions, switching from a product-centric operating model to more customer-centric operations, restructuring channel or supplier relationships, rationalizing the company’s product or service portfolio, or streamlining end-to-end processes such as order-to-cash, procure-to-pay, hire-to-retire or customer on-boarding.
Top-level Directives: Examples include board-mandated data governance, regulatory compliance initiatives that have broad organizational impacts such as data privacy or security, or risk management initiatives.
Expanding Departmental Solutions into Enterprise Solutions: Successful solutions in specific business areas can often be scaled up to become cross-functional enterprise-wide initiatives. For example, a successful customer master data initiative in marketing can be expanded into an enterprise-wide Customer Information Management solution used by sales, product development, and customer service for an omnichannel customer experience.
The BOST™ Framework identifies and defines enterprise capabilities. These capabilities are modularized as reconfigurable and scalable business services. These enterprise capabilities are independent of organizational silos and politics, which gives strategists, architects, and planners the means to drive for high performance across the enterprise, regardless of the shifting set of strategic business drivers.
The BOST™ Toolkit facilitates building and implementing new or improved capabilities, adjusting business volumes, and integrating with new partners or acquisitions through common views of these building blocks and through reusing solution components. In other words, Better, Faster, Cheaper projects.
The BOST™ View creates a visual understanding of the relationship between business functions, data, and systems. It helps with the identification of relevant operational capabilities and underlying support systems that need to change in order to achieve the organization’s strategic objectives. The result will be a more flexible business process with greater visibility and the ability to adjust to change without error.
I won’t say I’ve seen it all; I’ve only scratched the surface in the past 15 years. Below are some of the mistakes I’ve made or fixed during this time.
MongoDB as your Big Data platform
Ask yourself, why am I picking on MongoDB? The NoSQL database most abused at this point is MongoDB. While Mongo has an aggregation framework that tastes like MapReduce and even a very poorly documented Hadoop connector, its sweet spot is as an operational database, not an analytical system.
RDBMS schema as files
You dumped each table from your RDBMS into a file and stored that on HDFS, and you now plan to use Hive on it. You know that Hive is slower than an RDBMS; it’ll use MapReduce even for a simple select. Next, let’s look at row sizes; you have flat files measured in single-digit kilobytes.
Hadoop does best on large sets of relatively flat data. I’m sure you can create an extract that’s more de-normalized.
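As a rough illustration of what a more de-normalized extract could look like, here is a minimal Python sketch. The `customers` and `orders` tables, their column names, and the `denormalize` helper are all hypothetical and invented for this example; the idea is simply to join the lookup data into each row before writing it out, so that each record is wide and self-contained rather than one small file per table.

```python
import json

# Hypothetical normalized tables, as they might be dumped from an RDBMS.
customers = [
    {"customer_id": 1, "name": "Ada", "region": "EU"},
    {"customer_id": 2, "name": "Grace", "region": "US"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 2, "total": 40.0},
    {"order_id": 102, "customer_id": 1, "total": 10.0},
]


def denormalize(order_rows, customer_rows):
    """Fold customer attributes into every order row (a one-time join)."""
    by_id = {c["customer_id"]: c for c in customer_rows}
    flat = []
    for order in order_rows:
        customer = by_id.get(order["customer_id"], {})
        flat.append({
            **order,
            "customer_name": customer.get("name"),
            "customer_region": customer.get("region"),
        })
    return flat


flat_rows = denormalize(orders, customers)

# One JSON document per line: a flat layout that batch tools can scan
# without joining back to a second file.
json_lines = [json.dumps(row, sort_keys=True) for row in flat_rows]
```

In practice the join would more likely be done in the source database or in the extract job itself; the point is only that the join happens once, at extract time, instead of on every query.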
Instead of creating a single Data Lake, you created a series of data ponds or a data swamp. Conway’s law has struck again; your business groups have created their own mini-repositories and data analysis processes. That doesn’t sound bad at first, but with different extracts and ways of slicing and dicing the data, you end up with different views of the data, i.e., different answers for some of the same questions.
Schema-on-read doesn’t mean, “Don’t plan at all,” but it means “Don’t plan for every question you might ask.”
Missing use cases
Vendors, to escape the constraints of departmental funding, are selling the idea of the data lake. The byproduct of this is that the business has lost sight of real use cases. The data-lake approach can be valid, but you won’t get much out of it if you don’t have actual use cases in mind.
It isn’t hard to come up with use cases, but that is always an afterthought. The business should start thinking of the use cases when their databases can’t handle the load.
To do a larger bit of analytics, you may need a bigger tool set that may include Hive, Pig, MapReduce, R, and more.
As I have shared within the posts of this series, businesses are using analytics to improve their internal and external facing business processes and to strengthen their “right to win” within the markets that they operate. Like healthcare institutions across the country, UPMC is striving to improve its quality of care and business profitability. One educational healthcare CEO put it to me this way: “if we can improve our quality of service, we can reduce costs while we increase our pricing power”. In UPMC’s case, they believe that the vast majority of their costs are in a fraction of their patients, but they want to prove this with real data and then use this information to drive their go-forward business strategies.
Getting more predictive to improve outcomes and reduce cost
Armed with this knowledge, UPMC’s leadership wanted to use advanced analytics and predictive modeling to improve clinical and financial decision making. Taking this action was seen as producing better patient outcomes and reducing costs. A focus area for analysis involved creating “longitudinal records” for the complete cost of providing particular types of care. For those who aren’t versed in time series analysis, longitudinal analysis uses a series of observations obtained from many respondents over time to derive a relevant business insight. When I was involved in healthcare, I used this type of analysis to interrelate employee and patient engagement results with healthcare outcomes. In UPMC’s case, they wanted to use this type of analysis to understand, for example, the total end-to-end cost of a spinal surgery. UPMC wanted to look beyond the cost of surgery and account for the pre-surgery care and recovery-related costs. However, to do this for the entire hospital meant that it needed to bring together data from hundreds of sources across UPMC and outside entities, including labs and pharmacies. By having this information, UPMC’s leadership saw the potential to create an accurate and comprehensive view which could be used to benchmark future procedures. Additionally, UPMC saw the potential to automate the creation of patient problem lists or examine clinical practice variations. But like the other case studies that we have reviewed, these steps required trustworthy and authoritative data to be accessed with agility and ease.
UPMC starts with a large, multiyear investment
In October 2012, UPMC made a $100 million investment to establish an enterprise analytics initiative to bring together, for the first time, clinical, financial, administrative, genomic and other information in one place. Tom Davenport, the author of Competing on Analytics, suggests in his writing that establishing an enterprise analytics capability represents a major step forward because it allows enterprises to answer the big questions, to better tie strategy and analytics, and to finally rationalize applications interconnect and business intelligence spending. As UPMC put its plan together, it realized that it needed to impact more than 1200 applications. It also realized that it needed one system to manage data integration, master data management, and eventually complex event processing capabilities. At the same time, it addressed the people side of things by creating a governance team to manage data integrity improvements, ensuring that trusted data populates enterprise analytics and provides transparency into data integrity challenges. One of UPMC’s goals was to provide self-service capabilities. According to Terri Mikol, a project leader, “We can’t have people coming to IT for every information request. We’re never going to cure cancer that way.” Here is an example of the promise that occurred within the first eight months of this project. Researchers were able to integrate, for the first time ever, clinical and genomic information on 140 patients previously treated for breast cancer. Traditionally, these data have resided in separate information systems, making it difficult, if not impossible, to integrate and analyze dozens of variables. The researchers found intriguing molecular differences in the makeup of pre-menopausal vs. post-menopausal breast cancer, findings which will be further explored. For UPMC, this initial cancer insight is just the starting point of their efforts to mine massive amounts of data in the pursuit of smarter medicines.
Building the UPMC Enterprise Analytics Capability
To create their enterprise analytics platform, UPMC determined it was critical to establish “a single, unified platform for data integration, data governance, and master data management,” according to Terri Mikol. The solution required a number of key building blocks. The first was data integration to collect and cleanse data from hundreds of sources and organize it into repositories that would enable fast, easy analysis and reporting by and for end users.
Specifically, the UPMC enterprise analytics capability pulls clinical and operational data from a broad range of sources, including systems for managing hospital admissions, emergency room operations, patient claims, health plans, electronic health records, as well as external databases that hold registries of genomic and epidemiological data needed for crafting personalized and translational medicine therapies. UPMC has integrated quality checked source data in accordance with industry-standard healthcare information models. This effort included putting together capabilities around data integration, data quality and master data management to manage transformations and enforce consistent definitions of patients, providers, facilities and medical terminology.
The cleansed and harmonized data is organized into specialized genomics databases, multidimensional warehouses, and data marts. The approach makes use of traditional data warehousing approaches as well as big data capabilities to handle unstructured data and natural language processing. UPMC has also deployed analytical tools that allow end users to exploit the data enabled from the Enterprise Analytics platform. The tools drive everything from predictive analytics and cohort tracking to business and compliance reporting. And UPMC did not stop here. If their data had value, then it needed to be secured. UPMC created data audits and data governance practices. They also implemented a dynamic data masking solution that ensures data security and privacy.
As I have discussed, many firms are pushing point silo solutions into their environments, but as UPMC shows, this limits their ability to ask the bigger business questions or, in UPMC’s case, to discover things that can change people’s lives. Analytics are more and more a business enabler if they are organized as an enterprise analytics capability. I have also come to believe that analytics have become a foundational capability for every firm’s right to win. It informs a coherent set of capabilities and establishes a firm’s go-forward right to win. For this, UPMC is a shining example of getting things right.
Author Twitter: @MylesSuer
Recently, I got to attend the Predictive Analytics Summit in San Diego. It felt great to be in a room full of data scientists from around the world; all my hidden statistics, operations research, and even modeling background came back to me instantly. I was most interested to learn what this vanguard was doing as well as any lessons learned that could be shared with the broader analytics audience. Presenters ranged from Internet leaders to more traditional companies like Scotts Miracle Gro. Brendan Hodge of Scotts Miracle Gro in fact said that, as a 125-year-old company, he feels like “a dinosaur at a mammal convention”. So in the space that follows, I will share my key takeaways from some of the presenters.
Fei Long from 58.com
58.com is the Craigslist, Yelp, and Monster of China. Fei shared that 58.com is using predictive analytics to recommend resumes to employers and to drive more intelligent real time bidding for its products. Fei said that 58.com has 300 million users—about the number of people in the United States. Most interesting, Fei said that predictive analytics has driven a 10-20% increase in 58.com’s click through rate.
Ian Zhao from eBay
Ian said that eBay is starting to increase the footprint of its data science projects. He said that historically the focus for eBay’s data science was marketing, but today eBay is applying data science to sales and HR. Provost and Fawcett agree in “Data Science for Business” by saying that “the widest applications of data mining techniques are in marketing for tasks such as target marketing, online advertising, and recommendations for cross-selling”.
Ian said that in the non-marketing areas, they are finding a lot less data. The data is scattered across data sources and requires a lot more cleansing. Ian is using things like time series and ARIMA to look at employee attrition. One thing that Ian found particularly interesting is that there is a strong correlation between attrition and bonus payouts. Ian said it is critical to leave ample time for data prep. He said that it is important to start the data prep process by doing data exploration and discovery. This includes confirming that data is available for hypothesis testing. Sometimes, Ian said, the data prep process can include imputing data that is not available in the data set and validating data summary statistics. With this, Ian said that data scientists need to dedicate time and resources to determining which things are drivers. He said that with the business, data scientists should talk about likelihood because business people in general do not understand statistics. It is important as well that data scientists ask business people the “so what” questions. Data scientists should narrow things down to a dollar impact.
Barkha Saxena from Poshmark
Barkha is trying to model the value of user growth. Barkha said that this matters because Poshmark wants to be the #1 community driven marketplace. They want to use data to create a “personal boutique experience”. With 700,000 transactions a day, they are trying to measure customer lifetime value by implementing a cohort analysis. What was most interesting in Barkha’s data is that she discovered repeatable performance across cohorts. In their analysis, different models work better based upon the data, so a lot of time goes into procedurally determining the best model fit.
Meagan Huth from Google
Meagan said that Google is creating something that they call People Analytics. They are trying to make all people decisions by science and data. They want to make it cheaper and easier to work at Google. They have found through their research that good managers lower turnover, increase performance, and increase workplace happiness. The most interesting thing that she says they have found is that the best predictor of being a good manager is being a good coach. They have developed predictive models around text threads, including those that occur in employee surveys, to ensure they have the data needed to improve.
Hobson Lane from Sharp Labs
Hobson reminded everyone of the importance of Nyquist (you need to sample data more than twice as fast as the fastest data event). This is especially important for organizations moving to the so-called Internet of Things. Many of these devices have extremely large data event rates. Hobson also discussed the importance of looking at variance against the line that gets drawn in a regression analysis. Sometimes, multiple lines can be drawn. He also discussed the problem of not having enough data to support the complexity of the decision that needs to be made.
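To make the Nyquist point concrete, here is a small Python sketch of what goes wrong when you sample too slowly. The 9 Hz signal and the 10 Hz and 20 Hz sampling rates are made-up numbers for illustration, not figures from the talk. Sampled at 10 Hz, a 9 Hz sine wave produces exactly the same readings as a 1 Hz wave running in the opposite direction, so the fast event is indistinguishable from a slow one.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample sin(2*pi*f*t) at t = 0, 1/rate, 2/rate, ..."""
    return [
        math.sin(2 * math.pi * freq_hz * k / sample_rate_hz)
        for k in range(n_samples)
    ]


def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency after sampling, folded into [0, rate / 2]."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


SIGNAL_HZ = 9.0  # illustrative "fastest data event"

# Under-sampled: 10 Hz is less than twice the 9 Hz signal.
too_slow = sample_sine(SIGNAL_HZ, 10.0, 20)
# A -1 Hz sine (1 Hz, sign-flipped) sampled at the same instants.
impostor = sample_sine(-1.0, 10.0, 20)
max_gap = max(abs(a - b) for a, b in zip(too_slow, impostor))

apparent_when_too_slow = alias_frequency(SIGNAL_HZ, 10.0)   # folds to 1 Hz
apparent_when_fast_enough = alias_frequency(SIGNAL_HZ, 20.0)  # stays at 9 Hz
```

Here `max_gap` is zero up to floating-point rounding, which is the aliasing problem in one number: the two sets of readings cannot be told apart. Raising the sampling rate above twice the signal frequency (20 Hz in this sketch) keeps the apparent frequency at the true 9 Hz.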
Ravi Iyer from Ranker
Ravi started by saying Ranker is a Yelp for everyone else. He then discussed the importance of having systematic data. A nice quote from him is as follows: “better data=better predictions”. Ravi discussed as well the topic of response bias. He said that asking about Coke can lead to a different answer than asking about Coke at a movie. Interestingly, he discussed how their research shows that millennials are really all about “the best”. I see this happening every time that I take my children out to dinner; there is no longer a cheap dinner out.
Ranjan Sinha at eBay
Ranjan discussed the importance of customer centric commerce and creating predictive models around it. At eBay, they want to optimize the customer experience and improve their ability to make recommendations. eBay is finding customer expectations are changing. For this reason, they want customer context to be modeled by looking at transactions, engagement, intent, account, and inferred social behaviors. With modeling completed, they are using complex event processing to drive a more automated response to data. An amazing example given was for Valentine’s Day, where they use a man’s partner’s data to predict the items that the man should get for his significant other.
Andrew Ahn from LinkedIn
Andrew is using analytics to create what he calls an economic graph and to make professionals more productive. One area where he personally is applying predictive analytics is LinkedIn’s sales solutions. In LinkedIn Sales Navigator, they display potential customers based upon the sales person’s demographic data; effectively, the system makes lead recommendations. However, they want to de-risk this potential interaction for sales professionals and potential customers. Andrew says at the same time that they have found through data analysis that small changes in a LinkedIn profile can lead to big changes. To put this together, they have created something that they call the social selling index. It looks at predictors that they have determined are statistically relevant, including member demographics, site engagement, and social network. The SSI score is viewed as a predictive index. Andrew says that they are trying to go from serendipity to data science.
Robert Wilde from Slacker Radio
Robert discussed the importance of simplicity and elegance in model building. He then went through a set of modeling issues to avoid. He said that modelers need to own the discussion of causality and cause and effect and how this can bias data interpretation. In addition, looking at data variance was stressed: what does one do when a line doesn’t have a single point fall on it? Additionally, Robert discussed what to do when correlation is strong, weak, or mistaken. Is it X or Y that has the relationship? Or worse yet, what do you do when there is coincidental correlation? This led to a discussion of forward and reverse causal inference. For this reason, Robert argued strongly for principal component analysis, which he said eliminates regression causational bias. At the same time, he suggested that models should be valued by complexity versus error rates.
Parsa Bakhtary from Facebook
Parsa has been looking at what games generate revenue and what games do not generate revenue for Facebook. Facebook, amazingly, has over 1,000 revenue-bearing games. For this reason, Facebook wants to look at the Lifetime Value of Customers for Facebook Games, that is, the dollar value of a relationship. Parsa said, however, that there is a problem: only 20% pay for their games. Parsa argued that customer lifetime value (which was developed in the 1950s) doesn’t really work for apps where everyone’s lifetime is not the same. Additionally, social and mobile gamers are not particularly loyal. He says that he, therefore, has to model individual games for their first 90 days across all periods of joining and then look at the cumulative revenue curves.
So we have seen here a wide variety of predictive analytics techniques being used by today’s data scientists. To me this says that predictive analytical approaches are alive and kicking. This is good news and shows that data scientists are trying to enable businesses to make better use of their data. Clearly, a key step that holds data scientists back today is data prep. While it is critical to leave ample time for data prep, it is also essential to get quality data to ensure models are working appropriately. At the same time, data prep needs to support imputing data that is not available within the original data set.
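The talk did not go into implementation detail, so the following Python sketch is only one plausible reading of “first 90 days across all periods of joining”: group players by the day they joined, line up every purchase by days since joining, cut off at 90 days, and accumulate revenue per player. The function name, the input layout, and the sample numbers are all assumptions made up for illustration.

```python
from collections import defaultdict


def cumulative_revenue_curves(join_days, purchases, horizon_days=90):
    """
    join_days: dict of player_id -> day number the player joined.
    purchases: list of (player_id, day number, amount).
    Returns dict of join day (cohort) -> list where entry d is the
    cumulative revenue per player through d days after joining.
    """
    cohort_sizes = defaultdict(int)
    for join_day in join_days.values():
        cohort_sizes[join_day] += 1

    # Revenue bucketed by cohort and by player age in days.
    by_age = {cohort: [0.0] * horizon_days for cohort in cohort_sizes}
    for player_id, day, amount in purchases:
        cohort = join_days[player_id]
        age = day - cohort
        if 0 <= age < horizon_days:  # ignore spend outside the window
            by_age[cohort][age] += amount

    curves = {}
    for cohort, revenue_by_age in by_age.items():
        running = 0.0
        curve = []
        for revenue in revenue_by_age:
            running += revenue
            curve.append(running / cohort_sizes[cohort])
        curves[cohort] = curve
    return curves


# Hypothetical data: two players joined on day 0, one on day 30.
join_days = {"a": 0, "b": 0, "c": 30}
purchases = [("a", 0, 5.0), ("a", 10, 5.0), ("b", 95, 20.0), ("c", 31, 3.0)]
curves = cumulative_revenue_curves(join_days, purchases)
```

Because every cohort is measured over the same 90-day window, the curves can be compared directly even though the players joined at different times, which sidesteps the unequal-lifetime problem Parsa described.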
Solution Brief: Data Prep
Author Twitter: @MylesSuer