Today, 80% of the effort in Big Data projects goes into extracting, transforming and loading data (ETL). Hortonworks and Informatica have teamed up so that organizations can apply the power of Informatica Big Data Edition and their existing skills to make these operations more efficient and better leverage their resources in a modern data architecture (MDA).
Next Generation Data Management
The Hortonworks Data Platform and Informatica BDE enable organizations to optimize their ETL workloads with long-term storage and processing at scale in Apache Hadoop. With Hortonworks and Informatica, you can:
• Leverage all internal and external data to achieve the full predictive power that drives the success of modern data-driven businesses.
• Optimize the entire big data supply chain on Hadoop, turning data into actionable information to drive business value.
Imagine a world where you would have access to your most strategic data in a timely fashion, no matter how old the data is, where it is stored, or in what format. By leveraging Hadoop’s power of distributed processing, organizations can lower the costs of data storage and processing and support large data distribution with high throughput and concurrency.
Overall, the alignment between business and IT grows. The Big Data solution based on Informatica and Hortonworks allows for a complete data pipeline to ingest, parse, integrate, cleanse, and prepare data for analysis natively on Hadoop, thereby increasing developer productivity by 5x over hand-coding.
Where Do We Go From Here?
At the end of the day, Big Data is not about the technology. It is about the deep business and social transformation every organization will go through. The possibilities to make more informed decisions, identify patterns, proactively address fraud and threats, and predict pretty much anything are endless.
This transformation will happen as the technology is adopted and leveraged by more and more business users. We are already seeing the transition from 20-node clusters to 100-node clusters and from a handful of technology-savvy users relying on Hadoop to hundreds of business users. Informatica and Hortonworks are accelerating the delivery of actionable Big Data insights to business users by automating the entire data pipeline.
Try It For Yourself
On September 10, 2014, Informatica announced a 60-day trial version of the Informatica Big Data Edition for the Hortonworks Sandbox. This free trial enables you to download and test out the Big Data Edition on your notebook or spare computer and experience your own personal Modern Data Architecture (MDA).
If you happen to be at Strata this October 2014, please meet us at our booths: Informatica #352 and Hortonworks #117. Don’t forget to participate in our Passport Program and join our session at 5:45 pm ET on Thursday, October 16, 2014.
I had the honour recently of being asked to give the opening keynote presentation at Informatica’s Information Potential Day in London. It was a really well attended event and I enjoyed the discussions during the breaks and over lunch with many Informatica customers.
The presentation I gave was entitled Information Potential in the Information Age. In it I tried to get several messages over to the audience. The gist was as follows. We are now in an internet age where the customer is king. Customers can compare prices and switch loyalties at the click of a mouse or at the touch of a screen on a mobile device. Competition for the wallet share is coming from everywhere, with new web businesses rewiring some industries. Therefore it’s not surprising that the recent 16th annual survey of CEOs by PwC showed that customer growth, retention and loyalty were top of the agenda, along with the need to improve operational effectiveness.
These business priorities are driving new information requirements. On the customer front there is a need for new data sources for deeper customer insight, a trusted 360° view of the customer, integrated customer master data, and a customer-oriented data warehouse. Operational effectiveness, on the other hand, requires on-demand information services, lower latency data, event-driven data integration, stream processing and decision management as well as operational BI at the point of need. Almost all of these requirements are dependent on one thing - data integration.
Yet, having been in this industry and in data management for over 32 years now, I can’t remember a time when I have seen data so widely distributed. And it is increasing. We have multiple instances of OLTP applications in different geographies or business units, with data being sent everywhere. Also in the analytical world, more and more data warehouse appliances are appearing, creating ‘islands’ of analytical data. Data is now in the cloud as well as on premises, and the arrival of Big Data is adding more platforms such as NoSQL DBMSs and Hadoop into the mix. Also, unstructured content is still scattered across the enterprise, and with new data sources like social media data, machine-generated clickstream, GPS and sensor data now upon us, we are facing a data deluge.
So while we have the potential there to act on deeper insights to improve customer growth and operational effectiveness, the only way we can produce this information is to integrate data. This need is now everywhere. We have made progress over the years in some areas. Data warehousing and master data management both require data integration. Cloud computing also needs it to get data out to cloud applications or collect it from these applications. Big data needs it too. We hand-coded data integration 20 years ago before the emergence of data integration software. My question with the advent of Big Data is: why is it ‘cool’ to hand-write code again to do this just because it is on new technology platforms like Hadoop? Surely we could be more productive leveraging existing investments in data integration, as long as they extend their support into the Big Data arena. Oh, and before I forget, we are already at the point where the business user needs data integration. The emergence of multiple data warehouses in organizations has meant that almost every business analyst I know using a self-service BI tool to produce insights and dashboards needs to integrate data from multiple underlying data sources.
So while we have made progress, what we have often implemented is data integration on a project-by-project basis, whether it be for a data warehouse, for a master data entity, for cloud or for big data, let alone everything else we need it for. However, the danger here is that we miss a major opportunity to re-use metadata whereby we define data names and data integration mappings once and then use them to provision data to wherever it is needed. To do that we need common metadata, a common data management platform and enterprise data integration. We are surely at the point where the need for enterprise data integration is now upon us.
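The idea of defining names and mappings once and reusing them wherever data is provisioned can be sketched in a few lines of Python. This is only an illustration of the principle, not any vendor's API: the field names, the `CUSTOMER_MAPPING` table and the `provision` helper are all hypothetical.

```python
# Hypothetical shared metadata: each source's field names are mapped once
# onto common business names, then reused by every target that needs the data.
CUSTOMER_MAPPING = {
    "crm": {"cust_nm": "customer_name", "eml": "email"},
    "billing": {"CustomerName": "customer_name", "EmailAddr": "email"},
}


def provision(source: str, record: dict) -> dict:
    """Rename a source record's fields using the shared mapping."""
    mapping = CUSTOMER_MAPPING[source]
    # Fields without a mapping entry are dropped rather than guessed at.
    return {mapping[k]: v for k, v in record.items() if k in mapping}


# The same mapping can feed a warehouse load, a master data hub or a cloud app.
crm_row = provision("crm", {"cust_nm": "Ada Lovelace", "eml": "ada@example.com"})
billing_row = provision(
    "billing",
    {"CustomerName": "Ada Lovelace", "EmailAddr": "ada@example.com", "Ledger": 7},
)
```

Because both sources resolve to the same business names, a project that needs customer data can call `provision` instead of writing its own mapping.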
What’s the biggest obstacle to influencing traveler purchasing decisions in the hospitality industry?
Did you see the disturbing traveler survey results in the HotelExecutive.com article “The Evolution of Hotel Loyalty Programs: What Guests Expect in 2013”? It cited a study by Deloitte, which revealed that 30% of hotel loyalty program members are “at risk” of switching their preferred brand.
Also, although 50% of leisure travelers are members of a hotel loyalty program, they are not loyal to one brand. Sadly, most hotel loyalty programs have little or no impact on traveler purchasing decisions.
Why? Most hotel loyalty programs are not differentiated. Guest segmentation is unsophisticated. For example, a guest who stays 11 nights is treated the same way as someone who stays 49 nights because they are categorized under the same tier of loyalty. However, if a guest who stays 20 nights reserves a higher cost room than a similar guest, should you treat them differently?
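To make the contrast concrete, here is a small Python sketch of coarse tiering versus a more refined segment. The tier boundaries, the spend threshold and the names `tier` and `segment` are assumptions chosen for illustration; real programs will use their own rules.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str
    nights_per_year: int
    avg_room_rate: float  # average nightly spend, in a single currency


def tier(guest: Guest) -> str:
    """Coarse tiering by nights only (band edges are assumed)."""
    if guest.nights_per_year >= 50:
        return "platinum"
    if guest.nights_per_year >= 10:
        return "gold"
    return "member"


def segment(guest: Guest) -> str:
    """Finer segment: a frequency band combined with a spend band."""
    if guest.nights_per_year >= 50:
        frequency = "very-frequent"
    elif guest.nights_per_year >= 30:
        frequency = "frequent"
    elif guest.nights_per_year >= 10:
        frequency = "regular"
    else:
        frequency = "occasional"
    # The 250 threshold is a placeholder for "higher cost room".
    spend = "premium" if guest.avg_room_rate >= 250 else "standard"
    return f"{frequency}/{spend}"


a = Guest("A", 11, 120.0)
b = Guest("B", 49, 120.0)
c = Guest("C", 20, 400.0)
```

Under these assumed rules, guests A and B share one tier despite very different stay counts, while `segment` separates them and also distinguishes guest C by spend.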
Based on my 15 years of experience working with sales and marketing executives at leading hotels, resorts, and other hospitality organizations, I see a huge opportunity for those who want to increase market share. The key is to build a differentiated loyalty program, based on more refined and meaningful segments, which anticipate travelers’ personalized needs and allow you to deliver a consistent experience across all touchpoints.
So, what is holding sales and marketing back from creating more refined and meaningful segments for their loyalty programs? It’s not the applications that stand in their way. Most have implemented business intelligence, CRM systems, sales automation, service automation, and marketing automation.
The biggest obstacle is managing hospitality customer data. It’s distributed across the enterprise: locally at the hotels, in SaaS environments, or at the corporate data center. Take a look at this complexity:
- 1,000,000s of Guests
- 1,000s of Business Contacts
- 1,000s of Corporate Client Companies
- 1,000s of Meeting Planners
- 100s of Meeting Planning Companies
- 100s of Travel Agencies
- 1,000s of Travel Agents
- 100s of Brands
- 1,000s of Properties
- 1,000,000,000s of Interactions
- 1,000,000s of Transactions
This is an exponentially complex problem that can’t be managed in documents. Without a trusted, unified view into the amazingly complex network of hospitality customers, you can’t segment in a more refined and meaningful way.
Even though your valuable customer data is disconnected today, you can take steps to start connecting it to create a unified customer view. Others are already doing it. See Hyatt’s presentation at Informatica World for an example of how one innovative global hospitality organization mastered their hospitality customer data to power their sales and marketing vision. Hyatt implemented data integration, data quality, and master data management (MDM) solutions to gain a unified customer view.
With this approach, you can:
- Take disparate information for guests, corporate customers, and other customer types to create a trusted unified view of hospitality customers
- Manage business customer account hierarchies and household relationships so sales and marketing teams can visualize interactions between customers. For example, perhaps some of your top 100 business customers are also your top 100 leisure guests.
- Integrate this customer repository with transaction data like stays, reservations, events, food and beverage, customer service, web behavior, and social media to gain a trusted longitudinal view of the customer experience—from reservation through check-out.
- Create more refined and meaningful segments to gain key customer insights, better anticipate travelers’ personalized needs, and deliver a consistent experience across all touchpoints.
Are you interested in understanding more about this proven way to create a differentiated loyalty program and increase revenue and market share? If so, please leave a comment below and I will respond.
Larry Goldman is the president of AmberLeaf, a customer intelligence consultancy. With 18 years of experience helping clients make strategic use of their information, Larry excels at helping sales and marketing teams implement best practices for marketing strategy, customer analytics, customer segmentation, data warehousing, and database marketing. He has delivered business-value focused solutions for Business Intelligence, CRM, Master Data Management (MDM), and Database Marketing.
In the past four years, Larry has focused on the hospitality industry. His industry expertise also includes Telecommunications (Wireless and Long Distance), High Technology, Broadcasting, Newspaper, On-Line Media Content providers, Cable, Telematics (In-Car Communications and Services), Education, Manufacturing, Retail (catalog, e-commerce, brick and mortar), Consumer Packaged Goods, and Financial Services (Insurance, Brokerage, Mortgage). Larry is an author and popular speaker on the topic of customer centricity and the use of analytics to improve business results.
Access to information has always been extremely important to people and organizations. In an increasingly complex and interconnected world, data is an essential competitive advantage for companies. With rapidly growing data volumes, increased complexity, and high market speed, our goal is simple: to easily connect people and data.
Turning data into business outcomes has always been our value proposition at Heiler Software. Using the value of data and information potential is totally in line with the positioning of Informatica. Unleashing the potential of information will help to make the careers of our customers, partners and employees even better.
From the beginning of all conversations, from the announcement of the acquisition in October 2012 until today, Informatica’s managers and employees have always stuck to the promise they made. That is a great commitment for our employees and customers. Now Informatica has announced the exciting news that the acquisition of Heiler is completed.
Heiler is now a part of the Informatica family. Our entire team is looking forward to the future with Informatica. For Heiler, the door for an exciting and successful future is wide open. Informatica will provide our customers and our employees a promising perspective in a dynamic industry.
Hundreds of customers rely on Informatica’s multi-domain MDM platform to manage customer, location, and asset data, and synchronize accurate master data across operational and analytic systems. I am sure Informatica is committed to being a trusted partner and will work to ensure success with all Heiler’s products.
Heiler has just released PIM 7 to speed up the time to market with all products, across all sales and marketing channels. Also, Procurement 7.1 has been available since March 2013. Informatica is known for innovation. I am convinced that Informatica will continue investing in our business. Their goal is to generate real-time commerce business processes and create a unique customer experience for our customers’ business. Our award-winning PIM fits in the Universal MDM strategy to deliver on one vision: enabling our customers to offer the right product, for the right customer, from the right supplier, at the right time, via the right channels and locations. It is all about inspiring.
Joining forces will allow our customers to leverage Informatica’s expertise in data quality and data integration to deliver greater business value. With Informatica’s Data Quality offerings, our customers will be able to further accelerate the introduction of their products to market. Additionally, customers will be able to easily onboard data from their suppliers, then distribute it to their customers and partners electronically with Informatica B2B. We share a common goal to establish the combination of Informatica MDM and Heiler PIM as the gold standard in the industry.
Another benefit of the acquisition is that all customers will receive world-class support from Informatica’s Global Customer Support organization, which delivers a comprehensive set of support programs including 24×7 support across 10 regional support centers. Customers have ranked Informatica as #1 in customer satisfaction for seven years in a row. In addition, Informatica’s strong global partner ecosystem brings the right resources to solve business and technical challenges across more industries.
By reaching this important milestone, my mission as CEO of Heiler Software AG will be fulfilled. Personally, I’m going to stay connected to Informatica and I am excited to get involved in the future of this excellent and innovative company.
The future of Universal MDM is close to my heart.
Rolf J. Heiler, born 1959, married, three children, graduated in 1982 in business management, majoring in IT and process organization. In 1987, Rolf Heiler founded Heiler Software GmbH. In 2000, Heiler Software was quoted on the stock exchange in the “New Market” sector.
Title I HOPE’s (Homeless Outreach Program for Education) mission is to work to identify and enroll homeless students in the Clark County School District, collaborate with school personnel on homeless educational rights, and inform parents of the options available to their children under the McKinney-Vento Homeless Education Act. There have been over 6,800 students K-12th grade identified as homeless in our community.
The HOPE office strives to connect youth with resources and support services that will keep them in school. Services that are needed for our students range from school supplies to tutoring/mentoring to food. The HOPE staff partners with community providers to assist children with backpacks, food, clothing, shoes and hygiene items. The donations prepare students with basic needs which encourage them to come to school ready to learn.
Informatica World has been such a gracious contributor to the HOPE program. Last year the organization donated 1,000 food bags which ensured that all of our students who were tutored had a snack so they could focus on their work. You can check out the news story here about our great results.
Informatica also provided us with six iPads which were integrated into the A Place Called HOPE high school resource centers. Youth were able to gain experience using the latest in technology because of Informatica World. The support and supplies provided by the organization and the participants of the conference will make a true difference in the lives of students who are struggling with basic needs. The items donated will assist students in focusing on their education and will help them to be successful in life.
The Informatica World 2013 event this week is truly making a difference in the lives of thousands of homeless students in our community. Title I HOPE thanks everyone for their generosity and willingness to give.
This was a guest blog penned by Title I HOPE.
Informatica’s Vibe virtual data machine can streamline big data work and allow data scientists to be more efficient
Informatica introduced an embeddable Vibe engine for not only transformation, but also for data quality, data profiling, data masking and a host of other data integration tasks. It will have a meaningful impact on the data scientist shortage.
Some clear economic facts are already apparent in the current world of data. Hadoop provides a significantly less expensive platform for gathering and analyzing data; cloud computing (potentially) is a more economical computing location than on-premises, if managed well. These are clearly positive developments. On the other hand, the human resources required to exploit these new opportunities are actually quite expensive. When there is greater demand than can be met in the short term for a hot product, suppliers put customers “on allocation” to manage the distribution to the most strategic customers.
This is the situation with “data scientists,” this new breed of experts with quantitative skills, data management skills, presentation skills and deep domain expertise. Current estimates are that there are 60,000 – 120,000 unfilled positions in the US alone. Naturally, data scientists are “allocated” to the most critical (economically lucrative) efforts, and their time is limited to those tasks that most completely leverage their unique skills.
To address this shortage, industry turns to universities to develop curricula to manufacture data scientists, but this will take time. In the meantime, salaries for data scientists are very high. Unfortunately, most data science work involves a great deal of effort that does not require data science skills, especially in the areas of managing the data prior to the insightful analytics. Some estimates are that data scientists spend 50-80% of their time finding and cleaning data, managing their computing platforms and writing programs. Reducing this effort with better tools can not only make data scientists more effective, it can also have an impact on the most expensive component of big data – human resources.
Informatica today introduced Vibe, its embeddable virtual data machine to do exactly that. Informatica has, for over 20 years, provided tools that allow developers to design and execute transformation of data without the need for writing or maintaining code. With Vibe, this capability is extended to include data quality, masking and profiling and the engine itself can be embedded in the platforms where the work is performed. In addition, the engine can generate separate code from a single data management design.
In the case of Hadoop, Informatica designers can continue to operate in the familiar design studio, and have Vibe generate the code for whatever platform is needed. In this way, it is possible for an Informatica developer to develop these data management routines for Hadoop, without learning Hadoop or writing code in Java. And the real advantage is that the data scientist is freed from work that can be performed by those in lower pay grades and can parallelize that work too – multiple programmers and integration developers to one data scientist.
Vibe is a major innovation for Informatica that provides many interesting opportunities for its customers. Easing the data scientist problem is only one.
This is a guest blog penned by Neil Raden, a well-known industry figure as an author, lecturer and practitioner. He has in-depth experience as a developer, consultant and analyst in all areas of Analytics and Decision Services including Big Data strategy and implementation, Business Intelligence, Data Warehousing, Statistical/Predictive Modeling, Decision Management, and IT systems integration including assessment, architecture, planning, project management and execution. Neil has authored dozens of sponsored white papers and articles, is a blogger, and is co-author of “Smart (Enough) Systems” (Prentice Hall, 2007). He has 25 years of experience as an actuary, software engineer and systems integrator.
The data warehouse’s goal is timely delivery of trusted data to support decision-enabling insights. However, it’s difficult to get insights out of an environment that’s hard to see inside of. This is why, as much as is possible given the necessities of data privacy, a data warehouse should be turned into a glass house, allowing us to see data quality and business intelligence challenges as they truly are.
Trusted data is not perfect data. Trusted data is transparent data, honest about its imperfections, and realistic about the practical trade-offs between delivery and quality. You can’t fix what you can’t see, but even more important, concealing or ignoring known data quality issues is only going to decrease business users’ trust of the data warehouse. Perfect data is impossible, but the more control enforced wherever data originates, and the more monitoring performed wherever data flows, the better overall data quality will be in the warehouse.
The reality in data warehousing is that the primary focus is on delivery. The data warehouse team is tasked with extracting, transforming, integrating, and loading data into the warehouse within increasingly tight timeframes. Twenty years ago, monthly data warehouse loads were common. Ten years ago, weekly loads became the norm. Five years ago, daily loads were called for. Nowadays, near-real-time analytics demands the data warehouse be loaded more frequently than once a day.
The Benefits of Product Information Management, by Andy Hayler, CEO of “The Information Difference”
A recent survey by The Information Difference of well over 100 large organisations found that, on average, they had nine separate systems providing competing sources of product data (13% of respondents had over 100 sources). As can be imagined, that diversity of product data creates headaches for anyone trying to measure business performance, e.g. “what are our most profitable products?” is an easy question to ask but a tough one to answer if no one can agree what a product is, or into which category it is placed.
It also presents operational problems: if you are a retailer who has high street stores, a print catalogue operation, and also an eCommerce web site, then how are you to ensure a consistent process for onboarding and updating product information if different parts of the business have different systems and definitions? Customers that see a special offer online will expect that offer to be available in a store or vice versa, and will not be happy if it is not. There are further issues with eCommerce compared to a retail store: in a store customers can touch and see a product, so online they need more detailed information in order to have the confidence to purchase, such as detailed images of the product and its specifications.
Various phases of application consolidation, including ERP, have failed to improve this situation. Master data management has evolved as a discipline and technology to provide dedicated hubs of high quality data in an enterprise that can serve other systems as needed. It may be impractical to switch off all those legacy systems, but you can put in place a new hub for your product data where the data is trustworthy. This can then be linked directly back to other systems, either in batch or in real time via a web service, so that new product data, when updated in the master data hub, can be immediately used in other systems such as an inventory system or eCommerce web site.
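The hub-and-linked-systems pattern described above can be sketched in Python as follows. This is a toy model under stated assumptions: `ProductHub`, its `subscribe` and `upsert` methods, and the in-memory `inventory` and `web_shop` dictionaries are hypothetical stand-ins for what would in practice be web service calls or batch jobs.

```python
from typing import Callable, Dict, List

Subscriber = Callable[[str, dict], None]


class ProductHub:
    """Toy master data hub: one trusted record per product, pushed on change."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        """Register a downstream system to be told about updates."""
        self._subscribers.append(callback)

    def upsert(self, product_id: str, attributes: dict) -> None:
        # Merge the new attributes into the existing master record.
        record = {**self._records.get(product_id, {}), **attributes}
        self._records[product_id] = record
        # Notify every linked system immediately (the "real time" path).
        for notify in self._subscribers:
            notify(product_id, dict(record))


# Two downstream systems, modeled here as plain dictionaries.
inventory: Dict[str, dict] = {}
web_shop: Dict[str, dict] = {}

hub = ProductHub()
hub.subscribe(lambda pid, rec: inventory.__setitem__(pid, rec))
hub.subscribe(lambda pid, rec: web_shop.__setitem__(pid, rec))

hub.upsert("SKU-1", {"name": "Camera", "price": 199.0})
hub.upsert("SKU-1", {"price": 179.0})  # a later price change
```

After the second `upsert`, both downstream copies hold the same merged record, which is the consistency the hub is meant to provide.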
There are various approaches to master data management: some technologies are designed to deal with all kinds of different master data (customer, product, asset, location etc.) while others specialise in a particular data domain, such as product or customer. There are reasons why specialising can make sense. Product data is much more complex than customer name and address data, with materials master files often appearing in unstructured files. Such data needs to be parsed and structured and then validated, requiring different approaches to those used to handle address data. Moreover the classification of products can be complex, with something like a camera having a large number of components and options, so systems to handle product data must be strong at handling complex classification hierarchies.
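As a rough illustration of the parse, structure and validate steps for product data, the Python sketch below turns a free-text product line into structured attributes. The text layout, the `PATTERN` expression, the `KNOWN_MATERIALS` list and the sample lines are all invented for the example; real materials master files are far messier.

```python
import re
from typing import Optional

# Assumed free-text layout: "<name> <diameter>x<length>mm <material>"
PATTERN = re.compile(
    r"^(?P<name>[A-Za-z ]+?)\s+(?P<diameter>\d+)x(?P<length>\d+)mm\s+(?P<material>\w+)$"
)
KNOWN_MATERIALS = {"steel", "brass", "zinc"}


def parse_product(line: str) -> Optional[dict]:
    """Return structured attributes, or None if parsing or validation fails."""
    match = PATTERN.match(line.strip())
    if match is None:
        return None  # the line does not follow the assumed layout
    material = match["material"].lower()
    if material not in KNOWN_MATERIALS:
        return None  # parsed, but rejected by validation
    return {
        "name": match["name"].strip(),
        "diameter_mm": int(match["diameter"]),
        "length_mm": int(match["length"]),
        "material": material,
    }


good = parse_product("Hex bolt 8x40mm Steel")
bad = parse_product("Hex bolt 8x40mm Unobtainium")
```

Separating the parsing pattern from the validation list keeps each rule easy to change, which matters when every supplier formats product descriptions differently.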
One example of a company confronting this issue is Kramp, Europe’s largest wholesaler of spare parts for motorized equipment. With 2,000 suppliers, they used to take weeks to transfer new product data from suppliers into their internal systems and their eCommerce hub. By implementing a product data hub they were able to radically streamline this process, allowing suppliers to interact directly with the product data hub, and for this data to be consistently updated in the systems that need it, without the need for time-consuming interactions with the suppliers to discuss particular data formats. This has led to higher margins due to being able to take advantage of “Long Tail” niche items, lower process costs and quicker reaction time, important in new markets.
Improved multichannel processes, such as in this example, are why more and more companies are evaluating master data management solutions in order to finally tackle the issue of inconsistent product data. Such improvements bring businesses real, quantified benefits, which is why master data management is arguably the fastest growing enterprise software segment right now.