In enterprises across the globe, from data centers to executive suites, everyone is asking the same questions: What is Hadoop, and how can it help us with our Big Data challenges?
The groundswell of interest in Hadoop – an open-source software framework that enables applications to run across large arrays of nodes, accessing petabytes’ worth of data – was discussed by James Kobielus, Forrester’s Big Data and Hadoop expert, at the opening session of the Hadoop Tuesday Webinar series, sponsored by Informatica and Cloudera. (Replay available here.) I had the opportunity to join Jim, along with Julianna DeLua, Enterprise Solution Evangelist for Big Data at Informatica, for a discussion of Hadoop’s growth across the business world.
“Hadoop is in heavy evaluation pretty much everywhere, and that’s only a slight exaggeration,” Jim pointed out. “Hadoop is seen widely now as the next generation of big data processing and storage.”
Hadoop is now at the heart of many of Forrester’s customer inquiries, “both from users and solution providers,” he added. “They want to take this technology, this new approach, and they want to be able to integrate it more tightly in their operations if they’re users. And into their product portfolios if they’re a solution provider.”
Solution providers are also seeing a great many inquiries about Hadoop from enterprise customers – not only from the technical ranks, but from the executive suite as well, Julianna added. “There’s tremendous interest, but also market confusion,” she said. “Our customers have invested a tremendous amount of money and resources into the existing IT infrastructure. The question is, what does Hadoop do – is this a replacement technology, or is this augmenting our technology?” The answer is that Hadoop is paving the way to analytical capabilities that were not previously available, she continued. “Tasks that used to take weeks come down to days. With an ability to store and analyze huge amounts of data, the era of sampling is coming to the end. For certain applications such as log analysis, even for network and application-level logs, we’re going from a very limited, average-oriented approach into an all-data type of approach.”
Areas where Hadoop is already providing value include CRM, content management, and sentiment analysis. It is gaining traction among “those that are the C-level sponsors who need to be able to analyze petabytes’ worth of information streaming in all the time,” Jim said. Log analysis is a particularly strong area as well – perhaps one of the “early killer apps for Hadoop,” he added. “CTOs are looking for the ability to process petabytes’ worth of log data, in real time. They need to do root cause analysis of problems across complex networks.”
Forrester’s latest survey research shows about 37% of companies have Hadoop projects underway within their enterprises. There are new types of applications unfolding every day. “We’re also seeing Hadoop in a broad range of other areas, such as doing content ETL and digital media,” Jim said. “Online publishers need to be able to render content, transform it in real time and deliver downstream to a broad range of consumers. The range of Hadoop applications continues to grow, and the range of business solutions built on Hadoop continues to grow.”
In the second Hadoop Tuesday Webcast (October 11th), John Akred of Accenture will be delving into the architectural aspects of Hadoop, as well as its role in enabling Data as a Platform.
Future guests for Hadoop Tuesdays include Matt Aslett of The 451 Group (October 18), David Menninger of Ventana Research (October 25), Omer Trajman of Cloudera (November 15), David Linthicum of Blue Mountain Labs (November 29), and Charles Zedlewski of Cloudera and Wei Zheng of Informatica (December 13). Executives from companies that have already implemented Hadoop within their data operations will also be joining us.