Category Archives: Cloud Data Integration

Cloud Data Integration

Banks and the Art of the Possible: Disruptors are Re-shaping Banking

Banks and the Art of the Possible

Banks and the Art of the Possible: Disruptors are Re-shaping Banking

The problem many banks encounter today is that they have vast sums of investment tied up in old ways of doing things. Historically, customers chose a bank and remained ’loyal’ throughout their lifetime…now competition is rife and loyalty is becoming a thing of a past. In order to stay ahead of the competition, gain and keep customers, they need to understand the ever-evolving market, disrupt norms and continue to delight customers. The tradition of staying with one bank due to family convention or from ease has now been replaced with a more informed customer who understands the variety of choice at their fingertips.

Challenger Banks don’t build on ideas of tradition and legacy and see how they can make adjustments to them. They embrace change. Longer-established banks can’t afford to do nothing, and assume their size and stature will attract customers.

Here’s some useful information

Accenture’s recent report, The Bank of Things, succinctly explains what ‘Customer 3.0’ is all about. The connected customer isn’t necessarily younger. It’s everybody. Banks can get to know their customers better by making better use of information. It all depends on using intelligent data rather than all data. Interrogating the wrong data can be time-consuming, costly and results in little actionable information.

When an organisation sets out with the intention of knowing its customers, then it can calibrate its data according with where the gold nuggets – the real business insights – come from. What do people do most? Where do they go most? Now that they’re using branches and phone banking less and less – what do they look for in a mobile app?

Customer 3.0 wants to know what the bank can offer them all-the-time, on the move, on their own device. They want offers designed for their lifestyle. Correctly deciphered data can drive the level of customer segmentation that empowers such marketing initiatives. This means an organisation has to have the ability and the agility to move with its customers. It’s a journey that never ends -technology will never have a cut-off point just like customer expectations will never stop evolving.

It’s time for banks to re-shape banking

Informatica have been working with major retail banks globally to redefine banking excellence and realign operations to deliver it. We always start by asking our customers the revealing question “Have you looked at the art of the possible to future-proof your business over the next five to ten years and beyond?” This is where the discussion begins to explore really interesting notions about unlocking potential. No bank can afford to ignore them.

Share
Posted in B2B Data Exchange, Banking & Capital Markets, Business Impact / Benefits, Business/IT Collaboration, Cloud Data Integration, Data Services, Financial Services | Tagged , , , , | Leave a comment

Gamers Need Great Data and Connected Platforms

Great Data and Connected Platforms

Gamers Need Great Data and Connected Platforms

Who remembers their first game of Pong? Celebrating more than 40 years of innovation, gaming is no longer limited to monochromatic screens and dedicated, proprietary platforms. The PC gaming industry is expected to exceed $35bn by 2018. Phone and handheld games is estimated at $34bn in 5 years and quickly closing the gap. According to EEDAR, 2014 recorded more than 141 million mobile gamers just in North America, generating $4.6B in revenue for mobile game vendors.

This growth has spawned a growing list of conferences specifically targeting gamers, game developers, the gaming industry and more recently gaming analytics! This past weekend in Boston, for example, was PAX East where people of all ages and walks of life played games on consoles, PC, handhelds, and good old fashioned board games. With my own children in attendance, the debate of commercial games versus indie favorites, such as Minecraft , dominates the dinner table.

Online games are where people congregate online, collaborate, and generate petabytes of data daily. With the added bonus of geospatial data from smart phones, the opportunity for more advanced analytics. Some of the basic metrics that determine whether a game is successful, according to Ninja Metrics, include:

  • New Users, Daily Active Users, Retention
  • Revenue per user
  • Session length and number of sessions per user

Additionally, they provide predictive analytics, customer lifetime value, and cohort analysis. If this is your gig, there’s a conference for that as well – the Gaming Analytics Summit !

At the Game Developers Conference recently held in San Francisco, the focus of this event has shifted over the years from computer games to new gaming platforms that need to incorporate mobile, smartphone, and online components. In order to produce a successful game, it requires the following:

  • Needs to be able to connect to a variety of devices and platforms
  • Needs to use data to drive decisions and improve user experience
  • Needs to ensure privacy laws are adhered to.

Developers are able to quickly access online gaming data and tweak or change their sprites’ attributes dynamically to maximize player experience.

When you look at what is happening in the gaming industry, you can start to see why colleges and universities like my own alma mater, WPI, now offers a computer science degree in Interactive Media and Game Design degree . The IMGD curriculum includes heavy coursework in data science, game theory, artificial intelligence and story boarding. When I asked a WPI IMGD student about what they are working on, they are mapping out decision trees that dictate what adversary to pop up based on the player’s history (sounds a lot like what we do in digital marketing…).

As we start to look at the Millennial Generation entering into the workforce, maybe we should look at our own recruiting efforts and consider game designers. They are masters in analytics and creativity with an appreciation for the importance of great data. Combining the magic and the math makes a great gaming experience. Who wouldn’t want that for their customers?

Share
Posted in B2B, Business Impact / Benefits, Cloud, Cloud Data Integration, DaaS, Data Integration Platform, Data Services | Tagged , , | Leave a comment

Connected Data for SaaS

Connected_Data

Connected Data for SaaS

Informatica, over the last two years, successfully transformed from running 80% of its application portfolio on premises to 80% in the cloud. Success was based on two key criteria:

  • Ensuring the SaaS-based processes are integrated with no disruption
  • Data in the cloud continues to be available and accessible for analytics

With industry analysts predicting that the majority of new application deployments will be SaaS-based by 2017, the requirement of having connected data should not be negotiable. It is a must have. Most SaaS applications ensure businesses are able to keep processes integrated using connected and shared data through application programming interfaces (APIs).

If you are a consumer of SaaS applications, you probably know the importance of having clean, connected and secure data from the cloud. The promise of SaaS is improved agility. When data is not easily accessible, that promise is broken. With the plethora of options available in the SaaS ecosystem and marketplace, not having clean, connected and safe data is a compelling event for switching SaaS vendors.

If you are in the SaaS application development industry, you probably know that building these APIs and connectors is a critical requirement for success. However, how do you decide which applications you should build connectors for when the ecosystem keeps changing? Investment in developing connectors and interfaces consumes resources and competes with developing competitive and differentiating features.

This week, Informatica launched its inaugural DataMania event in San Francisco where the leading topic was SaaS application and data integration. Speakers from AWS, Adobe, App Dynamics, Dun & Bradstreet, and Marketo – to name a few – contributed to the discussion and confirmed that we entering into the era of the Data Ready Enterprise. Also during the event, Informatica announced the Connect-a-thon, a hackathon-like event, where SaaS vendors can get connected to hundreds of cloud and on-premises apps.

Without a doubt, transitioning to a cloud and SaaS-based application architecture can only be successful if the applications are easily connectable with shared data. Here at Informatica, this was absolutely the case. Whether you are in the business or a consumer of SaaS applications, consider the benefits of using a standard library of connectors, such as what Informatica Cloud offers so you can focus your time and energy on innovation and more strategic parts of your business.

Share
Posted in Cloud, Cloud Application Integration, Cloud Data Integration | Tagged , , , | Leave a comment

Informatica joins new ServiceMax Marketplace – offers rapid, cost effective integration with ERP and Cloud apps for Field Service Automation

ERP

Informatica Partners with ServiceMax

To deliver flawless field service, companies often require integration across multiple applications for various work processes.  A good example is automatically ordering and shipping parts through an ERP system to arrive ahead of a timely field service visit.  Informatica has partnered with ServiceMax, the leading field service automation solution, and subsequently joined the new ServiceMax Marketplace to offer customers integration solutions for many ERP and Cloud applications frequently involved in ServiceMax deployments.  Comprised of Cloud Integration Templates built on Informatica Cloud for frequent customer integration “patterns”, these solutions will speed and cost contain the ServiceMax implementation cycle and help customers realize the full potential of their field service initiatives.

Existing members of the ServiceMax Community can see a demo or take advantage of a free 30-day trial that provides full capabilities of Informatica Cloud Integration for ServiceMax with prebuilt connectors to hundreds of 3rd party systems including SAP, Oracle, Salesforce, Netsuite and Workday, powered by the Informatica Vibe virtual data machine for near-universal access to cloud and on-premise data.  The Informatica Cloud Integration for Servicemax solution:

  • Accelerates ERP integration through prebuilt Cloud templates focused on key work processes and the objects on common between systems as much as 85%
  • Synchronizes key master data such as Customer Master, Material Master, Sales Orders, Plant information, Stock history and others
  • Enables simplified implementation and customization through easy to use user interfaces
  • Eliminates the need for IT intervention during configuration and deployment of ServiceMax integrations.

We look forward to working with ServiceMax through the ServiceMax Marketplace to help joint customers deliver Flawless Service!

Share
Posted in 5 Sales Plays, Business Impact / Benefits, Cloud, Cloud Application Integration, Cloud Data Integration, Cloud Data Management, Data Integration, Data Integration Platform, Data Migration, Data Synchronization, Operational Efficiency, Professional Services, SaaS | Tagged , , , | Leave a comment

Data Mania- Using REST APIs and Native Connectors to Separate Yourself from the SaaS Pack

Data Mania

Data Mania: An Event for SaaS & ISV Leaders

With Informatica’s Data Mania on Wednesday, I’ve been thinking a lot lately about REST APIs. In particular, I’ve been considering how and why they’ve become so ubiquitous, especially for SaaS companies. Today they are the prerequisite for any company looking to connect with other ecosystems, accelerate adoption and, ultimately, separate themselves from the pack.

Let’s unpack why.

To trace the rise of the REST API, we’ll first need to take a look at the SOAP web services protocol that preceded it.  SOAP is still very much in play and remains important to many application integration scenarios. But it doesn’t receive much use or love from the thousands of SaaS applications that just want to get or place data with one another or in one of the large SaaS ecosystems like Salesforce.

Why this is the case has more to do with needs and demands of a SaaS business than it does with the capabilities of SOAP web services. SOAP, as it turns out, is perfectly fine for making and receiving web service calls, but it does require work on behalf of both the calling application and the producing application. And therein lies the rub.

SOAP web service calls are by their very nature incredibly structured arrangements, with specifications that must be clearly defined by both parties. Only after both the calling and producing application have their frameworks in place can the call be validated. While the contract within SOAP WSDLs makes SOAP more robust, it also makes it too rigid, and less adaptable to change. But today’s apps need a more agile and more loosely defined API framework that requires less work to consume and can adapt to the inevitable and frequent changes demanded by cloud applications.

Enter REST APIs

REST APIs are the perfect vehicle for today’s SaaS businesses and mash-up applications. Sure, they’re more loosely defined than SOAP, but when all you want to do is get and receive some data, now, in the context you need, nothing is easier or better for the job than a REST API.

With a REST API, the calls are mostly done as HTTP with some loose structure and don’t require a lot of mechanics from the calling application, or effort on behalf of the producing application.

SaaS businesses prefer REST APIs because they are easy to consume. They also make it easy to onboard new customers and extend the use of the platform to other applications. The latter is important because it is primarily through integration that SaaS applications get to become part of an enterprise business process and gain the stickiness needed to accelerate adoption and growth.

Without APIs of any sort, integration can only be done through manual data movement, which opens the application and enterprise up to the potential errors caused by fat-finger data movement. That typically will give you the opposite result of stickiness, and is to be avoided at all costs.

While publishing an API as a way to get and receive data from other applications is a great start, it is just a means to an end. If you’re a SaaS business with greater ambitions, you may want to consider taking the next step of building native connectors to other apps using an integration system such as Informatica Cloud. A connector can provide a nice layer of abstraction on the APIs so that the data can be accessed as application data objects within business processes. Clearly, stickiness with any SaaS application improves in direct proportion to the number of business processes or other applications that it is integrated with.

The Informatica Cloud Connector SDK is Java-based and enables you easily to cut and paste the code necessary to create the connectors. Informatica Cloud’s SDKs are also richer and make it possible for you to adapt the REST API to something any business user will want to use – which is a huge advantage.

In addition to making your app stickier, native connectors have the added benefit of increasing your portability. Without this layer of abstraction, direct interaction with a REST API that’s been structurally changed would be impossible without also changing the data flows that depend on it. Building a native connector makes you more agile, and inoculates your custom built integration from breaking.

Building your connectors with Informatica Cloud also provides you with some other advantages. One of the most important is entrance to a community that includes all of the major cloud ecosystems and the thousands of business apps that orbit them. As a participant, you’ll become part of an interconnected web of applications that make up the business processes for the enterprises that use them.

Another ancillary benefit is access to integration templates that you can easily customize to connect with any number of known applications. The templates abstract the complexity from complicated integrations, can be quickly customized with just a few composition screens, and are easily invoked using Informatica Cloud’s APIs.

The best part of all this is that you can use Informatica Cloud’s integration technology to become a part of any business process without stepping outside of your application.

For those interested in continuing the conversation and learning more about how leading SaaS businesses are using REST API’s and native connectors to separate themselves, I invite you to join me at Data Mania, March 4th in San Francisco. Hope to see you there.

Share
Posted in B2B, B2B Data Exchange, Cloud, Cloud Application Integration, Cloud Data Integration, Data Integration, Data Services, SaaS | Tagged , , , , , , , | Leave a comment

Informatica Supports New Custom ODBC/JDBC Drivers for Amazon Redshift

Informatica’s Redshift connector is a state-of-the-art Bulk-Load type connector which allows users to perform all CRUD operations on Amazon Redshift. It makes use of AWS best practices to load data at high throughput in a safe and secure manner and is available on Informatica Cloud and PowerCenter.

Today we are excited to announce the support of Amazon’s newly launched custom JDBC and ODBC drivers for Redshift. Both the drivers are certified for Linux and Windows environments.

Informatica’s Redshift connector will package the JDBC 4.1 driver which further enhances our meta-data fetch capabilities for tables and views in Redshift. That improves our overall design-time responsiveness by over 25%. It also allows us to query multiple tables/views and retrieve the result-set using primary and foreign key relationships.

Amazon’s ODBC driver enhances our FULL Push Down Optimization capabilities on Redshift. Some of the key differentiating factors are support for the SYSDATE variable, functions such as ADD_TO_DATE(), ASCII(), CONCAT(), LENGTH(), TO_DATE(), VARIANCE() etc. which weren’t possible before.

Amazon’s ODBC driver is not pre-packaged but can be directly downloaded from Amazon’s S3 store.

Once installed, the user can change the default ODBC System DSN in ODBC Data Source Administrator.

Redshift

To learn more, sign up for the free trial of Informatica’s Redshift connector for Informatica Cloud or PowerCenter.

Share
Posted in Business Impact / Benefits, Business/IT Collaboration, Cloud, Cloud Application Integration, Cloud Computing, Cloud Data Integration, Cloud Data Management | Tagged , , , | Leave a comment

Strata 2015 – Making Data Work for Everyone with Cloud Integration, Cloud Data Management and Cloud Machine Learning

Making Data Work for Everyone with Cloud Integration, Cloud Data Management and Cloud Machine Learning

Making Data Work for Everyone with Cloud

Are you ready to answer “Yes” to the questions:

a) “Are you Cloud Ready?”
b) “Are you Machine Learning Ready?”

I meet with hundreds of Informatica Cloud customers and prospects every year. While they are investing in Cloud, and seeing the benefits, they also know that there is more innovation out there. They’re asking me, what’s next for Cloud? And specifically, what’s next for Informatica in regards to Cloud Data Integration and Cloud Data Management? I’ll share more about my response throughout this blog post.

The spotlight will be on Big Data and Cloud at the Strata + Hadoop World conference taking place in Silicon Valley from February 17-20 with the theme  “Make Data Work”. I want to focus this blog post on two topics related to making data work and business insights:

  • How existing cloud technologies, innovations and partnerships can help you get ready for the new era in cloud analytics.
  • How you can make data work in new and advanced ways for every user in your company.

Today, Informatica is announcing the availability of its Cloud Integration Secure Agent on Microsoft Azure and Linux Virtual Machines as well as an Informatica Cloud Connector for Microsoft Azure Storage. Users of Azure data services such as Azure HDInsight, Azure Machine Learning and Azure Data Factory can make their data work with access to the broadest set of data sources including on-premises applications, databases, cloud applications and social data. Read more from Microsoft about their news at Strata, including their relationship with Informatica, here.

“Informatica, a leader in data integration, provides a key solution with its Cloud Integration Secure Agent on Azure,” said Joseph Sirosh, Corporate Vice President, Machine Learning, Microsoft. “Today’s companies are looking to gain a competitive advantage by deriving key business insights from their largest and most complex data sets. With this collaboration, Microsoft Azure and Informatica Cloud provide a comprehensive portfolio of data services that deliver a broad set of advanced cloud analytics use cases for businesses in every industry.”

Even more exciting is how quickly any user can deploy a broad spectrum of data services for cloud analytics projects. The fully-managed cloud service for building predictive analytics solutions from Azure and the wizard-based, self-service cloud integration and data management user experience of Informatica Cloud helps overcome the challenges most users have in making their data work effectively and efficiently for analytics use cases.

The new solution enables companies to bring in data from multiple sources for use in Azure data services including Azure HDInsight, Azure Machine Learning, Azure Data Factory and others – for advanced analytics.

The broad availability of Azure data services, and Azure Machine Learning in particular, is a game changer for startups and large enterprises. Startups can now access cloud-based advanced analytics with minimal cost and complexity and large businesses can use scalable cloud analytics and machine learning models to generate faster and more accurate insights from their Big Data sources.

Success in using machine learning requires not only great analytics models, but also an end-to-end cloud integration and data management capability that brings in a wide breadth of data sources, ensures that data quality and data views match the requirements for machine learning modeling, and an ease of use that facilitates speed of iteration while providing high-performance and scalable data processing.

For example, the Informatica Cloud solution on Azure is designed to deliver on these critical requirements in a complementary approach and support advanced analytics and machine learning use cases that provide customers with key business insights from their largest and most complex data sets.

Using the Informatica Cloud solution on Azure connector with Informatica Cloud Data Integration enables optimized read-write capabilities for data to blobs in Azure Storage. Customers can use Azure Storage objects as sources, lookups, and targets in data synchronization tasks and advanced mapping configuration tasks for efficient data management using Informatica’s industry leading cloud integration solution.

As Informatica fulfills the promise of “making great data ready to use” to our 5,500 customers globally, we continue to form strategic partnerships and develop next-generation solutions to stay one step ahead of the market with our Cloud offerings.

My goal in 2015 is to help each of our customers say that they are Cloud Ready! And collaborating with solutions such as Azure ensures that our joint customers are also Machine Learning Ready!

To learn more, try our free Informatica Cloud trial for Microsoft Azure data services.

Share
Posted in B2B, Business Impact / Benefits, Business/IT Collaboration, Cloud, Cloud Application Integration, Cloud Computing, Cloud Data Integration, Data Services | Tagged , , , , , , , , , , , , , , , , | 2 Comments

Payers – What They Are Good At, And What They Need Help With

healthcare_bigdata

Payers – What They Are Good At, And What They Need Help With

In our house when we paint a room, my husband does the big rolling of the walls or ceiling, I do the cut-in work. I am good at prepping the room, taping all the trim and deliberately painting the corners. However, I am thrifty and constantly concerned that we won’t have enough paint to finish a room. My husband isn’t afraid to use enough paint and is extremely efficient at painting a wall in a single even coat. As a result, I don’t do the big rolling and he doesn’t do the cutting in. It took us awhile to figure this out, and a few rooms had to be repainted while we were figuring it out.  Now we know what we are good at, and what we need help with.

Payers roles are changing. Payers were previously focused on risk assessment, setting and collecting premiums, analyzing claims and making payments – all while optimizing revenues. Payers are pretty good at selling to employers, figuring out the cost/benefit ratio from an employers perspective and ensuring a good, profitable product. With the advent of the Affordable Healthcare Act along with a much more transient insured population, payers now must focus more on the individual insured and be able to communicate with the individuals in a more nimble manner than in the past.

Individual members will shop for insurance based on consumer feedback and price. They are interested in ease of enrollment and the ability to submit and substantiate claims quickly and intuitively. Payers are discovering that they need to help manage population health at a individual member level. And population health management requires less of a business-data analytics approach and more social media and gaming-style logic to understand patients. In this way, payers can help develop interventions to sustain behavioral changes for better health.

When designing such analytics, payers should consider the following key design steps:

Due to payers’ mature predictive analytics competencies, they will have a much easier time in the next generation of population behavior compared to their provider counterparts. As clinical content is often unstructured compared to the claims data, payers need to pay extra attention to context and semantics when deciphering clinical content submitted by providers. Payers can use help from vendors that can help them understand unstructured data, individual members. They can then use that data to create fantastic predictive analytic solutions.

Share
Posted in 5 Sales Plays, Application Retirement, Big Data, Cloud Application Integration, Cloud Data Integration, Customer Acquisition & Retention, Customer Services, Data Governance, Data Quality, Governance, Risk and Compliance, Healthcare, Total Customer Relationship | Tagged | Leave a comment

What’s All the Mania Around SaaS Data?

SaaSIt’s no secret that the explosion of software-as-a-service (SaaS) apps has revolutionized the way businesses operate. From humble beginnings, the titans of SaaS today include companies such as Salesforce.com, NetSuite, Marketo, and Workday that have gone public and attained multi-billion dollar valuations.  The success of these SaaS leaders has had a domino effect in adjacent areas of the cloud – infrastructure, databases, and analytics.

Amazon Web Services (AWS), which originally had only six services in 2006 with the launch of Amazon EC2, now has over 30 ranging from storage, relational databases, data warehousing, Big Data, and more. Salesforce.com’s Wave platform, Tableau Software, and Qlik have made great advances in the cloud analytics arena, to give better visibility to line-of-business users. And as SaaS applications embrace new software design paradigms that extend their functionality, application performance monitoring (APM) analytics has emerged as a specialized field from vendors such as New Relic and AppDynamics.

So, how exactly did the growth of SaaS contribute to these adjacent sectors taking off?

The growth of SaaS coincided with the growth of powerful smartphones and tablets. Seeing this form factor as important to the end user, SaaS companies rushed to produce mobile apps that offered core functionality on their mobile device. Measuring adoption of these mobile apps was necessary to ensure that future releases met all the needs of the end user. Mobile apps contain a ton of information such as app responsiveness, features utilized, and data consumed. As always, there were several types of users, with some preferring a laptop form factor over a smartphone or tablet. With the ever increasing number of data points to measure within a SaaS app, the area of application performance monitoring analytics really took off.

Simultaneously, the growth of the SaaS titans cemented their reputation as not just applications for a certain line-of-business, but into full-fledged platforms. This growth emboldened a number of SaaS startups to develop apps that solved specialized or even vertical business problems in healthcare, warranty-and-repair, quote-to-cash, and banking. To get started quickly and scale rapidly, these startups leveraged AWS and its plethora of services.

The final sector that has taken off thanks to the growth of SaaS is the area of cloud analytics. SaaS grew by leaps and bounds because of its ease of use, and rapid deployment that could be achieved by business users. Cloud analytics aims to provide the same ease of use for business users when providing deep insights into data in an interactive manner.

In all these different sectors, what’s common is the fact that SaaS growth has created an uptick in the volume of data and the technologies that serve to make it easier to understand. During Informatica’s Data Mania event (March 4th, San Francisco) you’ll find several esteemed executives from Salesforce, Amazon, Adobe, Microsoft, Dun & Bradstreet, Qlik, Marketo, and AppDynamics talk about the importance of data in the world of SaaS.

Share
Posted in Business Impact / Benefits, Business/IT Collaboration, Cloud, Cloud Application Integration, Cloud Computing, Cloud Data Integration, Cloud Data Management, SaaS | Tagged , , , , | Leave a comment

Data Streams, Data Lakes, Data Reservoirs, and Other Large Data Bodies

data lake

Data Lake is a catchment area for data entering the organization

A Data Lake is a simple concept. They are a catchment area for data entering the organization. In the past, most businesses didn’t need to organize such a data store because almost all data was internal. It traveled via traditional ETL mechanisms from transactional systems to a data warehouse and then was sprayed around the business, as required.

When a good deal of data comes from external sources, or even from internal sources like log files, which never previously made it into the data warehouse, there is a need for an “operational data store.” This has definitely become the premier application for Hadoop and it makes perfect sense to me that such technology be used for a data catchment area. The neat thing about Hadoop for this application is that:

  1. It scales out “as far as the eye can see,” so there’s no likelihood of it being unable to manage the data volumes even when they grow beyond the petabyte level.
  2. It is a key-value store, which means that you don’t need to expend much effort in modeling data when you decide to accommodate a new data source. You just define a key and define the metadata at leisure.
  3. The cost of the software and the storage is very low.

So let’s imagine that we have a need for a data catchment area, because we have decided to collect data from log-files, mobile devices, social networks, from public data sources, or whatever. So let us also imagine that we have implemented Hadoop and some of its useful components and we have begun to collect data.

Is it reasonable to describe this as a data lake?

A Hadoop implementation should not be a set of servers randomly placed at the confluence of various data flows. The placement needs to be carefully considered and if the implementation is to resemble a “data lake” in any way, then it must be a well-engineered man-made lake. Since the data doesn’t just sit there until it evaporates but eventually flows to various applications, we should think of this as a “data reservoir” rather than a “data lake.”

There is no point in arranging all that data neatly along the aisles because when we get it, we may not know what we want to do with it at the time we get it. We should organize the data when we know that.

Another reason we should think of this as more like a reservoir than a lake is that we might like to purify the data a little before sending it down the pipes to applications or users that want to use it.

Twitter @bigdatabeat

Share
Posted in Architects, Big Data, CIO, Cloud Data Integration, Cloud Data Management, DaaS, Hadoop, IaaS | Tagged , , , , , | Leave a comment