Tag Archives: cloud
With Informatica Cloud, we’ve long tracked the growth of the various cloud apps and their adoption in the enterprise. Common business patterns – such as opportunity-to-order, employee onboarding, data migration and business intelligence – that once took place solely on-premises are now being conducted both in the cloud and on-premises.
The fact is that we are well on our way to a world where our business needs are best met by a mix of on-premises and cloud applications. Regardless of what we do or make, we can no longer get away with just on-premises applications – or at least not for long. As we become more reliant on cloud services, such as those offered by Oracle, Salesforce, SAP, NetSuite, Workday, we are embracing the reality of a new hybrid world, and the imperative for simpler integration it demands.
So, as the ground shifts beneath us, moving us toward the hybrid world, we, as business and IT users, are left standing with a choice: Continue to seek solutions in our existing on-premises integration stacks, or go beyond, to find them with the newer and simpler cloud solution. Let us briefly look at five business patterns we’ve been tracking.
One of the first things we’ve noticed with the hybrid environment is the incredible frequency with which data is moved back and forth between the on-premises and cloud environments. We call this the data integration pattern, and it is best represented by getting data, such as a price list or inventory from Oracle E-Business, into a cloud app so that the actual user of the cloud app can view the most up-to-date information. Here the data (usually master data) is copied to serve a certain purpose. Data integration also typically requires the data to be transformed before it can be inserted or updated. Understanding the metadata and data models of the applications involved is key to doing this effectively and repeatedly.
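To make the transform-then-insert-or-update idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the source column names (`ITEM_CODE`, `LIST_PRICE`, `CURRENCY`), the `PriceRecord` shape, and the in-memory `target` dictionary are all hypothetical stand-ins, not the actual Oracle E-Business schema or any Informatica API.

```python
from dataclasses import dataclass


@dataclass
class PriceRecord:
    sku: str
    price_cents: int
    currency: str


def transform(row: dict) -> PriceRecord:
    """Normalize one raw source row into the shape the target expects."""
    # Column names below are hypothetical, chosen only for illustration.
    return PriceRecord(
        sku=row["ITEM_CODE"].strip().upper(),
        price_cents=round(float(row["LIST_PRICE"]) * 100),
        currency=row.get("CURRENCY", "USD"),
    )


def upsert(target: dict, records: list) -> tuple:
    """Insert new records and update changed ones; return (inserted, updated)."""
    inserted = updated = 0
    for rec in records:
        existing = target.get(rec.sku)
        if existing is None:
            inserted += 1
        elif existing != rec:
            updated += 1
        else:
            continue  # unchanged record, nothing to write
        target[rec.sku] = rec
    return inserted, updated
```

Calling `upsert(target, [transform(r) for r in source_rows])` on each run copies only new or changed master data into the cloud-side store, which is the essence of the pattern described above.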
The second is the application integration pattern, or the real time transaction flow between your on-premises and cloud environment, where you have business processes and services that need to communicate with one another. Here, the data needs to be referenced in real time for a knowledge worker to take action.
The third, data warehousing in the cloud, is an emerging pattern that is gaining importance for both mid- and large-size companies. In this pattern, businesses are moving massive amounts of data in bulk from both on-premises and cloud sources into a cloud data warehouse, such as Amazon Redshift, for BI analysis.
The fourth, the Internet of Things (IoT) pattern, is also emerging and is becoming more important, especially as new technologies and products, such as Nest, enable us to push streaming data (sensor data, web logs, etc.) and combine it with other cloud and on-premises data sources into a cloud data store. Often the data is unstructured, so it is critical for an integration platform to deal effectively with unstructured data.
The fifth and final pattern, API integration, is gaining prominence in the cloud. Here, an on-premise or cloud application exposes the data or service as an external API that can be consumed directly by applications or by a higher-level composite app in an orchestration.
While there are certainly different approaches to the challenges brought by Hybrid IT, cloud integration is often best-suited to solving them.
First, while the integration problems are more or less similar to the on-premise world, the patterns now overlap between cloud and on-premise. Second, integration responsibility is now picked up at the edge, closer to the users, whom we call “citizen integrators”. Third, time to market and agility demand that any integration platform you work with can live up to your expectations of speed. There are no longer multiyear integration initiatives in the era of the cloud. Finally, the same values that made cloud application adoption attractive (such as time-to-value, manageability, low operational overhead) also apply to cloud integration.
One of the most important forces driving cloud adoption is the need for companies to put more power into the hands of the business user. These users often need to access data in other systems and they are quite comfortable going through the motions of doing so without actually being aware that they are performing integration. We call this class of users ‘Citizen Integrators’. For example, if a user uploads an Excel file to Salesforce, it’s not something they would call “integration”. It is an out-of-the-box action that is integrated with their user experience, is simple to use from a tooling point of view, and is oftentimes native to the application they are working with.
Cloud Integration Convergence is driving many integration use cases. The most common integration – such as employee onboarding – can span multiple integration patterns. It involves data integration, application integration and often data warehousing for business intelligence. If we agree that doing this in the cloud makes sense, the question is whether you need three different integration stacks in the cloud, one for each integration pattern. And even if you have three different stacks, what if an integration flow involves the commingling of multiple patterns? What we are noticing is a single cloud integration platform addressing more and more of these use cases, while also providing tooling for both the Citizen Integrator and the experienced Integration Developer.
The bottom line is that in the new hybrid world we are seeing a convergence, where the industry is moving towards streamlined and lighter weight solutions that can handle multiple patterns with one platform.
The concept of Cloud Integration Convergence is an important one and we have built its imperatives into our products. With our cloud integration platform, we combine the ability to handle any integration pattern with an easy-to-use interface that empowers citizen integrators, and frees integration developers for more rigorous projects. And because we’re Informatica, we’ve designed it to work in tandem with PowerCenter, which means anything you’ve developed for PowerCenter can be leveraged for Informatica Cloud and vice versa thereby fulfilling Informatica’s promise of Map Once, Deploy Anywhere.
In closing, I invite you to visit us at the Informatica booth (#3512) in Moscone West at Oracle Open World. I’ll be there with some of my colleagues, and we would be happy to meet and talk with you about your experiences and challenges with the new Hybrid IT world.
When it comes to cloud-based data analytics, a recent study by Ventana Research (as found in Loraine Lawson’s recent blog post) provides a few interesting data points. The study reveals that 40 percent of respondents cited lowered costs as a top benefit, improved efficiency was a close second at 39 percent, and better communication and knowledge sharing also ranked highly at 34 percent.
Ventana Research also found that organizations cite a unique and more complex reason to avoid cloud analytics and BI. Legacy integration work can be a major hindrance, particularly when BI tools are already integrated with other applications. In other words, it’s the same old story:
The ability to deal with existing legacy systems when moving to concepts such as big data or cloud-based analytics is critical to the success of any enterprise data analytics strategy. However, most enterprises don’t focus on data integration as much as they should, and hope that they can solve the problems using ad-hoc approaches.
You can’t make sense of data that you can’t see.
These approaches rarely work as well as they should, if at all. Thus, any investment made in data analytics technology is often diminished because the BI tools or applications that leverage analytics can’t see all of the relevant data. As a result, only part of the story is told by the available data, those who leverage data analytics don’t rely on the information, and that means failure.
What’s frustrating to me about this issue is that the problem is easily solved. Those in the enterprise charged with standing up data analytics should put a plan in place to integrate new and legacy systems. As part of that plan, there should be a common understanding around business concepts/entities of a customer, sale, inventory, etc., and all of the data related to these concepts/entities must be visible to the data analytics engines and tools. This requires a data integration strategy, and technology.
As enterprises embark on a new day of more advanced and valuable data analytics technology, largely built upon the cloud and big data, the data integration strategy should be systemic. This means mapping a path for the data from the source legacy systems to the views that the data analytics systems should include. What’s more, this data should be in real operational time because data analytics loses value as the data becomes older and out-of-date. We operate in a real-time world now.
So, the work ahead requires planning to occur at both the conceptual and physical levels to define how data analytics will work for your enterprise. This includes what you need to see, when you need to see it, and then mapping a path for the data back to the business-critical and, typically, legacy systems. Data integration should be first and foremost when planning the strategy, technology, and deployments.
Earlier this month, CNBC.com published its first ever R&D All-Stars: CNBC RQ 50, ranking the top 50 public companies by return on research and development investment. Coming in the top ten, and the first pure software play, was Informatica, mentioned first among great software companies like Google, Amazon, and Salesforce. CNBC.com is referencing a companion article by David Spiegel – Boring stocks that generate R&D heat-and profits. The article made an excellent point: When R&D productivity links R&D spending to corporate revenue growth and market value, it is a better gauge of the productivity of that spending.
Unlike other R&D lists or rankings, the RQ50 was less concerned with pure dollars than what the company actually did with it. The RQ50 measures increase in revenue as it relates to increase in R&D expenditures. Its methodology was provided by Professor Anne Marie Knott, of Washington University in St. Louis, who tracks and studies corporate R&D investment, and has found that the companies that regularly turn R&D into income typically place innovation at the forefront of the corporate mission and have a structure and culture that support it.
Informatica is on the list because its revenue gains between 2006 and 2013 correlate directly with its increased R&D investment over the same period. While the list specifically cites the 2013 figures, the result is due to a systematic and long-term strategic initiative to place innovation at the core of our business plan.
Informatica has innovated broadly across its product spectrum. I can personally speak to one area where it has invested smartly and made significant gains – Informatica Cloud. Informatica decided to make its initial investment in the cloud in 2006 and was early in the market with regard to cloud integration. In fact, back in 2006, very few of today’s well-known SaaS companies were even publicly traded. The most popular SaaS app today, Salesforce.com, had revenues of just $309 million in FY2006 compared with over $4 billion in FY2014. Amazon EC2, one of the core services of Amazon Web Services (AWS), had only been announced in that year. Apart from EC2, Amazon only had six other services in 2006. In 2014, that number has ballooned to over 30.
In his article about the RQ50, Spiegel talks about how the companies on the list aren’t just listening to what customers want or need now. They’re also challenging themselves to come up with the things the market can use two or ten years into the future. In 2006, Informatica took the same approach with its initial investment in cloud integration.
For us, it started with an observation and then a commitment to the belief that we were at an inflection point with the cloud, and on the cusp of what was going to become a true megatrend that represented a huge opportunity for the integration industry. Informatica assembled a small, agile group made up of strong leaders with varying skills and experience pulled from different areas – sales, engineering, and product management – throughout the company. It also meant throwing away the traditional measures of success and identifying new and more appropriate metrics to benchmark our progress. And finally, it included partnering with like-minded companies like Salesforce and NetSuite initially, and later on with Amazon, and taking our core strength – on-premise data integration technology – and pivoting it into a new direction.
The result was the first iteration of the Informatica Cloud. It leveraged the fruit of our R&D investment – the Vibe Virtual Data Machine – to provide SaaS administrators and line of business IT with the ability to perform lightweight cloud integrations between their on-premise and cloud applications without the involvement of an integration developer. Subsequent work and innovation have continued along the same path, adding tools like drag-and-drop design interfaces and mapping wizards, with the end goal of giving line-of-business (LOB) IT, cloud application administrators and citizen integrators a single platform to perform all the integration patterns they require, on their timeline. Informatica Cloud has consistently delivered 2-3 releases every year, and is now already on Release 20. From originally starting out with Data Replication for Salesforce, the Cloud team added bigger and better functionality such as developing connectivity for over 100 applications and data protocols, opening up our integration services through REST APIs, going beyond integration by incorporating cloud master data management and cloud test data management capabilities, and most recently announcing optimized batch and real-time cloud integration under a single unified platform.
And it goes on to this day, with investments in new innovations and directions, like Informatica Project Springbok. With Project Springbok, we’re duplicating what we did with Informatica Cloud but this time for citizen integrators. We’re using our vast experience working with customers and building cutting-edge technology IP over the last 20 years to enable citizen integrators to harmonize data faster for better insights (and hopefully, fewer late nights writing spreadsheet formulas). What we do after Project Springbok is anyone’s guess, but wherever that is, it will be sure to put us on lists like the RQ 50 for some time to come.
Informatica Cloud Summer ’14 Release Breaks Down Barriers with Unified Data Integration and Application Integration for Real Time and Bulk Patterns
This past week, Informatica Cloud marked an important milestone with the Summer 2014 release of the Informatica Cloud platform. This was the 20th Cloud release, and I am extremely proud of what our team has accomplished.
“SDL’s vision is to help our customers use data insights to create meaningful experiences, regardless of where or how the engagement occurs. It’s multilingual, multichannel and on a global scale. Being able to deliver the right information at the right time to the right customer with Informatica Cloud Summer 2014 is critical to our business and will continue to set us apart from our competition.”
– Paul Harris, Global Business Applications Director, SDL Plc
When I joined Informatica Cloud, I knew that it had the broadest cloud integration portfolio in the marketplace: leading data integration and analytic capabilities for bulk integration, comprehensive cloud master data management and test data management, and over a hundred connectors for cloud apps, enterprise systems and legacy data sources, all delivered in a self-service design with point-and-click wizards for citizen integrators, without the need for complex and costly manual custom coding.
But, I also learned that our broad portfolio belies another structural advantage: because of Informatica Cloud’s unique, unified platform architecture, it has the ability to surface application (or real time) integration capabilities alongside its data integration capabilities with shared metadata across real time and batch workflows.
With the Summer 2014 release, we’ve brought our application integration capabilities to the forefront. We now provide the most-complete cloud app integration capability in the marketplace. With a design environment that’s meant not for just developers but also line of business IT, now app admins can also build real time process workflows that cut across on-premise and cloud and include built-in human workflows. And with the capability to translate these process workflows instantly into mobile apps for iPhone and Android mobile devices, we’re not just setting ourselves apart but also giving customers the unique capabilities they need for their increasingly mobile employees.
“Schneider’s strategic initiative to improve front-office performance relied on recording and measuring sales person engagement in real time on any mobile device or desktop. The enhanced real time cloud application integration features of Informatica Cloud Summer 2014 makes it all possible and was key to the success of a highly visible and transformative initiative.”
– Mark Nardella, Global Sales Process Director, Schneider Electric SE
With this release, we’re also giving customers the ability to create workflows around data sharing that mix and match batch and real time integration patterns. This is really important because, unlike the past, where you had to choose between batch and real time, in today’s world of on-premise, cloud-based, transactional and social data, you’re now more than ever having to deal with both real time interactions and the processing of large volumes of data. For example, consider a typical scenario these days at high-end retail stores. Using a clienteling iPad app, the sales rep looks up bulk purchase history and inventory availability data in SAP, confirms availability and delivery date, and then processes the customer’s order via real time integration with NetSuite. And if you ask any customer, having a single workflow to unify all of that for instant and actionable insights is a huge advantage.
“Our industry demands absolute efficiency, speed and trust when dealing with financial information, and the new cloud application integration feature in the latest release of Informatica Cloud will help us service our customers more effectively by delivering the data they require in a timely fashion. Keeping call-times to a minimum and improving customer satisfaction in real time.”
– Kimberly Jansen, Director CRM, Misys PLC
We’ve also included some exciting new Vibe Integration packages or VIPs. VIPs deliver pre-built business process mappings between front-office and back-office applications. The Summer 2014 release includes new bidirectional VIPs for Siebel to Salesforce and SAP to Salesforce that make it easier for customers to connect their Salesforce with these mission-critical business applications.
And lastly, but not least importantly, the release includes a critical upgrade to our API Framework that provides the Informatica Cloud iPaaS end-to-end support for connectivity to any company’s internal or external APIs. With the newly available API creation, definition and consumption patterns, developers or citizen integrators can now easily expose integrations as APIs and users can consume them via integration workflows or apps, without the need for any additional custom code.
The features and capabilities released this summer are available to all existing Informatica Cloud customers, and everyone else through our free 30-day trial offer.
Now in its third year (2012, 2013), The State of Salesforce Annual Review continues to be the most comprehensive report on the Salesforce ecosystem. Based on the data from over 1,000 global Salesforce users, this report highlights how companies are using the Salesforce platform, where resources are being allocated, and where industry hype meets reality. Over the past three years, the report has evolved much like the technology, shifting and transforming to address recent advancements, as well as tracking longitudinal trends in the space.
We’ve found that key integration partners like Informatica Cloud continue to grow in importance within the Salesforce ecosystem. Beyond the core platform offerings from Salesforce, third-party apps and integration technologies have received considerable attention as companies look to extend the value of their initial investments and unite systems. The need to sync multiple platforms and applications is an emerging need in the Salesforce ecosystem—which will be highlighted in the 2014 report.
As Salesforce usage expands, so does our approach to survey execution. In line with this evolution, here’s what we’ve learned over the last three years from data collection:
Functions, Departments Make a Difference
Sales, Marketing, IT, and Service all have their own needs and pain points. As Salesforce moves quickly across the enterprise, we want to recognize the values, priorities, and investments by each department. Not only are the primary clouds for each function at different stages of maturity, but the ways in which each department uses their cloud are unique. We anticipate discovery of how enterprises are collaborating across functions and clouds.
Focus on Region
As our international data set continues to grow we are investing in regionalized reports for the US, UK, France, and Australia. While we saw indications of differences between each region in last year’s survey, they were not statistically significant.
Customer Engagement is a Top Priority
Everyone agrees that customer engagement is important, but what are companies actually doing about it? A section on predictive analytics and questions about engagement specific to departments has been included in this year’s survey. We suspect that the recent trend of companies empowering employees with a combination of data and mobile will be validated in the survey results.
Variation Across Industries
As an added bonus, we will build a report targeting specific insights from the Financial Services industry.
We Need Your Help
Our dataset depends on input from Salesforce users spanning all functions, roles, industries, and regions. Every response matters. Please take 15 minutes to share your Salesforce experiences, and you will receive a personalized report, comparing your responses to the aggregate survey results.
Getting started with Cloud Data Warehousing using Amazon Redshift is now easier than ever, thanks to Informatica Cloud’s 60-day trial for Amazon Redshift. Now, anyone can easily and quickly move data from any on-premise, cloud, Big Data, or relational data sources into Amazon Redshift without writing a single line of code and without being a data integration expert. You can use Informatica Cloud’s six-step wizard to quickly replicate your data or use the productivity-enhancing cloud integration designer to tackle more advanced use cases, such as combining multiple data sources into one Amazon Redshift table. Existing Informatica PowerCenter users can use Informatica Cloud and Amazon Redshift to extend an existing data warehouse through an affordable and scalable approach. If you are currently exploring self-service business intelligence solutions such as Birst, Tableau, or MicroStrategy, the combination of Redshift and Informatica Cloud makes it incredibly easy to prepare the data for analytics by any BI solution.
To get started, execute the following steps:
- Go to http://informaticacloud.com/cloud-trial-for-redshift and click on the ‘Sign Up Now’ link
- You’ll be taken to the Informatica Marketplace listing for the Amazon Redshift trial. Sign up for a Marketplace account if you don’t already have one, and then click on the ‘Start Free Trial Now’ button
- You’ll then be prompted to log in with your Informatica Cloud account. If you do not have an Informatica Cloud username and password, register by clicking the appropriate link and filling in the required details
- Once you finish registration and obtain your login details, download the Vibe ™ Secure Agent to your Amazon EC2 virtual machine (or to a local Windows or Linux instance), and ensure that it can access your Amazon S3 bucket and Amazon Redshift cluster.
- Ensure that your S3 bucket and Redshift cluster are both in the same AWS region
- To start using the Informatica Cloud connector for Amazon Redshift, create a connection to your Amazon Redshift nodes by providing your AWS Access Key ID and Secret Access Key, specifying your cluster details, and obtaining your JDBC URL string.
You are now ready to begin moving data to and from Amazon Redshift by creating your first Data Synchronization task (available under Applications). Pick a source, pick your Redshift target, map the fields, and you’re done!
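Before filling in the connection form, it can help to sanity-check the cluster details yourself. The following Python sketch is only an illustration: it assumes the common Redshift endpoint shape `<cluster>.<id>.<region>.redshift.amazonaws.com` and the default port 5439, and the helper names are hypothetical. The exact URL form the Informatica connector expects is not stated here, so treat the output as a starting point rather than the required value.

```python
import re

# Assumed endpoint shape: <cluster>.<id>.<region>.redshift.amazonaws.com
_ENDPOINT = re.compile(r"^[^.]+\.[^.]+\.([a-z0-9-]+)\.redshift\.amazonaws\.com$")


def cluster_region(endpoint: str) -> str:
    """Extract the AWS region encoded in a Redshift cluster endpoint."""
    match = _ENDPOINT.match(endpoint)
    if match is None:
        raise ValueError(f"unrecognized Redshift endpoint: {endpoint}")
    return match.group(1)


def co_located(bucket_region: str, endpoint: str) -> bool:
    """True when the S3 bucket's region matches the cluster's region."""
    return bucket_region == cluster_region(endpoint)


def jdbc_url(endpoint: str, database: str, port: int = 5439) -> str:
    """Build a JDBC URL string for the cluster (5439 is Redshift's default port)."""
    return f"jdbc:redshift://{endpoint}:{port}/{database}"
```

For a cluster endpoint such as `examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com` and a database named `dev`, `jdbc_url` produces a string you can compare against the one shown in your AWS console before pasting it into the connection.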
The value of using Informatica Cloud to load data into Amazon Redshift lies in its ability to move massive amounts of data in parallel. The Informatica engine optimizes by moving processing close to where the data is, using push-down technology. Other data integration solutions for Redshift perform batch processing using an XML engine, which is inherently slow when processing large data volumes, and lack multitenant architectures that scale well. Informatica Cloud, by contrast, processes over 2 billion transactions every day.
Amazon Redshift has brought agility, scalability, and affordability to petabyte-scale data warehousing, and Informatica Cloud has made it easy to transfer all your structured and unstructured data into Redshift so you can focus on getting data insights today, not weeks from now.
Once upon a time, database schema changes were rare and handled with scrutiny. The stability of source data led to the development of the traditional Data Integration model. In this traditional model, a developer pulled a fixed number of source fields into an integration, transformed these fields, and then mapped the data into appropriate target fields.
The world of data has profoundly changed. Today’s Cloud applications allow an administrator to add custom fields to an object at a moment’s notice. Because source data is increasingly malleable, the traditional Data Integration model is no longer optimal. The Data Integration model must evolve.
Today’s integrations must dynamically adapt to ever-changing environments.
To meet these demands, Informatica has built the Informatica Cloud Mapping Designer. The Mapping Designer provides power and adaptability to integrations through the “link rules” and “incoming field rules” features. Integration developers no longer need to deal with fields on a one-by-one basis. Cloud Designer allows the integration developer to specify a set of dynamic “rules” that tell the mapping how fields need to be handled.
For example, the default rule is “Include all fields”, which is both simple and powerful. The “all fields” rule dynamically resolves to bring in as many fields as exist at the source at run time. Regardless of how many new fields the application developer or database administrator may have added to the source after the integration was developed, this simple rule brings all the new fields into the integration dynamically. This substantially increases developer productivity, as the integration developer no longer has to make modifications just to keep up with changes to the integration endpoints. Instead, the integration is “future proofed”.
Link rules can be defined in combination using both “includes” and “excludes” criteria. The rules can be of four types:
- Include or Exclude All fields
- Include or Exclude Fields of a particular datatype (example: String, numeric, decimal, datetime, blob etc)
- Include or Exclude Fields that fit a name pattern (example: any field that ends with “__c” or any field that starts with “Shipping_”)
- Include or Exclude Fields by a particular name (example: “Id”, “Name” etc)
Any combination of the link rules can be put together to create sophisticated dynamic rules for fields to flow.
Each transformation in the integration can specify the set of rules that determine what fields flow into that particular transformation. For example, if I need all custom fields from a Salesforce source to flow into a target, I would simply “Include fields by name pattern : suffixed with ‘__c’” – which is the naming convention for custom field names in Salesforce. In another example, if I need to perform standardization of date formats for all datetime fields in an expression, I can define a rule to “Include fields by datatype – datetime”.
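The way such rules resolve against whatever fields exist at run time can be sketched in a few lines of Python. This is an illustrative model only, not the Mapping Designer’s implementation: the `Field` and `Rule` classes are hypothetical, glob-style patterns stand in for the name-pattern rule, and the choice that the last matching rule decides a field’s fate is an assumption made for the sketch. Salesforce custom field API names end in `__c`, so the pattern `*__c` is used for custom fields.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Field:
    name: str
    datatype: str


@dataclass(frozen=True)
class Rule:
    include: bool                 # True = include, False = exclude
    kind: str                     # "all", "datatype", "pattern", or "name"
    value: Optional[str] = None   # unused when kind == "all"

    def matches(self, field: Field) -> bool:
        if self.kind == "all":
            return True
        if self.kind == "datatype":
            return field.datatype == self.value
        if self.kind == "pattern":
            return fnmatchcase(field.name, self.value)
        if self.kind == "name":
            return field.name == self.value
        raise ValueError(f"unknown rule kind: {self.kind}")


def resolve(fields, rules):
    """Apply rules in order; the last matching rule decides each field (assumed)."""
    selected = []
    for field in fields:
        keep = False  # a field no rule matches is left out
        for rule in rules:
            if rule.matches(field):
                keep = rule.include
        if keep:
            selected.append(field)
    return selected
```

With this model, `resolve(fields, [Rule(True, "pattern", "*__c")])` picks up every custom field, including ones added after the mapping was built, while `[Rule(True, "all"), Rule(False, "datatype", "datetime")]` passes everything except datetime fields.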
The dynamic nature of the link rules is what empowers a mapping created in Informatica Cloud Designer to be easily converted into a highly reusable integration template through parameterization.
For example, the entire source object can be parameterized and the integration developer may focus on the core integration logic without having to worry about individual fields. I can build an integration for bringing data into a slowly changing dimension table in a data warehouse, and this integration can apply to any source object. When the integration is executed by substituting different source objects for the source parameter, the integration works as expected, since the logical rules dynamically bring in the fields regardless of what the source object structure is. Now, all of a sudden, an integration developer is only required to build one reusable integration template for replicating multiple objects to the data warehouse, and NOT dozens or even hundreds of such repeated integration mappings. Needless to say, maintenance is hugely optimized.
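As a rough illustration of what “one template, many source objects” means, here is a minimal Python sketch. Everything in it is hypothetical: `run_template`, the in-memory `sources` and `warehouse` dictionaries, and the plain append-style load, which deliberately leaves out slowly changing dimension logic to keep the example short.

```python
def run_template(source_name, sources, warehouse, include=lambda name: True):
    """One reusable mapping: the source object is a run-time parameter."""
    rows = sources[source_name]
    # Resolve fields at run time from whatever the source currently exposes,
    # instead of hard-coding a field list per object.
    fields = sorted({name for row in rows for name in row if include(name)})
    table = warehouse.setdefault(source_name, [])
    for row in rows:
        table.append({name: row.get(name) for name in fields})
    return fields
```

Looping `for obj in sources: run_template(obj, sources, warehouse)` then replicates every object with the same logic, so adding a new object or a new custom field requires no new mapping.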
With the power of logically defining field propagation through an integration, combined with the ability to parameterize just about any part of the integration logic, the Cloud Mapping Designer provides a unique and powerful platform for developing reusable end-to-end integration solutions (such as Opportunity to Order, Accounts load to Salesforce, SAP product catalog to Salesforce, File load to Amazon Redshift etc). Such prebuilt end-to-end solutions, or VIPs (Vibe Integration Packages), can be easily customized by any consuming customer to adapt to their unique environments and business needs by tweaking only certain configurations while largely reusing the core integration logic.
What could be better than building integrations? Building far fewer integrations that are reusable and self-adapting.
To learn more, join the upcoming Cloud Spring release Webinar on Thursday, March 13.
Leo Eweani makes the case that the data tsunami is coming. “Businesses are scrambling to respond and spending accordingly. Demand for data analysts is up by 92%; 25% of IT budgets are spent on the data integration projects required to access the value locked up in this data “ore” – it certainly seems that enterprise is doing The Right Thing – but is it?”
Data is exploding within most enterprises. However, most enterprises have no clue how to manage this data effectively. While you would think that an investment in data integration would be an area of focus, many enterprises don’t have a great track record in making data integration work. “Scratch the surface, and it emerges that 83% of IT staff expect there to be no ROI at all on data integration projects and that they are notorious for being late, over-budget and incredibly risky.”
My core message is that enterprises need to ‘up their game’ when it comes to data integration. This recommendation is based on the amount of data growth we’ve already experienced and will experience in the near future. Indeed, a “data tsunami” is on the horizon, and most enterprises are ill-prepared for it.
So, how do you get prepared? While many would say it’s all about buying anything and everything when it comes to big data technology, the best approach is to splurge on planning. This means defining exactly what data assets are in place now and will be in place in the future, and how they should or will be leveraged.
To face the forthcoming wave of data, certain planning aspects and questions about data integration rise to the top:
Performance, including data latency. Or, how quickly does the data need to flow from point or points A to point or points B? As the volume of data quickly rises, the data integration engines must keep up.
Data security and governance. Or, how will the data be protected both at rest and in flight, and how will the data be managed in terms of controls on use and change?
Abstraction, and removing data complexity. Or, how will the enterprise remap and re-purpose key enterprise data that may not currently exist in a well-defined and functional structure?
Integration with cloud-based data. Or, how will the enterprise link existing enterprise data assets with those that exist on remote cloud platforms?
While this may seem like a complex and risky process, think through the problems, leverage the right technology, and you can remove the risk and complexity. The enterprises that seem to fail at data integration do not follow that advice.
I suspect the explosion of data will be the biggest challenge enterprise IT faces in many years. While a few will take advantage of their data, most will struggle, at least initially. Which route will you take?
Hosting Big Data applications in the cloud has compelling advantages. Scale doesn’t become as overwhelming an issue as it is within on-premises systems. IT will no longer feel compelled to throw more disks at burgeoning storage requirements, and performance becomes the contractual obligation of someone else outside the organization.
Cloud may help clear up some of the costlier and thornier problems of attempting to manage Big Data environments, but it also creates some new issues. As Ron Exler of Saugatuck Technology recently pointed out in a new report, cloud-based solutions “can be quickly configured to address some big data business needs, enabling outsourcing and potentially faster implementations.” However, he adds, employing the cloud also brings some risks.
Data security is one major risk area, and I could write many posts on this. But management issues also present other challenges. Too many organizations see cloud as a cure-all for their application and data management ills, but broken processes are never fixed when new technology is applied to them. There are also plenty of risks with the misappropriation of big data, and the cloud won’t make these risks go away. Exler lists some of the risks that stem from over-reliance on cloud technology, from the late delivery of business reports to the delivery of incorrect business information, resulting in decisions based on incorrect source data. Sound familiar? The gremlins that have haunted data analytics and management for years simply won’t disappear behind a cloud.
Exler makes three recommendations for moving big data into cloud environments – note that the solutions he proposes have nothing to do with technology, and everything to do with management:
1) Analyze the growth trajectory of your data and your business. Typically, organizations will have a lot of different moving parts and interfaces. And, as the business grows and changes, it will be constantly adding new data sources. As Exler notes, “processing integration or hand off points in such piecemeal approaches represent high risk to data in the chain of possession – from collection points to raw data to data edits to data combination to data warehouse to analytics engine to viewing applications on multiple platforms.” Business growth and future requirements should be analyzed and modeled to make sure cloud engagements will be able “to provide adequate system performance, availability, and scalability to account for the projected business expansion,” he states.
2) Address data quality issues as close to the source as possible. Because both cloud and big data environments have so many moving parts, “finding the source of a data problem can be a significant challenge,” Exler warns. “Finding problems upstream in the data flow prevent time-consuming and expensive reprocessing that could be needed should errors be discovered downstream.” Such quality issues have a substantial business cost as well. When data errors are found, it becomes “an expensive company-wide fire drill to correct the data,” he says.
3) Build your project management, teamwork and communication skills. Big data and cloud projects involve so many people and components from across the enterprise that they require coordination and interaction between various specialists, subject matter experts, vendors, and outsourcing partners. “This coordination is not simple,” Exler warns. “Each group involved likely has different sets of terminology, work habits, communications methods, and documentation standards. Each group also has different priorities; oftentimes such new projects are delegated to lower priority for supporting groups.” Project managers must be leaders and understand the value of open and regular communications.
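The second recommendation above, addressing data quality as close to the source as possible, can be illustrated with a small Python sketch. It is not taken from Exler's report; the rule names, the `customer_id` and `amount` fields, and the `validate_at_source` function are assumptions made purely for illustration. The idea is to check each record at the collection point and set aside failures with their reasons, so bad rows never reach the warehouse or analytics layers.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Rule = Tuple[str, Callable[[Record], bool]]

# Hypothetical rules: each pairs a failure label with a check that must pass.
RULES: List[Rule] = [
    ("missing customer_id", lambda r: bool(r.get("customer_id"))),
    (
        "invalid amount",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]


def validate_at_source(
    records: Iterable[Record], rules: List[Rule] = RULES
) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split incoming records into accepted rows and rejected rows with reasons."""
    accepted: List[Record] = []
    rejected: List[Tuple[Record, List[str]]] = []
    for record in records:
        # Collect the label of every rule this record fails.
        failures = [label for label, check in rules if not check(record)]
        if failures:
            rejected.append((record, failures))
        else:
            accepted.append(record)
    return accepted, rejected


incoming = [
    {"customer_id": "C1", "amount": 10},
    {"customer_id": "", "amount": -5},
]
accepted, rejected = validate_at_source(incoming)
```

Only `accepted` would be passed downstream in this sketch; `rejected` keeps each failed record with the reasons it failed, which makes it easier to trace a problem back to its source without reprocessing everything that came after it.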
Since the advent of middleware technology in the mid-1990s, data integration has been primarily an IT-led technical problem. Business leaders had their hands full focusing on their individual silos and were happy to delegate the complex task of integrating enterprise data and creating one version of the truth to IT. The problem is that there is now too much data that is highly fragmented across myriad internal systems, customer/supplier systems, cloud applications, mobile devices and automatic sensors. Traditional IT-led approaches whereby a project is launched involving dozens (or hundreds) of staff to address every new opportunity are just too slow.