Big Data Management – The Path Towards Repeatable Big Data Success
Over the past four weeks, I’ve had the opportunity to attend our regional Informatica Day events in Dallas, Washington, DC and Paris, France. These events gave me the chance to talk to companies in various industries and listen to them discuss their efforts to unlock the value of their organization’s data.
When asked “How important is data?”, everyone was in fierce agreement. To a person, whether from commercial enterprises, the public sector or among our international speakers, attendees said that data is vital. At the same time, everyone expressed the need for data processes to move from their current ad hoc and siloed nature to broad, enterprise-wide best practices; companies want to be data-ready for anything!
Most companies pointed out that they were making progress on a number of fronts. Business leaders are beginning to understand that they own the data. Even if IT owns the infrastructure, business drives the stewardship processes around the data itself. But in a number of cases the adoption of big data technologies has business leaders fearing a major disruption to the progress that has been made.
The advent of these next-gen data approaches, and the new analytical use cases they promise to unlock, is resulting in a mad dash to purchase new technologies and to recruit from the very small pond of skilled data scientists who might actually know what to do with them. The result? Numerous projects are stuck in scientific experimentation mode, lacking meaningful ROI.
Hence, many of the conversations we had at our regional events revolved around one key problem: How do we embrace these new technologies and approaches and how do we make this repeatable – quickly and reliably?
Three emerging big data challenges
There are three categories of issues rapidly undermining our ability to broadly and accurately unlock value from big data environments:
- Ingesting and integrating the data. The challenge is that most of the science projects are developed by hand-coding in Pig, Hive or related “programming”-like environments, an approach that relies heavily on scarce, highly skilled people. As a result, we forgo certain data sets or we delay projects (i.e., we delay return on investment).
- Governance and quality. When first-time insights are generated from the project, everyone is amazed. The second time around, the results get compared with other known (and perhaps more trusted) data sources. Why is it different? Where did you get your data? What was done to it to make it different?
- Data security. A strength of the next-gen data world is its structure and volume independence. We can load everything into the database. In the process, we move data out of its original context, which in most cases is also where our security is applied. We build security around the perimeters of things like apps, databases and devices. But how do we even know what inside our massive and diverse data set represents sensitive data? And furthermore, how do we secure that?
Informatica Big Data Management: The path towards repeatable big data success
There are plenty of new things to consider when embarking on big data initiatives, but there is also much to learn from what we have done for the past 15 years in traditional data use cases. While the new Informatica Big Data Management solution is re-architected from the core to deal with the new data types, volumes and velocity, it also extensively reuses the existing capabilities of the Intelligent Data Platform.
- Streamlined, high-performance data ingestion and transformation. A visual design paradigm combined with out-of-the-box templates, connectors, parsers and transformations allows unmatched developer productivity and accuracy. Coupled with the design-time abstraction, intelligent optimizers in the platform dynamically select the optimal execution paths in your environment, ensuring the highest performance.
- End-to-end data governance and quality. Visual data profiling and built-in collaboration and workflow allow data stewards to rapidly define and agree upon key business terms and rules. Add to that extensive metadata management, a business glossary, visual rule design, data quality and the ability to do matching and discover relationships in your big data environment, and you have a holistic data governance capability natively deployed on your big data platform.
- Big data security. Rather than waiting for data to be breached, you can use Informatica’s unique ability to discover, profile and classify data before you move or access it to define effective policies and rules that safeguard your data, regardless of where it is located or where it might be copied. Our Secure@Source product adds a visual analytical view into the sensitive data in your environment: its classification, whether it has been secured via a masking rule, and what your potential risk exposure might be based on usage statistics.
The re-architected Big Data Management solution ultimately results in an end-to-end management platform that delivers unmatched ROI for customers, allowing them to reuse:
- Existing infrastructure
- Existing skills and people, with the option to recruit easily from a massive pool of global professionals trained on Informatica tools
- Existing work
For more information, please REGISTER for the Big Data Management Virtual Launch Event now!