The New Math for Analytic Data

There is an old Business Intelligence Axiom that has always held true:

BI Axiom: Dirty data in -> Dirty data out

When you try to perform analytics or reporting over a large data set, it has been proven over and over again that if the underlying data is dirty, the reports and dashboards you build on top of it will be confusing, inconsistent, and, in the worst case, incorrect.

Over the last 20 years BI project teams have been able to account for this Axiom by employing the following formula:

OLD Formula: ETL + BI = CleanAnalytics

Implementing an enterprise-class ETL (Extract, Transform, & Load)/data integration offering alongside a business intelligence solution allowed users to account for the issues they had with their dirty data and helped ensure that the data they were performing analytics on was clean and correct.
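For readers less familiar with the pattern, here is a minimal, purely illustrative sketch of the ETL + BI flow in Python with pandas and SQLite. The orders.csv source, its region column, and the fact_orders table are assumptions made for the example, not part of any specific product.

    import sqlite3
    import pandas as pd

    # Extract: read rows from an assumed source file (orders.csv is hypothetical).
    orders = pd.read_csv("orders.csv")

    # Transform: a simple cleansing step so the BI layer sees consistent region names.
    orders["region"] = orders["region"].str.strip().str.title()

    # Load: write the cleaned rows into a reporting table a BI tool would query.
    with sqlite3.connect("warehouse.db") as con:
        orders.to_sql("fact_orders", con, if_exists="replace", index=False)

In a real deployment the extract, transform, and load steps run in a dedicated data integration tier against many sources; the sketch only shows the shape of the formula.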

What’s new in BI…cloud analytics

Gartner defines the six key elements of analytics as data sources, data models, processing applications, computing power, analytic models and sharing or storage of results. In its view, any analytics initiative “in which one or more of these elements is implemented in the cloud” qualifies as cloud analytics.
Over the last two years, Cloud analytics has grown very quickly for the following reasons:

  • Agility & Elasticity
  • Cost
  • Time to Value

Cloud analytic solutions no longer require users to spend time agreeing upon and defining the star or snowflake schemas (ROLAP/MOLAP) common in most enterprise data warehouses (EDW). This has allowed BI teams to follow an agile development approach to Cloud analytics, engaging with their stakeholders and evolving their solutions very quickly.

The cost of Cloud analytics is more appealing to customers because solutions are typically offered via pay-as-you-go subscriptions, as opposed to the large upfront capital expenditure required by most traditional BI solutions.

The modern design and technology found in Cloud analytic solutions make them much easier to use than traditional analytic tools while still providing as much, if not more, functionality and performance. This, combined with an agile approach to their projects, has allowed BI teams to drastically lower time to value.

Many companies, large and small, have shown tremendous success implementing their initial set of BI projects using Cloud analytic solutions. But analytic projects are now driven by the need to analyze not just traditional on-premise transactional data but also data in the Cloud, IoT and real-time/streaming data sources, and semi-structured and unstructured data such as mobile application logs.

With these new requirements, companies need to deal with a new set of data challenges in order to get the full value of their analytics projects, and these challenges require an evolution of data integration and transformation technologies.

Some BI and Cloud analytic vendors include basic data integration and transformation functionality in their offerings, but as I mentioned earlier, the modern realities of data variety and velocity (e.g. IoT, cloud apps, mobile, streaming) require more than basic data integration “starter kits”. In fact, lightweight functionality that overpromises often burdens users with mounting dirty data that doesn’t scale and creates a security battle between IT and their business stakeholders.

Companies need a modern data integration solution that provides the data capabilities for today’s IoT, cloud, mobile, and streaming world: an evolved ETL, built by expert, focused data integration providers. As a result, every company using cloud analytics needs a new formula for clean, actionable analytics data for its BI initiatives:

NEW Formula: E.PTQM.TL + BI = CleanAnalytics

So, what’s E.PTQM, you ask?

E.PTQM is the new lingua franca of data integration for cloud analytics, and it addresses the following data gaps that companies face today (a short illustrative sketch follows the list):

  • Cannot connect to and Extract data out of all of the required on-premise and Cloud data sources
  • No ability to Profile data, which forces users to load all of it into the analytic solution
    • But by then it’s too late; business users don’t know what to do with the data
      • There are 10 revenue fields, and no one knows which calculation or source fields are behind them
      • Finance sees revenue as $10M while operations says sales orders are $7M
    • Users don’t need all of the data (e.g. they don’t want data for a certain region/product/fiscal period)
    • What does dept code SF625 mean, and how many rows have that value?
  • Need data Prep capabilities including address verification and cleansing
    • Users slice and dice reports by State, Zip, and Country and require consistent and clean values
  • Need Transformation functionality including data masking to meet enterprise data security standards
    • Mask sensitive values such as name, SSN, credit card numbers, etc…
  • No data Quality capabilities, including de-duplication
  • No Master data management functionality to ensure consistent reporting across key business entities
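To make these gaps concrete, here is a small, hypothetical Python/pandas sketch (not Informatica’s product or API) of the kind of Profile, Prep, Transform, and Quality steps an E.PTQM pipeline would perform before loading data into a BI tool. The columns and values are illustrative assumptions drawn from the examples above.

    import pandas as pd

    # Assumed sample extract; column names and values are illustrative only.
    df = pd.DataFrame({
        "dept_code": ["SF625", "SF625", "NY110", "SF625"],
        "state":     ["ca", "CA", "ny", "CA"],
        "ssn":       ["123-45-6789", "123-45-6789", "987-65-4321", "123-45-6789"],
        "revenue":   [10_000_000, 10_000_000, 7_000_000, 10_000_000],
    })

    # Profile: how many rows carry each dept code (e.g. the mysterious SF625)?
    print(df["dept_code"].value_counts())

    # Prep: standardize State values so slicing and dicing stays consistent.
    df["state"] = df["state"].str.upper()

    # Transform: mask the sensitive SSN column before it reaches the analytics tier.
    df["ssn"] = "***-**-" + df["ssn"].str[-4:]

    # Quality: de-duplicate rows so revenue is not double counted downstream.
    df = df.drop_duplicates()
    print(df)

In practice these steps run inside the data integration layer, at scale and against live sources; the point here is only to show what each letter of E.PTQM buys you before the load.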

This new formula (E.PTQM.TL + BI = CleanAnalytics) is now the standard prerequisite for preventing the BI Axiom: Dirty data in -> Dirty data out.

So where can you find these new, improved data integration capabilities that are key for the success of your cloud analytics projects?

Informatica specializes in integration and data management solutions for analytics. Integration is a constantly evolving, highly sophisticated, and critical building block of your analytics strategy. Just as you leverage data visualization expertise from a company like Tableau, you need modern data integration and data management from an expert company like Informatica, so you can get the clean analytics data you need to gain insights, make the right decisions, and achieve the ROI you want from your cloud analytics solution.

Better Analytics Through Better Data for Tableau

Informatica’s solutions for Tableau provide self-service data access, data integration, data cleansing, and data blending, and they deliver rapid time-to-value for Tableau customers.

The new Informatica offerings provide a turnkey data management solution which includes data integration, data preparation and out-of-the box visual templates for Tableau customers across the organization. Additionally, the solution includes zero-training, easy adoption, self-service tools for end users. Informatica provides the industry’s best integration backbone to enterprise IT, with the data governance tools needed to deploy Tableau across the enterprise on a foundation of trusted data with full data lineage and more.

Available immediately, the new Informatica for Tableau bundle has an entry price starting at just $100/user/month on a subscription basis. The bundle delivers a strong combination of self-service data integration tools for Tableau customers, powered by the industry’s leading cloud and on-premise integration suite. You can get started with the Informatica for Tableau bundle here.

Learn more about what Informatica can do for Tableau customers:

  • Watch more about Informatica for Tableau here.
  • Learn more about Informatica’s presence at Tableau Conference 2015 here.
