Mapping the Customer Journey with an Iterative Approach to Building a Marketing Data Lake

Transamerica Presents at Informatica World 2016.

Informatica World 2016 was in San Francisco this year, and the company whose iconic pyramid marks the skyline led this key session describing the strategy and hard work behind their successful EMAP (Enterprise Marketing and Analytics Program) customer relationship platform. Vishal Bamba, VP of Innovation & Architecture, and Rocky Tiwari, Manager of Innovation & Architecture, took the stage together and walked the audience through their process.

In an increasingly competitive insurance industry, providers are becoming strategic about understanding customer relationships, seeking a 360° view of customers to inform marketing activities, up-sell and cross-sell. The goal for EMAP was to serve up this 360° view of customers for marketing and planning analysis: to mine customer data for insights that will fuel targeted marketing and service efforts.

The solution architecture centered around Informatica Big Data Management, Cloudera Enterprise, and Tableau for data visualization. They built the system to enable an iterative approach, where either whole or partial datasets could be ingested, producing added business value immediately, while gradually enriching profiles and building a more comprehensive enterprise solution over time.

Data quality and governance were an important part of the plan—as Bamba put it, “We knew we didn’t want to create a data swamp.”

Tiwari took over for the second half, diving deep into Transamerica’s record matching and linking strategy. Informatica Big Data Relationship Manager supports this two-step process, which involves grouping (key generation, candidate selection) and matching (rule matching, scores and thresholds). He showed how a system of rules and weights allows Transamerica to intelligently sort partly overlapping, confusing, or incomplete records and decide, for example, whether Jon Alexander and Jonathan Alexander Jr. are the same person.
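To make the two-step idea concrete, here is a minimal Python sketch of grouping followed by weighted matching. It is not Informatica's API or Transamerica's actual rule set: the field names, the weights, the 0.8 threshold, and the grouping key (first three letters of the last name plus ZIP code) are all assumptions chosen for illustration, and the string similarity comes from Python's standard difflib module.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical rule weights and threshold, for illustration only.
WEIGHTS = {"first": 0.3, "last": 0.4, "zip": 0.3}
MATCH_THRESHOLD = 0.8


def group_key(record):
    """Key generation: a cheap key that likely duplicates should share."""
    return record["last"][:3].lower() + "|" + record["zip"]


def candidate_pairs(records):
    """Candidate selection: only records with the same key are compared."""
    groups = defaultdict(list)
    for record in records:
        groups[group_key(record)].append(record)
    for members in groups.values():
        yield from combinations(members, 2)


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(a, b):
    """Rule matching: weighted sum of per-field similarities."""
    return sum(w * similarity(a[f], b[f]) for f, w in WEIGHTS.items())


def find_matches(records):
    """Return (id, id, score) for candidate pairs at or above the threshold."""
    matches = []
    for a, b in candidate_pairs(records):
        s = score(a, b)
        if s >= MATCH_THRESHOLD:
            matches.append((a["id"], b["id"], s))
    return matches


records = [
    {"id": 1, "first": "Jon", "last": "Alexander", "zip": "94111"},
    {"id": 2, "first": "Jonathan", "last": "Alexander", "zip": "94111"},
    {"id": 3, "first": "Mary", "last": "Smith", "zip": "94111"},
    {"id": 4, "first": "Jon", "last": "Alexander", "zip": "10001"},
]
matches = find_matches(records)
```

With these made-up weights, records 1 and 2 land in the same group and score above the threshold, so they are linked. Record 4 has an identical name but a different ZIP code, so it falls into another group and is never compared. That is the trade-off grouping makes: far fewer comparisons, at the risk of missing matches that a given key separates.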

He expressed excitement and apprehension about the scope of the challenge for data architects. “We have data coming in from many sources,” he said, “and we don’t always have a common understanding of data definitions across our enterprise. Before you start creating your rules you have to invest time in understanding all your data.” He also advised using Informatica Analyst (“a very good tool for profiling”), keeping partition sizes small enough that they don’t kill performance, and resisting the urge to stretch the design to cover trivial “edge cases,” which can cause problems elsewhere. This pragmatic approach suits high-growth organizations that can’t afford to invest in armies of specialized Hadoop developers. Judging by the detailed questions in the closing Q&A, the audience appreciated their newfound understanding of the Transamerica experience.